How to Measure Website and App Quality Objectively
In the last few months, I had several chats with early-stage and Series A companies looking to build web and mobile applications. After the first few calls, a pattern started emerging. Most of them had already signed up with a service company and were unhappy with the results. The main reason, apart from the massive cost, was product quality.
Both mobile and web applications were full of bugs, suffered from performance jank, and crashed frequently, especially when paid marketing campaigns were driving traffic to them. This wasted a lot of marketing dollars as well.
The bigger problem was that there was no accountability. Whenever the founders reported bugs, the service companies could not reproduce or diagnose the problem, leading to even more frustration. I had the opportunity to conduct quick code reviews on several codebases, and I observed some familiar issues, such as poor system architecture, low code quality, and a lack of documentation. However, the most surprising finding was the absence of any observability setup.
Please note - tech product quality is objectively measurable!
Many of these founders were unaware of how to establish accountability because they did not understand that they could implement something called observability from the start.
Evaluate quality from day one!
Before discussing what observability is, I want to discuss why you should do it, why you should think about it from day one, and why you should even bother to continue reading this blog post. The following are the main reasons to make your systems observable -
1. Reduce the time it takes to figure out and fix bugs - An observable system (an app, website, backend, or infra with observability in place) gives you detailed insight into the what, why, and how of any bug, crash, or other issue. It allows your tech team to fix it faster and ensure the problem never returns. If the problem does come back, the system alerts you, so you can objectively see whether your service partner is actually fixing the underlying cause or just putting tape on a leaking pipe.
2. Measure the performance of your product objectively - You will gain in-depth insights into your app's performance, detailing how quickly or slowly it responds for your end users. This includes relevant metrics such as P99 and P95 latency, database latency, crash rates, and the number of errors and exceptions. Additionally, the system provides filter dimensions, allowing you to identify which users, locations, or cohorts are experiencing these performance issues. This enables you to reach out to them and check whether the problem still persists.
3. To set benchmarks for future decisions - When you have a solid observability setup, you can establish a performance baseline and monitor it to ensure your system continues to perform well as user traffic increases. Many issues with poor code and architecture become evident only when real users begin interacting with the system at scale. This is when it's essential to have an objective way to identify problems so that you can start addressing them.
4. To save cost - The only people satisfied with their cloud expenditures are those who have effective observability setups. With real-time visibility, you can identify areas of overprovisioning and ensure that you're not spending a penny more than necessary on your infrastructure. This is one of the most underrated benefits of having an observable system.
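To make the P99 and P95 numbers from point 2 concrete, here is a minimal Python sketch of how tail latency can be computed from raw response times. The sample data and the `percentile` helper are my own illustration (using the nearest-rank method), not the API of any particular observability tool; real tools usually compute this for you.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical API response times in milliseconds (illustrative data only).
latencies_ms = [42, 38, 51, 47, 40, 45, 39, 44, 48, 41,
                43, 46, 37, 49, 50, 36, 52, 35, 900, 1200]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  p50={p50} ms  p95={p95} ms  p99={p99} ms")
```

In this made-up sample, the median request looks healthy while P95 and P99 expose the two very slow requests. That is why tail percentiles, rather than averages, are the numbers worth tracking.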
In short, the only way you can sleep peacefully at night and be absolutely sure that your users are not cursing you for your terrible product quality is to implement observability from day one.
So, now that you know why you should do observability from day one, let's quickly understand what it is from a non-technical perspective.
Explain observability like I am 10 years old
There are three pillars of observability - Logs, Metrics and Traces. This part of my blog was easy, thanks to Gen AI. I asked AI to explain observability like I am 10 years old, and it came up with this amazing explanation -
To understand logs, metrics, and traces, let's think of them as different ways to keep track of a busy playground.
Logs are like a diary that tells you everything that happens in the playground. Imagine if every time a kid swings, slides, or falls, someone writes it down. These notes include details like what happened, when it happened, and sometimes why it happened. If a kid gets hurt or something unusual happens, you can look back at the diary to see what led to that moment. Logs help us understand specific events and can tell us about problems when they occur.
Metrics are like counting how many kids are playing on the swings at different times. For example, you might count how many kids are on the swings every hour. This gives you numbers that show how popular the swings are and if they're getting too crowded. Metrics help us measure things over time and see trends, like if more kids are coming to play as the weather gets warmer.
Traces are like following one kid on their journey through the playground. If you want to see how long it takes for a kid to go from the swings to the slide and then to the sandbox, you can trace their steps. It helps us understand how different parts of the playground connect and if there are any delays or problems along the way. Traces give us a complete picture of how activities flow together.
How They Work Together
- Logs tell us what happened at specific moments.
- Metrics show us how things are going overall by providing numbers.
- Traces help us see the path someone takes through all these activities.
When we look at all three together, we can better understand what's happening in our playground. If we notice that more kids are getting hurt (logs), we might check if there are too many kids on the swings (metrics) and see if there's a problem with how they move from one area to another (traces). This way, we can keep the playground safe and fun for everyone!
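For the technically curious, the three pillars can be mimicked with nothing but the Python standard library. This is a toy sketch, not how a production setup works: `handle_checkout`, the `span` helper, and the metric names are all hypothetical, and in practice a library such as OpenTelemetry would replace the hand-rolled pieces.

```python
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("playground")

metrics = Counter()  # metrics: numbers aggregated over time
spans = []           # traces: timed steps that share one trace id


@contextmanager
def span(trace_id, name):
    """Record how long one step of a request took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        spans.append({"trace_id": trace_id, "name": name, "duration_ms": duration_ms})


def handle_checkout(user_id, cart_total):
    """Hypothetical request handler that emits a log line, a metric, and spans."""
    trace_id = uuid.uuid4().hex
    metrics["checkout.requests"] += 1
    with span(trace_id, "checkout"):
        with span(trace_id, "load_cart"):
            time.sleep(0.01)  # stand-in for a database read
        with span(trace_id, "charge_card"):
            if cart_total <= 0:
                metrics["checkout.errors"] += 1
                # Log: one specific event, with enough context to find it later.
                log.error("checkout failed trace_id=%s user=%s reason=empty_cart",
                          trace_id, user_id)
                return trace_id
            time.sleep(0.02)  # stand-in for a payment provider call
    log.info("checkout ok trace_id=%s user=%s total=%.2f", trace_id, user_id, cart_total)
    return trace_id


ok_trace = handle_checkout("u-1", 49.99)
failed_trace = handle_checkout("u-2", 0)

print(dict(metrics))
for s in spans:
    if s["trace_id"] == ok_trace:
        print(f'{s["name"]}: {s["duration_ms"]:.1f} ms')
```

The error log tells you what went wrong for one user, the counters tell you how often it happens, and the spans sharing a trace id show where a single request spent its time.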
If you want to understand the technical definitions of these terms, this concise primer from the OpenTelemetry team will help.
So, why don't service companies do observability from day one?
It's a reasonable question. Anyone, whether technical or non-technical, reading this post up to this point is likely to ask the same thing. There are several reasons behind it, though I don't have extensive data or research to create a pie chart illustrating the percentage contributions of each reason to this behavior. However, I have over a decade of experience building tech products at different scales, along with anecdotal evidence and insights gained from my experiences. Some reasons I have observed are -
- People know their code and system are not up to the mark, and using observability from day one will expose how crappy their work is. If you are stuck with any such person/company, please schedule a call with us, and we will get you out of that bad relationship.
- Implementing observability was tricky until a few years ago, as there was a chance of vendor lock-in. But now, OpenTelemetry has solved that problem. It's adopted and supported by every major observability company in the world. Maybe your service partner hasn't heard of that or doesn't have the technical expertise required to do this.
- Cost concerns - If you google observability solutions, you will likely come across the top players like New Relic, Datadog, etc. And that's not because they have good products, but because they have tons of marketing dollars and hundreds of employees to ensure they come out on top. And the reason they have so much money is that they have quite literally built their business by overcharging enterprises for their observability needs. Luckily, there are now tons of good open-source tools, such as Uptrace and SigNoz, which offer the same functionality as these top players at a fraction of the cost. Since they are open source, you can self-host and have better control over your observability costs.
There is no logical reason not to implement observability from day one. Implementing observability today requires just 10 additional lines of code in your applications and a few five-minute changes to your infrastructure.
So, how do you objectively say your tech product is great?
As you must have guessed by now, we start with observability in our solutions from day one. It gives us the final product with the intended quality and creates the transparency and accountability our clients expect from us. For us, the following are the characteristics of a good tech product -
1. App launch time (or website loading time) should be as fast as possible. Less than a second is the benchmark here.
2. API response time should be as low as possible, specifically P99 and P95 latency. Less than 150 milliseconds is a good number to start.
3. Database query and write times should be as low as possible. Read latency for a single query should be less than ten milliseconds.
4. The crash-free rate should be as high as possible. The last three platforms I built had a 100% crash-free rate (yes, it's possible) at a sufficiently high scale. But anything above 98% is a good number here.
5. The number of bugs users encounter per month should be a single digit (less than 10). Bugs are inevitable, especially if you are developing rapidly. However, the number can be kept in single digits if you are careful enough.
6. The app/website/platform should have the same performance at any scale for any feature.
7. The cost of the entire tech product should be as low as possible. If your infrastructure is over-provisioned, you may get a short-term gain in performance, but you are setting yourself up for failure when the actual scale hits. Also, you are just hurting your bottom line.
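Most of the targets above can be turned into an automated check. The sketch below is one hypothetical way to do it in Python; the `QualitySnapshot` fields and the sample numbers are assumptions for illustration, and in a real setup the values would come from your observability backend. Points 6 and 7 (consistency at scale and cost) need trend data over time, so they are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualitySnapshot:
    """One period's measurements (hypothetical field names)."""
    launch_time_ms: float
    api_p95_ms: float
    api_p99_ms: float
    db_read_ms: float
    crash_free_pct: float
    bugs_per_month: int


def evaluate(s):
    """Compare a snapshot against the targets listed in points 1 to 5."""
    return {
        "launch under 1 s": s.launch_time_ms < 1000,
        "API P95 under 150 ms": s.api_p95_ms < 150,
        "API P99 under 150 ms": s.api_p99_ms < 150,
        "DB read under 10 ms": s.db_read_ms < 10,
        "crash-free above 98%": s.crash_free_pct > 98,
        "single-digit bugs per month": s.bugs_per_month < 10,
    }


# Illustrative numbers only, not measurements from a real product.
snapshot = QualitySnapshot(
    launch_time_ms=820,
    api_p95_ms=110,
    api_p99_ms=240,
    db_read_ms=6.5,
    crash_free_pct=99.4,
    bugs_per_month=4,
)

results = evaluate(snapshot)
failing = [name for name, passed in results.items() if not passed]

for name, passed in results.items():
    print(f"{'PASS' if passed else 'FAIL'}  {name}")
```

With these sample numbers, everything passes except the P99 latency target, which is exactly the kind of objective, specific finding you can take to your tech team or service partner.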
At O(Log n) Labs, these are not just promises; they are tenets we live by. The quality of our end product is a matter of pride for us. When we build your products, your product and technology will be one of your business's biggest tactical advantages.
Schedule a call with us today and tell us how we can help you in your building journey.