API Monitoring with Treblle

An overview of API monitoring and observability with Treblle: how to set it up and what you can get out of it.

a year ago   •   14 min read

By Davor Kolenc

Engineers, DevOps, and even product managers are all interested in some kind of API metrics, but different teams have different requirements.

Usually, DevOps needs some fast and easy info like uptime and latency. An engineer on an API-as-a-service team needs way more observability and can benefit from data like requests per minute and real-time monitoring that shows spikes and errors as they happen.

Product managers usually want to know how the product is being used: which endpoints are being called and what data is being shown. Recreating the customer's journey is of great value to them. Being able to test things quickly also helps.



Now, there are tools that make this all happen. Some are easier to understand, some a bit harder. Usually, you’ll need something like DataDog, maybe a bit of New Relic, AWS has some options, and maybe you even endeavor to implement Grafana over the course of a few weeks, and after all that you’ll probably still need something else for observability. If you want to catch bugs, then BugSnag and similar services are the ones you'll be using.

Don’t forget about the docs! So you can put Swagger or Redocly in the mix there. Oh, and testing. That’s usually done in Postman.

What is that, 6, 7 tools? More? Sounds about right. That’s the average. 🤯

Of course, you could get rid of 2 or 3 by taking on some of the work yourself. Go into DIY mode and keep track of all the logs yourself. Manually!

Or, you know, hire a person (and as you grow) a team that will log all of it, continuously refresh docs, and drink a lot of coffee to stay awake. You need to know as soon as anything goes wrong, right!? Buy a good coffee machine for these guys and a RedBull fridge (it's probably a good idea to fill it every 24 hours).

Being Sherlock Holmes Might be Overrated

Is there a better way though?

Do you really like piecing together data like Sherlock Holmes? Being a super genius detective that is always the smartest person in the room sounds like a lot of fun.

Realistically, it is no fun at all. Especially when you have 5+ tools and none of them can give you the specific data you are looking for. You end up not being Holmes at all. Who even remembers Lestrade? That’s the not-so-bright cop Holmes is helping out a lot. If we are being honest, in those moments you are much closer to Lestrade than you are to Holmes.

We don’t want that for you, so we devised a thing and launched it about a year and a half ago. As you may have gathered it’s called Treblle.

It is a lightweight SDK that you put on top of your API. Hearing that, some of you are thinking “Ah, that’s cool and all, but adding an SDK to my API is going to have an impact on performance and latency. It has to.”

It’s a healthy thing to assume, but we built it so that it actually doesn’t and is basically next to invisible. How?
First off, we use AWS, and our intake pipeline looks like this.


Treblle takes the RAW data from the SDKs and stores it into an S3 bucket.

The data goes through an ETL process. It takes the RAW data from the file, transforms it, enriches it, and stores it into a database.

These 2 are microservices by the way since both do 1 thing from start to finish. Here is the whole process of making things into microservices for Treblle.
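To make the two stages a bit more concrete, here is a minimal sketch of that flow. It is not Treblle's actual pipeline code: a plain dict stands in for the S3 bucket, a list stands in for the database, and the field names (`method`, `path`, `status`, `is_error`) are assumptions made up for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical stand-ins: a dict plays the S3 bucket, a list plays the database.
raw_bucket = {}
database = []


def ingest(key, payload):
    """Stage 1: store the raw SDK payload untouched, like writing a file to S3."""
    raw_bucket[key] = json.dumps(payload)


def etl(key):
    """Stage 2: extract the raw file, transform and enrich it, then load it."""
    raw = json.loads(raw_bucket[key])             # extract
    status = int(raw["status"])                   # transform: cast to a number
    row = {
        "method": raw["method"].upper(),          # transform: normalize casing
        "path": raw["path"],
        "status": status,
        "is_error": status >= 400,                # enrich: derived field
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }
    database.append(row)                          # load
    return row


ingest("req-1.json", {"method": "post", "path": "/api/v1/auth/login", "status": "200"})
row = etl("req-1.json")
```

Because each stage does one thing, the intake side can accept data quickly and leave the heavier processing to the second step.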

When we started off, the response time it took for you to see the “real-time” data was around 500ms. So, pretty decent. But we wanted it to be faster.

Switching from MySQL to SingleStore did the trick. Response time is now around 90ms on average. 🎉

So there, you are safe from performance impact. And you don't have to be Sherlock to get mission critical data at a glance. No need for noticing details if the "clues" are right in front of you! What exactly do you get to see with the Treblle "magnifying glass"? Read on!



First, let's talk about integration.

Extreme Weirdos Give an Average of Around 5 Minutes to Integrate

So the first thing is integration. We made it super simple. All it takes is adding a few lines of code to your API. All sensitive data (like passwords) will already be masked, but if you want to mask more, you can do it by simply adding those commands into the SDK.
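If you are wondering what masking boils down to, here is a rough sketch of the idea. This is not the Treblle SDK API: the function name `mask_payload`, the default key list and the `extra_keys` parameter are all hypothetical, and each SDK has its own way of configuring additional fields.

```python
# Hypothetical default list of sensitive keys; a real SDK ships its own defaults.
DEFAULT_MASKED = {"password", "card_number", "ssn"}


def mask_payload(data, extra_keys=()):
    """Return a copy of data with sensitive values replaced by asterisks."""
    masked_keys = DEFAULT_MASKED | {k.lower() for k in extra_keys}

    def _mask(value):
        if isinstance(value, dict):
            return {
                k: "*" * len(str(v)) if k.lower() in masked_keys else _mask(v)
                for k, v in value.items()
            }
        if isinstance(value, list):
            return [_mask(item) for item in value]
        return value

    return _mask(data)


payload = {
    "email": "ana@example.com",
    "password": "hunter2",
    "profile": {"api_token": "abc123"},
}
masked = mask_payload(payload, extra_keys=["api_token"])
```

The important part of the design is that masking happens before the data leaves your server, so the original values never reach the monitoring side.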

What languages and frameworks are supported by Treblle?
We aim to have virtually everything. We currently support 17 of them (and counting). So all the major ones are already in there, Fastify (a JS framework) being the most recent addition.

We are not too proud to accept help in developing and maintaining SDKs, so feel free to join us on our community or GitHub.

This shouldn’t take you more than 5 minutes. Some weirdo (we mean this in a positive way because we are weirdos as well) did it in 39 seconds. 🤠 On the other side of the spectrum, we have some users that make a Treblle account and add the SDK 6 months later. This doesn't mean it takes 6 months to integrate. It's simply because some users want to try us out when they have the time. Well, at least they didn't forget about us. 😅

So yeah, there are extremes going both ways. So 5 minutes is what we call the Weirdo Average of Time to Integrate (WATI).

However long it takes you, the thing you see first is your dashboard with some overview data for all the projects you have. Oh, yeah, there’s no limit to the projects. The thing we take as a limiter is the number of API requests we process for you. On the free plan you get 250K per month (forever, and no, no credit cards required).


This is, let's say, the first level of Treblle: a general overview of your projects. You get around 15 data points just from here: how many problems there are, how many API requests were made, RPM, number of endpoints, something weird called Treblle API Score (more on that later) and so on.

There’s also a thing called Filters. Essentially it is, well, sorcery 🧙‍♂️ that you can use to find users, requests, endpoints, errors and whatever else you are looking for within your API. You can literally type in a user, or something from the parameters of the request, and get all the data you need. Did a user have problems logging into their account? Find that exact request! Retrace their steps and see what happened.

Let’s jump to the second level. This is where the real magic happens for some of you guys.

The Wonderland Level of Treblle

Did you jump through the hole yet? If you're there, welcome to the wonderland level of Treblle!


Feel free to play around and get the feel of the level. It is the dashboard for your project. Most of our users spend most of their time here! Every project gets one dashboard and this is where you can see all the API requests in real time. All of them are enriched with actionable data.

Here is how it looks in the requests tab.


So far, after adding the SDK, all you needed to do was 3 clicks!
Here’s a cool little thing you can do for non-technical staff that needs to know things. Do you see a little thing called Alias?


As a developer, you probably don’t need that, but aliases let you simply point out what kind of request that was (someone favorited an article, for example). You can make aliases for any number of things and help your non-techies understand how your API works.

The dashboard tab gives you a lot of data about everything that is happening with your API. If you scroll down the dashboard, there's a great way to get some analytical data for everyone interested in those kinds of metrics. It's an analytics dashboard that you can customize to your own liking with prebuilt widgets. (We are working on making more widgets and on giving you the ability to make some of your own.)

So, depending on your needs, choose the widgets you want and move them around. It's all done with a simple click. There can be multiple members on a project (let's say a team of 20). Every single one of them can customize this and have the view that they want.

There’s one more thing we want to point out here. Remember we talked about you having to do a lot of things manually? One of the things some tools might ask you to do is manually enter your endpoints so that they can be tracked.

Doesn't sound so hard when you have 10 endpoints. But, what if you have hundreds or more? Wouldn’t it be great if you didn’t have to manually input all of the endpoints that you want to track?



Well, you don’t. As we said, the setup really does take you 5 minutes. We auto-detect your endpoints as the requests come in.
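How can endpoints be discovered from traffic alone? One common approach, sketched below, is to collapse ID-like path segments so that many concrete URLs map to a single endpoint. This is only an illustration of the general idea and not Treblle's detection logic; the `{id}` placeholder and the rule that digits and UUIDs count as parameters are assumptions.

```python
import re

# Path segments that look like IDs (all digits or UUIDs) are treated as parameters.
ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def normalize(path):
    """Collapse ID-like segments so /users/1 and /users/2 map to one endpoint."""
    path = path.split("?", 1)[0]  # the query string is not part of the endpoint
    segments = [
        "{id}" if ID_SEGMENT.match(segment) else segment
        for segment in path.strip("/").split("/")
    ]
    return "/" + "/".join(segments)


def detect_endpoints(requests):
    """Build the set of (method, endpoint) pairs from observed traffic."""
    return {(method.upper(), normalize(path)) for method, path in requests}


traffic = [
    ("get", "/api/v1/articles/17"),
    ("GET", "/api/v1/articles/42?page=2"),
    ("post", "/api/v1/auth/login"),
]
endpoints = detect_endpoints(traffic)
```

Three requests come in, two endpoints come out, and nobody had to type them into a form.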

This also ties into the documentation tab. Treblle auto-generates documentation. So, no matter the size of the change you made, it’s going to be picked up and your developer portal is going to get updated.

Remember worrying about how much work there is with writing docs, and then a change happens and you have to do it all over again? You can forget about that. Treblle solves this and you don’t have to worry about it ever again.

Do you have to onboard a large number of new developers? Doesn’t having up-to-date docs at all times help with that?

So, you see, it’s not merely monitoring that Treblle provides. We used the monitoring feature to make your APIOps smooth, so shipping and debugging should come as close as they can to breaking the light-speed barrier.


TREBLLE API SCORE - CAN YOU RELY ON IT?

What about API score? What nonsense did we come up with here?
Well, it turns out this is one of our most popular features. It’s a great way of signaling how your API is doing. Is it healthy, is it fast, and does it have problems?

We simply take all the best practices and follow them for you. There are 3 key categories we follow:

- Performance
- Security
- Quality

Each of these follows best practices in the industry to determine your score.

PERFORMANCE

Content encoding

Using content encoding on your API can reduce the payload size of your responses by 70 to 95 percent. That means you save money on bandwidth, and because you're returning a much smaller payload, things also load much faster for your end user.

The implementation is dead simple.

Compress the response body with gzip (the most common and widespread compression algorithm), return a Content-Encoding header with the value of gzip, and you’re done!

More information can be found on this link.
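Here is a small sketch of what that looks like using only Python's standard library. The payload is made up, and in a real API your framework or web server would normally do this for you, and only when the client sends an Accept-Encoding header that includes gzip.

```python
import gzip
import json

# A repetitive JSON payload, typical of list endpoints; the data is made up.
body = json.dumps(
    [{"id": i, "title": "Example article", "status": "published"} for i in range(500)]
).encode("utf-8")

compressed = gzip.compress(body)

# The header tells the client how to decode the body it receives.
headers = {
    "Content-Type": "application/json",
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed)),
}

savings = 1 - len(compressed) / len(body)
print(f"{len(body)} bytes -> {len(compressed)} bytes ({savings:.0%} smaller)")
```

How much you save depends on the data: repetitive JSON like this compresses very well, while payloads that are already compressed (images, for example) barely shrink.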

HTTP2/3 support

HTTP2 comes with many performance-oriented features that enable your server to serve faster API responses, such as multiplexing, server push, header compression and others.

HTTP3 is even faster and more powerful, mainly because it uses a newer transport protocol called QUIC instead of the good old TCP.

You should upgrade your HTTP version to the latest one available or use cloud providers like Cloudflare or many of the API gateways that support the newest version of HTTP out of the box.

Avg. response size

Average response size represents the amount of data you return on your API endpoints.

As you might imagine the less data you return the faster your API is. You should aim to keep your response below 100KB if possible.

The fastest way to optimize your response sizes is to use pagination when returning large quantities of data, to trim your response objects so they contain only the data that is needed, and to use compression whenever possible.
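Pagination is the easiest of these to show in code. Below is a minimal sketch, assuming page-number style pagination; the `page` and `per_page` parameter names and the `data`/`meta` response shape are illustrative choices, not a standard.

```python
def paginate(items, page=1, per_page=20):
    """Return one page of items plus the metadata a client needs to fetch the rest."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total = len(items)
    total_pages = (total + per_page - 1) // per_page  # ceiling division
    start = (page - 1) * per_page
    return {
        "data": items[start:start + per_page],
        "meta": {
            "page": page,
            "per_page": per_page,
            "total": total,
            "total_pages": total_pages,
        },
    }


articles = [{"id": i} for i in range(1, 96)]  # 95 made-up records
response = paginate(articles, page=5, per_page=20)
```

In a real API the slicing would happen in the database query (LIMIT and OFFSET, or a cursor) rather than in memory, so you never load the rows you are not going to return.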

Avg. load time

The load time is the most important performance metric of your API. It directly impacts the end user: the slower your API is, the poorer their experience.

In most cases the number one cause of long load times is the database, usually unoptimized queries. So make sure you are optimizing your queries and returning only the data you need.

You should aim to have an average load time below 150ms for a good experience and even lower for an amazing experience.

SECURITY

HTTPs support

HTTPs uses an encryption protocol called TLS to encrypt communication between your end-users and your API. This means that simply by using HTTPs you are adding an additional layer of security to your API that protects your users and their data.

HTTPs is the de-facto standard and you should definitely be using it. You can get an SSL certificate for free from any of the major cloud providers or from services like Cloudflare and Let’s Encrypt.

Authentication

Authentication is the simplest and most important thing you should do on your API. It allows you to secure your API, know who is accessing it and when, and, more importantly, easily revoke access for anyone who is behaving as a bad actor.

There are many different types of API authentication, but the simplest and most common one is Bearer authentication.

To use Bearer authentication you should send a header of Authorization with the value of Bearer followed by a unique key that is used to identify that user.
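On the server side, checking that header can be as simple as the sketch below. The in-memory `API_KEYS` store and the example keys are hypothetical; a real service would keep keys hashed in a database and usually lean on its framework's auth middleware.

```python
import hmac

# Hypothetical key store; in practice keys would live in a database, hashed.
API_KEYS = {"k3y-for-ana": "ana", "k3y-for-ben": "ben"}


def authenticate(headers):
    """Return the user for a valid 'Authorization: Bearer <key>' header, else None."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    for key, user in API_KEYS.items():
        # compare_digest avoids leaking key contents through timing differences
        if hmac.compare_digest(key.encode("utf-8"), token.encode("utf-8")):
            return user
    return None


def revoke(key):
    """Revoking access is just removing the key."""
    API_KEYS.pop(key, None)
```

A request with `Authorization: Bearer k3y-for-ana` resolves to the user `ana`; once `revoke("k3y-for-ana")` is called, the same request is rejected. That is the "easily revoke access" part in practice.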

QUALITY

JSON response

Properly labeling your API is quite important because it instructs clients who are reading data from your API how to behave.

When using JSON based REST APIs you should always return the proper headers to make sure that your content is always returned and requested in JSON format. To do that simply return a header of Content-Type with the value of: application/json.
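As a tiny illustration, here is a framework-agnostic sketch of a helper that serializes the data and labels it in one place. The `json_response` name and the `(status, headers, body)` return shape are assumptions; most web frameworks ship an equivalent helper that you should prefer.

```python
import json


def json_response(data, status=200):
    """Serialize data and label it so clients know to parse it as JSON."""
    body = json.dumps(data).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return status, headers, body


status, headers, body = json_response({"favorited": True})
```

Routing every response through one helper like this means the header can never be forgotten on a single endpoint.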

Version support

Versioning your API allows you to make changes to your endpoints, responses, objects and structure without impacting current clients and users.

The sooner you start using versioning on your API, the fewer problems you’ll have in the future.

The simplest way of versioning is to use URL based versioning which implies you will have URLs like domain/api/v1/auth/login or domain/api/v2/auth/login. You can easily differentiate between different versions of your API and use any of them at the same time.
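To show how two versions can live side by side, here is a minimal routing sketch. The handlers, their response shapes and the `ROUTES` table are made up for illustration; in a real project your framework's router does this job.

```python
import re

VERSIONED_PATH = re.compile(r"^/api/v(?P<version>\d+)(?P<rest>/.*)$")


def login_v1(payload):
    return {"token": "abc"}


def login_v2(payload):
    # v2 changes the response shape without touching v1 clients
    return {"data": {"token": "abc", "expires_in": 3600}}


# Hypothetical route table keyed by (version, path)
ROUTES = {
    (1, "/auth/login"): login_v1,
    (2, "/auth/login"): login_v2,
}


def dispatch(path, payload=None):
    """Pick the handler that matches both the version and the rest of the path."""
    match = VERSIONED_PATH.match(path)
    if not match:
        raise LookupError(f"unversioned path: {path}")
    handler = ROUTES.get((int(match["version"]), match["rest"]))
    if handler is None:
        raise LookupError(f"no handler for {path}")
    return handler(payload)
```

Calling `dispatch("/api/v1/auth/login")` and `dispatch("/api/v2/auth/login")` returns two different response shapes, so old clients keep working while new ones move to v2.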

Error ratio

One of the most important things about your API is that it is free of errors and problems.

Not only do errors degrade the user experience, they also require entire teams of developers to understand what is going on, work out who needs to fix the problem, and deploy the changes to your API.

Treblle tracks code-based errors and alerts your team when they happen - so the only thing you need to do is make sure you fix them.
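If you want a feel for the number itself, an error ratio can be computed from status codes alone, as in the sketch below. Counting every 4xx and 5xx response as an error is an assumption made for this example and is not necessarily how the Treblle API Score defines it.

```python
from collections import Counter


def error_ratio(status_codes):
    """Share of responses with a 4xx or 5xx status code."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 400)
    return errors / len(status_codes)


def breakdown(status_codes):
    """Count responses per class: '2xx', '4xx', and so on."""
    return Counter(f"{code // 100}xx" for code in status_codes)


codes = [200, 200, 201, 404, 200, 500, 200, 422, 200, 200]  # made-up sample
ratio = error_ratio(codes)
classes = breakdown(codes)
```

The breakdown is worth keeping next to the ratio: a pile of 4xx responses usually points at clients or docs, while 5xx responses point at your own code.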

That’s not all in the Wonderland section of Treblle. Remember, we are just on the second level. We are seeing your project. There is the flows tab you can use to group API requests in some way.

CATCHING BUGS BETTER THAN ELMER

There is also, of course, the problems tab, where we show you all the 3xx, 4xx and 5xx responses, which you can close as you solve them. We even give you the line where a code error occurred.

We’ll let you play in Wonderland. But did you know we could go further down the rabbit hole? Let’s visit the Quantumania level of your API by opening a single request.

QUANTUM-LEVEL API MONITORING


There’s a large amount of data we give you for every single API request. Let’s try and break it down for you.


Let's look at the example below.

From the top, we see this one is a POST request, it has a 200 (the one we want) response code and it is in our Version 1.

You then get some of the same metrics as in the previous level. Share request is something many people get excited about when we explain it, but let’s leave that for later.

You now see the request data. You get params, auth, docs and headers.

Parameters

What about response data? Yup, that is just a scroll away, right after the request data. You also get the docs and headers for the response.


After that comes some metadata about the request, like where it originated from, on which device, what the time zone is, the operating system, and so on. Pretty useful when tracking down problems, right?

And to top it all off, the quantum level of Treblle automatically gives you related requests you might be interested in. If you need to go into Sherlock Holmes mode, this is one of the ways Treblle helps you actually become a genius detective instead of Lestrade.

And as we mentioned, the last thing we are covering is the super popular Share Request feature. This comes in handy when dealing with providers to your API. You can see the data (they probably can’t and will ask you for it anyway). Instead of you tracking all that down, simply copy the link and send it to them. They will have the same exact view of that request as you do.

You don’t want people lingering too much? The links are auto-expiring. Current options are 1 hour, 1 day, 1 week and 1 month.
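For the curious, auto-expiring links are a simple pattern: a random token mapped to a resource and an expiry time. The sketch below shows the general idea and is not how Treblle implements it; the in-memory `_links` store, the function names and the 30-day month are all assumptions.

```python
import secrets
import time

# The four lifetimes mentioned above, in seconds (a month is taken as 30 days here).
LIFETIMES = {"1 hour": 3600, "1 day": 86400, "1 week": 604800, "1 month": 2592000}

_links = {}  # token -> (request_id, expires_at)


def create_share_link(request_id, lifetime, now=None):
    """Create an unguessable token that points at one request for a limited time."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(16)
    _links[token] = (request_id, now + LIFETIMES[lifetime])
    return token


def resolve(token, now=None):
    """Return the shared request id, or None if the link is unknown or expired."""
    now = time.time() if now is None else now
    entry = _links.get(token)
    if entry is None:
        return None
    request_id, expires_at = entry
    if now >= expires_at:
        del _links[token]  # clean up expired links lazily
        return None
    return request_id
```

The `now` parameter exists only to make the expiry easy to test; in normal use you would leave it out and let the clock decide.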

How do you want to do this?

If you are still married to having multiple tools and like piecing your data together, we get that. We don’t know when the right time is to make the switch to something way more sustainable and scalable.

We can only offer help and guidance in that process. Making Treblle easy to integrate and understand was always the primary goal and an idea we think any developer and engineer can stand behind.

Hopefully, we brought Treblle a bit closer to all of you interested in the topic. We wish you all good luck with your APIs and if you have any questions about Treblle and how it all works, feel free to reach out for a chat.
