Once you've built and deployed an API, and an "API consumer" is using it, changing it becomes very difficult. Maybe that API consumer is you, so you can just change whatever you want. Maybe it's another team at your organization, which is also not too bad, but you've got to schedule the changes with them, and they might be busy. Maybe it's a big customer who paid you lots of money and isn't excited about having to do extra work. Maybe there are loads of customers of all sorts... In such scenarios, implementing effective API versioning is crucial to manage updates and transitions seamlessly.
Change is Unavoidable
Have you ever released any software and then thought: "That's perfect, no change needed"?
Nobody would believe you if you said yes, because requirements change, domain models evolve, customers give feedback, and functionality gets deprecated.
Some software is easy to version. If you're releasing a new NPM module, Docker image, or other downloadable software, then you normally just tag/publish a new version with a number like `1.0.1`. With installable/downloadable software, users can have whichever version they like. Common conventions, including [Semantic Versioning](https://semver.org/), use three numbers: major, minor, and patch, so some users might be running `1.0.0` whilst others run `1.0.2` and eventually some may be on `2.1.3`. Some users might even run multiple versions at the same time!
How do you version an API, though?
How does versioning work?
An API version is any sort of number or date that an API consumer can pass to the API that lets it know what contract to use for the API interactions.
- **URL Versioning** places the version number in the URL, usually just as a major version, and looks a bit like this: `https://acme.com/api/v1/<resource>`.
- **Media Type Versioning** has the version in the HTTP Accept header, and follows this sort of format: `Accept: application/vnd.acme.v2+json`, or potentially uses a parameter like this: `application/vnd.acme+json; version=2`.
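To make the two styles concrete, here is a minimal sketch of how a server might work out which version a request is asking for. The `requested_version` helper is hypothetical (it is not from any framework), and the `acme` vendor name is just the placeholder used in the examples above.

```ruby
# Hypothetical helper: works out which API version a request is asking for.
# Returns an Integer major version, or nil when the request names none.
def requested_version(path:, accept: nil)
  # URL versioning: /api/v2/trees
  if (match = path.match(%r{\A/api/v(\d+)(/|\z)}))
    return match[1].to_i
  end

  return nil if accept.nil?

  # Media type versioning, vendor style: application/vnd.acme.v2+json
  if (match = accept.match(/application\/vnd\.acme\.v(\d+)\+json/))
    return match[1].to_i
  end

  # Media type versioning, parameter style: application/vnd.acme+json; version=2
  if (match = accept.match(/application\/vnd\.acme\+json\s*;\s*version=(\d+)/))
    return match[1].to_i
  end

  nil
end
```

Either way the consumer is naming one version for the whole API, which is why both count as "global" versioning.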
These would both be "global" versions, where the whole API could be massively different between v1 and v2. New versions of an API might have new functionality added, existing functionality changed, or some parts might be removed entirely. Whole resources could change, or just some of the properties within them.
Adding new resources is completely harmless. Adding new properties is usually fine (so long as they're not massively slowing things down unexpectedly), but renaming, removing, or changing anything would break backwards compatibility (meaning existing customers would be broken). That usually leads to lots of angry phone calls, emails, and some user metrics going down and to the right.
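As a rough illustration of that rule of thumb, the sketch below compares two made-up contracts and separates the changes that would break an existing consumer from the ones that are purely additive. The contract hashes and both helper names are assumptions for the example, not part of any real tool.

```ruby
# Hypothetical contract descriptions: property name => JSON type.
V_OLD = { "id" => "integer", "name" => "string" }.freeze
V_NEW = { "id" => "string", "first_name" => "string", "last_name" => "string" }.freeze

# Lists the changes an existing consumer would notice as breakage.
def breaking_changes(old_props, new_props)
  old_props.each_with_object([]) do |(prop, type), problems|
    if !new_props.key?(prop)
      problems << "#{prop} was removed or renamed"
    elsif new_props[prop] != type
      problems << "#{prop} changed from #{type} to #{new_props[prop]}"
    end
  end
end

# Properties that only exist in the new contract are additive, so usually safe.
def additions(old_props, new_props)
  new_props.keys - old_props.keys
end
```

Here `breaking_changes(V_OLD, V_NEW)` flags both `id` and `name`, while `additions(V_OLD, V_NEW)` returns the two new name properties.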
Versioning seems like the right move, so let's stick a v1 in the URL and then create all of our controllers in a v1 subfolder like `app/controllers/v1/SomeResource.php`, right?
Probably not the best call. If you try and jam multiple major versions into the same API codebase, then at some point, no matter how hard you serialize your output, contract test your responses, and integration test your controllers, it's incredibly difficult to avoid sharing code between v1 and v2. It's an easy trap to fall into. When you make the v2 API, some functionality will be different and some will be the same, then you tweak v2 and you've broken v1... defeating the point of going through all this.
Instead of thinking about "API" having sub-sections of "v1" and "v2", it's safer to think of "API v1" and "API v2" as two totally different entities. Getting this right from the start will help avoid accidents, and means things are a whole lot less annoying if v2 ends up being written in a different language, or deployed somewhere else. If you get smart with your git usage you can even merge bug fixes upwards, instead of trying to make them twice in two different subfolders.
Global versioning seems like the sensible route, and it's the most popular, but it pushes a lot of extra work onto API consumers who have to try and keep up with these new versions.
Global Versioning Wastes Customers' Time
Some APIs have kept their v1 API going for over a decade, which suggests they probably didn't need API versioning in the first place.
Some APIs I've worked with are on v14, and knowing the backstory, that was because the API developers didn't ever reach out to any stakeholders to ask what they needed out of an API and just wrote loads of code and rewrote it every time a new consumer came along.
Doing more upfront "API Design" can cut out the need for the first few versions you're likely to roll through, as many of those come from not getting enough user/market research done early on. This is common in startups that are moving fast and breaking things, but it can happen in any size business.
Using OpenAPI, you can plan how the API will work and generate documentation and mock APIs that will let potential API consumers play around with the API before you waste any time building it.
People think no "potential consumers" will want to do that, but I've done it with banks, airlines, etc. and their consumers, and they're happy to get involved because they know that getting the API right first means they won't have to rebuild their integrations later.
Here's the common workflow I see for new versions being released: the API development team makes their changes, creates the new major version, tests it, sets up the new instance and production pipeline, deploys it, tests it some more, documents the changes, and alerts all their customers. Support gets hammered with requests for help, and you have to maintain multiple versions for months whilst customers slowly change over. Some don't. Eventually, you cut them off, and only lose a little business...
Well, okay. That just sounds like doing business. You need to make changes sometimes, and versions let you do that. Some of the trouble is lack of self-control. Breaking changes build up in the backlog like water behind a dam, and releasing all those breaking changes with a new major version is like blowing the dam with dynamite.
All the tech debt, all the "oh let's use that new JSON format", all the "I hate the name of that, let's change it to this" comes out, all at once. The API developers are happy, and hey, it's a new major version, anything goes, but... how much of that adds any real value to your customers?
Each of those customers has to fit time into their existing roadmaps to see what changes have been made, pore over any documentation that's been written, fundamentally rewrite parts of their codebase, and test their whole platform to make sure there are no unexpected changes rippling through.
Clients have to make all these changes, upgrade their dependencies, work with a new JSON format which doesn't have any good tooling in their language, and eventually they hack and work around their way to keeping their existing functionality working. After spending hours doing all of that, there might not be any changes relevant to them. It turns out the v2 API was mostly about breaking /foo into /bar and /baz but they didn't use /foo, so... why did they have to do all that?
If your API has 10 customers, and you release a new major version, that's 10 customers going through all that work. If it's two days of work to upgrade to the latest API, then that's 160 person-hours spread across your customers. If you have 1,000 customers, that's 16,000 person-hours. All of this work is being done for either no reason or for minimal gain. Maybe some of them are excited about the new changes, but that means many of them are not.
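Those numbers fall out of a simple multiplication, sketched below. The eight-hour working day is an assumption (it is what makes two days come to 16 hours per customer), and the method name is made up for the example.

```ruby
HOURS_PER_DAY = 8 # assumption: an eight-hour working day

# Total effort pushed onto consumers by one breaking release.
def upgrade_cost_in_person_hours(customers:, days_each:)
  customers * days_each * HOURS_PER_DAY
end

puts upgrade_cost_in_person_hours(customers: 10, days_each: 2)    # prints 160
puts upgrade_cost_in_person_hours(customers: 1_000, days_each: 2) # prints 16000
```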
Evolution, the Alternative to Versioning
API Evolution is something I've been happily using for over a decade on most APIs I've worked on. It's seen a resurgence of popularity thanks to gRPC and GraphQL adding first-class support for the concept in all their tooling, and that's making REST author Roy Fielding happy as he's been complaining about API versioning for decades.
API evolution is the concept of striving to maintain the "I" in API, the request/response body, query parameters, general functionality, etc., only breaking them when you absolutely, absolutely have to. It's the idea that API developers bend over backwards to maintain a contract, no matter how annoying that might be. It's often more financially and logistically viable than dumping the workload onto a wide array of consumers.
For those working with Python, mastering this delicate balance is key, and our comprehensive guide on API versioning in Python provides practical insights into achieving this with your codebase.
Let's look at a few examples:
> The property name exists, and that needs to be split into first_name and last_name.
A minor example, but a breaking change nonetheless. It doesn't have to be though, we can be smart about it. In our serializers (where we turn raw models into JSON objects we're happy to share with consumers) we can add dynamic properties. The internal database might have changed to `first_name` and `last_name`, but we can pop that old `name` property right back in the JSON like this:

    class UserSerializer
      include FastJsonapi::ObjectSerializer

      attributes :first_name, :last_name

      # Keep the old combined property alive for existing consumers.
      attribute :name do |object|
        "#{object.first_name} #{object.last_name}"
      end
    end

When folks `POST` or `PATCH` to your API, if they send a `name` you can convert it (explode on space), or if they send `first_name` and `last_name` it'll get picked up fine. If anyone complains that's not how names work, tell them to upgrade their client to use `first_name` and `last_name` in their interface.
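On the write side, that conversion could look something like the sketch below. `normalize_name_params` is a hypothetical helper rather than anything built into a framework, and splitting on the first space is exactly as naive as it sounds.

```ruby
# Hypothetical helper: accepts either the old `name` or the new
# `first_name`/`last_name` and always returns the new shape.
def normalize_name_params(params)
  params = params.transform_keys(&:to_s)
  return params unless params.key?("name")

  # "Explode on space": everything after the first space becomes the last name.
  first, last = params.delete("name").to_s.strip.split(" ", 2)

  # Explicit first_name/last_name win if the client sent both styles.
  { "first_name" => first, "last_name" => last }.merge(params)
end
```

Calling this before the params reach the model means old clients keep working while the rest of the codebase only ever sees the new properties.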
That's adding a property, but what about something massive? A fundamental change to how the entire company works, and therefore a total rewrite of the domain models?
Easy!
I've got a Tree Tracker API, which lets me photograph trees my charity Protect Earth has planted with a special iOS application. The tree photos are then uploaded with species, coordinates, and other metadata to our API, and then companies like Ecologi can pay us for those photos to pop onto their users' profiles.
There's a `/trees` resource, and orders have a `plantedTrees` property, but now my charity has outgrown just planting trees. Now we're sowing wildflower meadows, rewetting peat bogs, and planting hedgerows.
I could have gone around adding `/peat`, and `/meadow`, etc. but that was asking for trouble with adding new biodiversity units in the future, so I figured "alright, let's ditch trees, and go with something new." New version? Nah, new resource!
We've added `/units`, which are pretty similar, but the validation rules have changed. The `species` property is not required here, and a new required `type` property has been added, which can be `tree`, `peat`, `wildflower`, etc. These units now also show up on the orders, with `plantedTrees` still being there, but `allocatedUnits` is right next to it.
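A minimal sketch of that changed validation is below. The `unit_errors` helper and the error wording are made up for illustration, and the list of types is cut down to the three named above.

```ruby
UNIT_TYPES = %w[tree peat wildflower].freeze # assumption: the real list is longer

# Hypothetical validator for a /units payload. Returns a list of error messages.
def unit_errors(payload)
  errors = []
  type = payload["type"]

  if type.nil? || type.to_s.empty?
    errors << "type is required"
  elsif !UNIT_TYPES.include?(type)
    errors << "type must be one of: #{UNIT_TYPES.join(', ')}"
  end

  # species was required on /trees, but is optional here, so there is no check for it.
  errors
end
```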
Removing `/trees` or `plantedTrees` would break customers, and that would be bad because we've got huge bills to pay for all the trees we bought and planted. I don't want to fire anyone or stop planting trees, so let's keep the old contract working.
Now `/trees` is still working exactly the same as before, but it calls the Unit model with `WHERE type="tree"` added as a criterion.
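In spirit, the old endpoint becomes a filtered view of the new one. The sketch below uses an in-memory array as a stand-in for the real Unit model, so the struct, the sample data, and both method names are assumptions for the example.

```ruby
# Stand-in for the Unit model; a real app would query the database instead.
Unit = Struct.new(:id, :type, :species, keyword_init: true)

UNITS = [
  Unit.new(id: 1, type: "tree", species: "oak"),
  Unit.new(id: 2, type: "peat", species: nil),
  Unit.new(id: 3, type: "tree", species: "birch")
].freeze

# New resource: every kind of unit.
def list_units
  UNITS
end

# Legacy resource: same data, narrowed the way WHERE type="tree" would.
def list_trees
  list_units.select { |unit| unit.type == "tree" }
end
```

Because both endpoints read from one source, a fix to the unit data shows up in `/trees` for free.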
The controller has a `Sunset` header, because that's a handy new way to communicate that the endpoint is going away, and is one of many ways I can communicate with consumers that they should ditch the endpoint at some point.

    Sunset: Tue, 31 Jan 2023 23:59:59 GMT

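If you'd rather not hand-write the date, Ruby's standard library can format it. This is a minimal sketch: `with_sunset` is a hypothetical helper that decorates a plain headers hash, and how that hash reaches the response depends on your framework.

```ruby
require "time" # adds Time#httpdate

# Assumption: the retirement date is decided elsewhere; this one matches the text.
TREES_SUNSET_AT = Time.utc(2023, 1, 31, 23, 59, 59)

# Hypothetical helper that returns a copy of the headers with Sunset added.
def with_sunset(headers, sunset_at)
  headers.merge("Sunset" => sunset_at.httpdate)
end

headers = with_sunset({ "Content-Type" => "application/json" }, TREES_SUNSET_AT)
puts headers["Sunset"] # prints Tue, 31 Jan 2023 23:59:59 GMT
```

Letting `httpdate` build the string also takes care of the weekday, which is easy to get wrong by hand.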
Still, having that extra controller isn't hurting anyone, and once I've spotted on Treblle that nobody is using it, I can just delete it.
We've also got the `plantedTrees: []` property, which still works just fine. We've tweaked that to pull only units with `type=tree`, and it's using the same serialization logic as before. Next to it is `allocatedUnits`, which was a chance for us to switch to a better name too, because we built some fancy "allocator" logic which helps us assign trees and other units to all sorts of customers who might be funding it in a myriad of different ways.
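Here is a rough sketch of an order carrying both properties side by side. The structs and serializer methods are hypothetical stand-ins, and the real tree serialization will have more fields than the two shown.

```ruby
require "json"

# Hypothetical stand-ins for the real models.
AllocatedUnit = Struct.new(:id, :type, keyword_init: true)
Order = Struct.new(:id, :units, keyword_init: true)

def serialize_unit(unit)
  { "id" => unit.id, "type" => unit.type }
end

def serialize_order(order)
  {
    "id" => order.id,
    # Old property, kept for existing consumers: trees only.
    "plantedTrees" => order.units.select { |u| u.type == "tree" }.map { |u| serialize_unit(u) },
    # New property: every unit, trees included.
    "allocatedUnits" => order.units.map { |u| serialize_unit(u) }
  }
end

order = Order.new(id: 42, units: [
  AllocatedUnit.new(id: 1, type: "tree"),
  AllocatedUnit.new(id: 2, type: "wildflower")
])
puts JSON.generate(serialize_order(order))
```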
The biggest downside here is if you've got 1,000 trees in an order, you're gonna have 1,000 `plantedTrees` and 1,000 `allocatedUnits`. Thankfully, seeing as these models are so similar, I'll be getting on with communicating these changes to the clients who I know use it so it can be removed.
Summary
Versioning can help you feel more confident making changes in the future, but it's pushing the work onto your API consumers, who are paying you for a stable API.
It's your job to make their lives easy. That doesn't mean never changing anything. It means not breaking the contract unless you really really have to, then finding the most minimal way to do that when you do.
Treblle can help you keep an eye on which endpoints are being used, and which aren't, so if you're only ever adding things and slowly deprecating old things you'll be fine. Maybe you've been calling it `/api/v1` when you meant `/api/beta` all this time. Maybe you could just do that.
Give it a try. Keep the `/v1` off your next API. You can always slap a `/v2` in there if you really need to.