Serverless is More

I have a little hobby project I have worked on (when I have had the time) for almost a year. I am pretty confident I will never complete it, mostly because the whole idea is to have something to work on to try out new stuff.

Up until recently, it was a web app developed with JavaScript and ReactJS. That part is still there, but on the server side I had a REST API written entirely in Python, with MySQL for storage.

There is really nothing wrong with that (except that Postgres is a slightly better alternative than MySQL), and Python is great. But it felt a bit like 2015, so I decided to rewrite the entire API in something more modern.

In today’s world, you want to develop things that can be scaled up without having to wait for a developer to do the work needed to keep multiple instances in sync, accept the same login tokens no matter which node you end up on, and, last but not least, manage and monitor the many instances (whether they are physical or virtual machines, or Docker containers in a managed environment).

This is where “Serverless” comes into play.

Serverless?

The foundation of “serverless” is usually Docker or some similar container platform. In the rest of this text I will mostly be using AWS lingo, but Google Cloud Platform (GCP) and Azure have similar offerings.

Contrary to what it sounds like, serverless is not at all “without servers”. The big thing is FaaS (Functions as a Service), where you develop each API endpoint by creating an AWS Lambda (a Cloud Function in GCP). You then connect it to the API Gateway, which knows everything about your API because you have imported its Swagger/OAS3 spec. It knows the spec so well that if the input does not match it, the gateway can reject the request and send the user an error code in response. That saves you a lot of code, because you no longer need to check that required items are there and can focus on the values provided.
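As a rough illustration, here is what that validation can look like in an OpenAPI spec imported into API Gateway. The path, schema and validator name are made up for this sketch; the `x-amazon-apigateway-*` extensions are the AWS-specific part that turns the validation on.

```yaml
# Hypothetical fragment of an OpenAPI 3 spec imported into API Gateway.
# The "all" validator tells the gateway to check request body and
# parameters against the schemas below before the Lambda is ever invoked.
openapi: "3.0.1"
info:
  title: hobby-project-api
  version: "1.0"
x-amazon-apigateway-request-validators:
  all:
    validateRequestBody: true
    validateRequestParameters: true
x-amazon-apigateway-request-validator: all
paths:
  /items:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [name]
              properties:
                name:
                  type: string
      responses:
        "200":
          description: OK
```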

Typically, you write your function in Node (JavaScript), and it is stateless, just like a REST API should be. The function and the API Gateway configuration are then deployed with a cloud provider which charges for usage, not for server capacity as in a traditional environment, where you pay to keep your machines running and they cost as much whether you have one user (you) or 100,000 users. It is up to the cloud provider to spin servers up and down as needed and run your Lambda on as many of them as required.
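A minimal sketch of such a stateless handler could look like this. The names and response shape are just illustrative, but this is the standard Node.js handler signature for a Lambda behind an API Gateway proxy integration:

```javascript
// Hypothetical Lambda handler behind API Gateway (proxy integration).
// The gateway has already validated the request against the OAS spec,
// so the handler can focus on the values instead of checking presence.
exports.handler = async (event) => {
  const { name } = JSON.parse(event.body);

  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: `Hello, ${name}` }),
  };
};
```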

The whole thing is, simply put, ingenious! When someone like me has a little hobby project just because it is fun to code and interesting to learn how it works, he doesn’t have to pay for a lot of servers to run databases, web servers, etc. on. In fact, I can probably manage using the Free Tier Amazon provides (a super clever thing to get people to like them). But when you create something which should be able to scale from almost no users to an unknown number of users (we are all trying to achieve world domination, right?), it lets you do that without having to stop sign-ups because you need to sort out capacity issues before letting more people in. The worst that can happen is that your bills to the cloud provider increase, but with that many users I am sure it is not going to be a problem to create a revenue stream to match (and if it is, maybe you should focus on something else and move on).

A small side effect of the model is that it forces you to write things that someone else can deploy in a way that makes them scale. That’s a good idea, always.

What I wrote above makes it sound like it will just automatically scale, but that is of course not the whole truth. If you have a relational database for storage, you still need to think about database design, using the right indexes, how you query and a lot of other things, so that you don’t create a bottleneck there. But once you start thinking serverless, you probably don’t want to deploy always-on database servers. Instead, you are likely to start looking at serverless NoSQL offerings like DynamoDB.

So, when I moved my Python-based backend to AWS Lambdas and API Gateway, it was a no-brainer not only to go NoSQL but to replace the database entirely with DynamoDB. It is not just a matter of ripping and replacing, though. The whole structure changes and you end up with fewer tables containing more data. It is an incredible change for someone who grew up and was schooled in a relational-database world where everything was normalized to 3NF.
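To make the “fewer tables containing more data” part concrete, here is a hypothetical sketch using the AWS SDK for Node.js. The table name, key pattern and attributes are all made up for illustration; the point is that a user and the things that belong to the user live in one table and come back from a single query, where a relational schema would have used several tables and a join.

```javascript
// Hypothetical single-table sketch with DynamoDB (AWS SDK v2).
const AWS = require("aws-sdk");
const db = new AWS.DynamoDB.DocumentClient();

async function example() {
  // A user profile and one of the user's projects live in the same table,
  // separated by the sort key instead of by table.
  await db.put({
    TableName: "AppTable",
    Item: { PK: "USER#42", SK: "PROFILE", email: "me@example.com" },
  }).promise();
  await db.put({
    TableName: "AppTable",
    Item: { PK: "USER#42", SK: "PROJECT#1", title: "Hobby project" },
  }).promise();

  // One query fetches the user and everything that belongs to them.
  const { Items } = await db.query({
    TableName: "AppTable",
    KeyConditionExpression: "PK = :pk",
    ExpressionAttributeValues: { ":pk": "USER#42" },
  }).promise();
  return Items;
}
```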

But, aren’t you locking yourself into one single vendor?

Yes, sort of… There are good solutions out there for coding locally and being able to deploy to several different vendors (like Azure, AWS and GCP), but as soon as you start using vendor-specific solutions for databases or queues to send messages, you start locking yourself in. I have never liked that (it is the main reason I have always avoided stored procedures in SQL databases), but in this case the value added by “going with Amazon” (or whoever you choose) makes it worth it. Should their price list increase dramatically (in the true meaning of the word, not just “double”), I will consider rewriting things and deploying elsewhere, but the fact that it works, scales and comes with a reasonable price tag is good enough for me.