Serverless is a model where the developer doesn't need to worry about servers: configuring, maintaining, or updating is none of their business. Although it is not an entirely new concept, the services that are offered nowadays are much more powerful and enable a wider range of applications. If you want to build cost-effective and scalable solutions, you should dive deeper into this subject and understand how it works.
In this chapter, we will cover the following topics:
- What is serverless?
- The main goals of serverless
- Pros and cons
- Use cases
After this chapter, you will be ready to start our hands-on approach, building an online store demo application, one piece per chapter.
Serverless can be a model, a type of architecture, a pattern, or anything else you prefer to call it. For me, serverless is an adjective, a word that qualifies a way of thinking. It s a way to abstract how the code that you write will be executed. Thinking serverless is to not think in servers. You code, you test, you deploy, and that s (almost) enough.
Serverless is a buzzword. You still need servers to run your applications, but you should not worry about them that much. Maintaining a server is none of your business. The focus is on development and writing code, and not in the operation.
DevOps is still necessary, although with a smaller role. You need to automate the deployment and have at least a minimal monitoring of how your application is operating and how much it costs, but you don t need to start or stop machines to match the usage and neither do you need to replace failed instances or apply security patches to the operating system.
A serverless solution is entirely event-driven. Every time that a user requests some information, a trigger will notify your cloud vendor to pick your code and execute it to retrieve the answer. In contrast, a traditional solution also works to answer requests, but the code is always up and running, consuming machine resources that were reserved specifically for you, even when no one is using your system.
In a serverless architecture, it s not necessary to load the entire codebase into a running machine to process a single request. For a faster loading step, only the code that is necessary to answer the request is selected to run. This small piece of the solution is referenced as a function. So we only run functions on demand.
Although we call it simply as a function, it s usually a zipped package that contains a piece of code that runs as an entry point along with its dependencies.
In the following diagram, the serverless model is illustrated in a sequence of steps. It's an example of how a cloud provider could implement the concept, though it doesn't have to implement it in this way:
Let's understand the following steps shown in the preceding diagram:
- The user sends a request to an address handled by the cloud provider.
- Based on the message, the cloud service tries to locate which package must be used to answer the request.
- The package (or function) is selected and loaded into a Docker container.
- The container is executed and outputs an answer.
- The answer is sent to the original user.
What makes the serverless model so interesting is that you are only billed for the time that was needed to execute your function, usually measured in fractions of seconds, not hours of use. If no one is using your service, you pay nothing.
Also, if you have a sudden peak of users accessing your application, the cloud service will load different instances to handle all simultaneous requests. If one of those cloud machines fails, another one will be made available automatically, without needing to configure anything.
Serverless is often confused with Platform as a Service (PaaS). PaaS is a kind of cloud computing model that allows developers to launch applications without worrying about the infrastructure. According to this definition, they have the same objective! And they do. Serverless is like a rebranding of PaaS, or you can call it the next generation of PaaS.
The main difference between PaaS and serverless is that in PaaS you don t manage machines, but you are billed by provisioning them, even if there is no user actively browsing your website. In PaaS, your code is always running and waiting for new requests. In serverless, there is a service that is listening for requests and will trigger your code to run only when necessary. This is reflected in your bill. You will pay only for the fractions of seconds that your code was executed and the number of requests that were made to this listener. Also, serverless has an immutable state between invocations, so it's always a fresh environment for every invocation. Even if the container is reused in a subsequent call, the filesystem is renewed.
Besides PaaS, serverless is frequently compared with Infrastructure as a Service (IaaS) and On-Premises solutions to expose its differences. IaaS is another strategy to deploy cloud solutions where you hire virtual machines and is allowed to connect to them to configure everything that you need in the guest operating system. It gives you greater flexibility, but it comes with more responsibilities. You need to apply security patches, handle occasional failures, and set up new servers to handle usage peaks. Also, you pay the same per hour whether you are using 5% or 100% of the machine s CPU.
On-Premises is the traditional kind of solution where you buy the physical computers and run them inside your company. You get total flexibility and control with this approach. Hosting your own solution can be cheaper, but it happens only when your traffic usage is extremely stable. Over or under provisioning computers is so frequent that it s hard to have real gains using this approach, even more when you add the risks and costs to hire a team to manage those machines. Cloud providers may look expensive, but several detailed use cases prove that the return of investment (ROI) is larger running on the cloud than On-Premises. When using the cloud, you benefit from the economy of scale of many gigantic data centers. Running on your own exposes your business to a wide range of risks and costs that you ll never be able to anticipate.
To define a service as serverless, it must have at least the following features:
- Scale as you need: There is no under or over-provisioning
- Highly available: It is fault tolerant and always online
- Cost-efficient: You will never pay for idle servers
With IaaS, you can achieve infinite scalability with any cloud service. You just need to hire new machines as your usage grows. You can also automate the process of starting and stopping servers as your demand changes. But this is not a fast way to scale. When you start a new machine, you usually need to wait for around 5 minutes before it can be usable to process new requests. Also, as starting and stopping machines is costly, you only do this after you are certain that you need. So, your automated process will wait some minutes to confirm that your demand has changed before taking any action.
Infinite scalability is used as a way to highlight that you can usually grow without worrying if the cloud provider has enough capacity to offer. That's not always true. Each cloud provider has limitations that you must consider if you are thinking in large applications. For example, AWS limits the number of running virtual machines (IaaS) of a specific type to 20 and the number of concurrent Lambda functions (serverless) to 1,000.
IaaS is able to handle well-behaved usage changes, but it can't handle unexpected high peaks that happen after announcements or marketing campaigns. With serverless, your scalability is measured in milliseconds and not minutes. Besides being scalable, it s very fast to scale. Also, it scales per invocation without needing to provision capacity.
When you consider a high usage frequency in a scale of minutes, IaaS suffers to satisfy the needed capacity while serverless meets even higher usages in less time.
In the following graph, the left-hand side graph shows how scalability occurs with IaaS. The right-hand side graph shows how well the demand can be satisfied using a serverless solution:
With an On-Premises approach, this is a bigger problem. As the usage grows, new machines must be bought and prepared, but increasing the infrastructure requires purchase orders to be created and approved, you need to wait the new servers to arrive and you need to give time to your team to configure and test them. It can take weeks to grow, or even months if the company is very big and requests many steps and procedures to be filled in.
A highly available solution is the one that is fault tolerant to hardware failures. If one machine goes out, you must keep running the application with a satisfactory performance. If you lose an entire data center due to a power outage, you must have machines in another data center to keep the service online. Having high availability generally means to duplicate your entire infrastructure, placing each half in a different data center.
Highly available solutions are usually very expensive in IaaS and On-Premises. If you have multiple machines to handle your workload, placing them in different physical places and running a load balancing service can be enough. If one data center goes out, you keep the traffic in the remaining machines and scale to compensate. However, there are cases where you will pay extra without using those machines.
For example, if you have a huge relational database that is scaled vertically, you will end up paying for another expensive machine as a slave just to keep the availability. Even for NoSQL databases, if you set a MongoDB replica set in a consistent model, you will pay for instances that will act only as secondaries, without serving to alleviate read requests.
Instead of running idle machines, you can set them in a cold start state, meaning that the machine is prepared, but is off to reduce costs. However, if you run a website that sells products or services, you can lose customers even in small downtimes. A cold start for web servers can take a few minutes to recover, but needs several more minutes for databases.
Considering these scenarios, in serverless, you get high availability for free. The cost is already considered in what you pay to use.
Another aspect of availability is how to handle Distributed Denial of Service (DDoS) attacks. When you receive a huge load of requests in a very short time, how do you handle it? There are some tools and techniques that help mitigate the problem, for example, blacklisting IPs that go over a specific request rate, but before those tools start to work, you need to scale the solution, and it needs to scale really fast to prevent the availability from being compromised. In this, again, serverless has the best scaling speed.
It s impossible to match the traffic usage with what you have provisioned. With IaaS or On-Premises, as a rule of thumb, CPU and RAM usage must always be lower than 90% for the machine to be considered healthy, and ideally CPU should be using less than 20% of the capacity with normal traffic. In this case, you are paying for 80% of waste when the capacity is in an idle state. Paying for computer resources that you don t use is not efficient.
Many cloud vendors advertise that you just pay for what you use, but they usually offer significant discounts when you provision for 24 hours of uptime in a long term (one year or more). It means that you pay for machines that you will keep running even in very low traffic hours. Also, even if you want to shut down machines to reduce costs, you need to keep at least a minimum infrastructure 24/7 to keep your web server and databases always online. Regarding high availability, you need extra machines to add redundancy. Again, it s a waste of resources.
Another efficiency problem is related with the databases, especially relational ones. Scaling vertically is a very troublesome task, so relational databases are always provisioned considering max peaks. It means that you pay for an expensive machine when most of the time you don t need one.
In serverless, you shouldn t worry about provisioning or idle times. You should pay exactly the CPU and RAM time that is used, measured in fractions of seconds and not hours. If it s a serverless database, you need to store data permanently, so this represents a cost even if no one is using your system. However, storage is very cheap compared to CPU time. The higher cost, which is the CPU needed to run the database engine that runs queries, will be billed only by the amount of time used without considering idle times.
Running a serverless system continuously for one hour has a much higher cost than one hour in a traditional infra. However, the difference is that serverless is designed for applications with variable usage, where you will never keep one machine at 100% for one hour straight. The cost efficiency of serverless is not perceived in websites with flat traffic.
In this section, we will go through the various pros and cons associated with serverless computing.
We can list the following strengths:
- Fast scalability
- High availability
- Efficient usage of resources
- Reduced operational costs
- Focus on business, not on infrastructure
- System security is outsourced
- Continuous delivery
- Microservices friendly
- Cost model is startup friendly
Let s skip the first three benefits, since they were already covered in the previous pages, and let's take a look at the others.
As the infrastructure is fully managed by the cloud vendor, it reduces the operational costs since you don t need to worry about hardware failures, applying security patches to the operating system, or fixing network issues. It effectively means that you need to spend less sysadmin hours to keep your application running.
Also, it helps to reduce risks. If you make an investment to deploy a new service and that ends up as a failure, you don t need to worry about selling machines or disposing the data center that you have built.
Lean software development states that you must spend time in what aggregates value to the final product. In a serverless project, the focus is on business. Infrastructure is a second-class citizen.
Configuring a large infrastructure is a costly and time-consuming task. If you want to validate an idea through a Minimum Viable Product (MVP) without losing time to market, consider using serverless to save time. There are tools that automate the deployment, which we will use throughout this book and see how they help the developer to launch a prototype with minimum effort. If the idea fails, infrastructure costs are minimized since there are no payments made in advance.
The cloud vendor is responsible for managing the security of the operating system, runtime, physical access, networking, and all related technologies that enable the platform to operate. The developer still needs to handle authentication, authorization, and code vulnerabilities, but the rest is outsourced to the cloud provider. It s a positive feature if you consider that a large team of specialists are focused on implementing the best security practices, and patching new bug fixes as soon as possible to serve their hundreds of customers. That s the definition of economy of scale.
Serverless is based on breaking a big project into dozens of packages, each one represented by a top-level function that handles requests. Deploying a new version of a function means uploading a ZIP file to replace the previous one and updating the event configuration that specifies how this function can be triggered.
Executing this task manually, for dozens of functions, is an exhausting task. Automation is a must-have feature when working in a serverless project. In this book, we ll use the Serverless Framework that helps developers manage and organize solutions, making a deployment task as simple as executing a one-line command. With automation, continuous delivery is a feature that brings many benefits, such as the ability to deploy at any time, short development cycles, and easier rollbacks.
Another related benefit when the deployment is automated is the creation of different environments. You can create a new test environment, which is an exact duplicate of the development environment, using simple commands. The ability to replicate the environment is very important for building acceptance tests and to progress from deployment to production.
Microservices is a topic that will be better discussed later in this book. In short, a Microservices architecture is encouraged in a serverless project. As your functions are single units of deployment, you can have different teams working concurrently on different use cases. You can also use different programming languages in the same project and take advantage of emerging technologies or team skills.
Suppose that you have built an online store with serverless. The average user will make some requests to see a few products and a few more requests to decide whether they will buy something or not. In serverless, a single unit of code has a predictable time to execute for a given input. After collecting some data, you can predict how much a single user costs on average, and this unit cost will remain almost constant as your application grows in usage.
Knowing how much a single user costs and keeping this number fixed is very important for a startup. It helps to decide how much you need to charge for a service or earn through ads or sales to have a profit.
In a traditional infrastructure, you need to make payments in advance, and scaling your application means increasing your capacity in steps. So, calculating the unit cost of a user is a more difficult task and it s a variable number.
In the following diagram, the left-hand side shows traditional infrastructures with stepped costs and the right-hand side depicts serverless infrastructures with linear costs:
Serverless is great, but no technology is a silver bullet. You should be aware of the following issues:
- Higher latency
- Hidden inefficiencies
- Vendor dependency
- Debugging difficulties
- Atomic deploys
We will address these drawbacks in detail now.
Serverless is event-driven and so your code is not running all the time. When a request is made, it triggers a service that finds your function, unzips the package, loads it into a container, and makes it available to be executed. The problem is that those steps take time: up to a few hundreds of milliseconds. This issue is called a cold start delay and is a trade-off that exists between the serverless cost-effective model and the lower latency of traditional hosting.
Another solution is to benefit from the fact that the cloud provider may cache the loaded code, which means that the first execution will have a delay but further requests will benefit from a smaller latency. You can optimize a serverless function by aggregating a large number of functionalities into a single function. The benefit is that this package will be executed with a higher frequency and will frequently skip the cold start issue. The problem is that a big package will take more time to load and provoke a higher first start time.
As a last resort, you could schedule another service to ping your functions periodically, such as once every 5 minutes, to prevent putting them to sleep. It will add costs, but it removes the cold start problem.
There is also a concept of serverless databases that references services where the database is fully managed by the vendor, and it costs only the storage and the time to execute the database engine. Those solutions are wonderful, but they add a second layer of delay for your requests.
If you go serverless, you need to know what the vendor constraints are. For example, on AWS, you can't run a Lambda function for more than 5 minutes. It makes sense because if you spend long time running code, you are using it the wrong way. Serverless was designed to be cost efficient in short bursts. For constant and predictable processing, it will be expensive.
Another constraint on AWS Lambda is the number of concurrent executions across all functions within a given region. Amazon limits this to 1,000. Suppose that your functions need 100 milliseconds on average to execute. In this scenario, you can handle up to 10,000 users per second. The reasoning behind this restriction is to avoid excessive costs due to programming errors that may create potential runways or recursive iterations.
AWS Lambda has a default limit of 1,000 concurrent executions. However, you can file a case into AWS Support Center to raise this limit. If you say that your application is ready for production and that you understand the risks, they will probably increase this value.
When monitoring your Lambda functions using Amazon CloudWatch (more in Chapter 10, Testing, Deploying, and Monitoring), there is an option called throttles. Each invocation that exceeds the safety limit of concurrent calls is counted as one throttle. You can configure a CloudWatch alert to receive an e-mail if this scenario occurs.
Some people see serverless as a NoOps solution. That s not true. DevOps is still necessary. You don t need to worry much about servers because they are second-class citizens and the focus is on your business. However, adding metrics and monitoring your applications will always be a good practice. It s so easy to scale that a specific function may be deployed with a poor performance that takes much more time than necessary and remains unnoticed forever because no one is monitoring the operation.
Also, over or under provisioning is also possible (in a smaller sense) since you need to configure your function, setting the amount of RAM memory that it will reserve and the threshold to timeout the execution. It s a very different scale of provisioning, but you need to keep it in mind to avoid mistakes.
When you build a serverless solution, you trust your business to a third-party vendor. You should be aware that companies fail and you can suffer downtimes, security breaches, and performance issues. Also, the vendor may change the billing model, increase costs, introduce bugs into their services, have poor documentation, modify an API forcing you to upgrade, and terminate services. A whole bunch of bad things may happen.
What you need to weigh is whether it s worth trusting in another company or making a big investment to build everything by yourself. You can mitigate these problems by doing a market search before selecting a vendor. However, you still need to count on luck. For example, Parse was a vendor that offered managed services with really nice features. It was bought by Facebook in 2013, which gave more reliability due to the fact that it was backed by a big company. Unfortunately, Facebook decided to shut down all servers in 2016, giving one year of notice for customers to migrate to other vendors.
Vendor lock-in is another big issue. When you use cloud services, it s very likely that one specific service has a completely different implementation than another vendor, making those two different APIs. You need to rewrite code in case you decide to migrate. It s already a common problem. If you use a managed service to send e-mails, you need to rewrite part of your code before migrating to another vendor. What raises a red flag here is that a serverless solution is entirely based in one vendor, and migrating the entire codebase can be much more troublesome.
To mitigate this problem, some tools such as the Serverless Framework support multiple vendors, making it easier to switch between them. Multivendor support represents safety for your business and gives power to competitiveness.
Unit testing a serverless solution is fairly simple because any code that your functions rely on can be separated into modules and unit tested. Integration tests are a little bit more complicated because you need to be online to test using external services.
When it comes to debugging to test a feature or fix an error, it s a whole different problem. You can't hook into an external service to see how your code behaves step-by-step. Also, those Serverless APIs are not open sourced, so you can't run them in-house for testing. All you have is the ability to log steps, which is a slow debugging approach, or extract the code and adapt it to host into your own servers and make local calls.
Deploying a new version of a serverless function is easy. You update the code and the next time that a trigger requests this function, your newly deployed code will be selected to run. This means that, for a brief moment, two instances of the same function can be executed concurrently with different implementations. Usually, that s not a problem, but when you deal with persistent storage and databases, you should be aware that a new piece of code can insert data into a format that an old version can't understand.
Also, if you want to deploy a function that relies on a new implementation of another function, you need to be careful in the order that you deploy those functions. Ordering is often not secured by the tools that automate the deployment process.
The problem here is that current serverless implementations consider that deployment is an atomic process for each function. You can't batch deploy a group of functions atomically. You can mitigate this issue by disabling the event source while you deploy a specific group, but that means introducing downtime into the deployment process. Another option would be to use a Monolith approach instead of a Microservices architecture for serverless applications.
Serverless is still a pretty new concept. Early adopters are braving this field, testing what works, and which kind of patterns and technologies can be used. Emerging tools are defining the development process. Vendors are releasing and improving new services. There are high expectations for the future, but the future isn't here yet. Some uncertainties still worry developers when it comes to building large applications. Being a pioneer can be rewarding, but risky.
Technical debt is a concept that compares software development with finances. The easiest solution in the short run is not always the best overall solution. When you take a bad decision in the beginning, you pay later with extra hours to fix it. Software is not perfect. Every single architecture has pros and cons that append technical debt in the long run. The question is: how much technical debt does serverless aggregate to the software development process? Is it more, less, or equivalent to the kind of architecture that you are using today?
In this section, we ll cover which use cases fit better for the serverless context and which you should avoid. As the concept is still evolving, there are still unmapped applications, and you should not be restricted. So feel free to exercise your creativity to think and try new ones.
Let's see the following few examples of static websites:
- Company website
- Online documentation
Hosting static files on a storage system is by far the best solution. It s cheap, fast, scalable, and highly available. There is no drawback. There is no function with cold start, no debugging, no uncertainties, and changing vendors is an easy task.
If you are thinking of using WordPress to build a static site, please reconsider. You would need to spin up a server to launch a web server and a database that stores data. You start paying a few dollars per month to host a basic site and that cost greatly increases with your audience. For availability, you would add another machine and a load balance, and the billing would cost at least dozens of dollars per month. Also, as WordPress is so largely used, it s a big target for hackers and you will end up worrying about periodic security patches for WordPress and its plugins.
So, how should you build a static site with a serverless approach? Nowadays, there are dozens of tools. I personally recommend Jekyll. You can host on GitHub pages for free, use Disqus to handle blog comments, and easily find many other plugins and templates. For my personal blog, I prefer to use Amazon because of its reliability and I pay just a few cents per month. If you want, you can also add CloudFront, which is a Content Delivery Network (CDN), to reduce latency by approximating users to your site files.
Once you learn how to build a serverless website, it s extremely fast to transform an idea into a running service, removing the burden of preparing an infrastructure. Following the lean philosophy, your prototype reaches the market to validate a concept with minimum waste and maximum speed.
In this section, I ve used the qualifier small. This is because there are many studies that correlate the time to load a page with the probability for the customer to buy something. A few tens of milliseconds later may result in loss of sales. As already discussed, serverless brings reduced costs, but the cold start delay may increase the time to render a page. User-facing applications must consider whether this additional delay is worth it.
If the e-commerce sells to a small niche of customers in a single country, it s very likely that the traffic is concentrated during the day and reduces to almost nothing at late night. This use case is a perfect fit for serverless. Infrequent access is where most of the savings happens.
A real story to back this use case was described on Reddit. Betabrand, a retail clothing company, made a partnership with Valve to sell some products to promote one game. Valve created a blog post to advertise the deal and after a few minutes, the website broke because it couldn t handle the instant peak of a massive number of users. Valve pulled out the post and Betabrand had the mission to improve their infrastructure in one weekend.
Betabrand solved the problem building a small website using serverless. Valve advertised them again and they were able to handle 500,000 users in 24 hours, with peaks of 5,000 concurrent users. The post starts saying that it had an initial cost of only US$ 0.07, but it was corrected in comments to US$ 4.00 for backend and US$ 80.00 to transfer large (non-optimized) images, which is still an impressive low cost for such a high traffic (source: https://www.reddit.com/r/webdev/3oiilb).
Consider, in this section, websites that are built just for short events, like conferences, that receive a big number of visitors. They need to promote the event, display the schedule and maybe collect e-mails, comments, photos, and other kinds of data. Serverless helps handling the scale and provides a fast development.
Another related use case is for ticketing websites. Suppose that a huge, popular concert will start selling tickets at midnight. You can expect a massive number of fans trying to buy tickets at the same time.
A common example is a mobile application that sends an image to a RESTful service. This image is stored and it triggers a function that will process it to optimize and reduce its size, creating different versions for desktop, tablet, and phones.
Most chatbots are very simple and designed for specific use cases. We don't build chatbots to pass the Turing test. We don't want them to be so complex and clever as to deceive a human that it's another human talking. What we want is to provide a new user interface to make it easier to interact with a system under certain conditions.
Instead of ordering a pizza through an application using menus and options, you can type a message like "I want a small pepperoni pizza" and be quickly done with your order. If the user types "Will it rain today?", it is perfectly fine for the pizza chatbox to answer "I couldn't understand. Which kind of pizza do you want for today? We have X, Y, and Z." Those broad questions are reserved for multipurpose AI bots such as Siri, Cortana, or Alexa.
Considering this restricted scenario, a serverless backend can be pretty useful. In fact, there is a growing number of demos and real-world applications that are using serverless to build chatbots.
Internet of Things (IoT) is a trending topic and many cloud services are providing tools to easily connect a huge number of devices. Those devices usually need to communicate through a set of simple messages, and they require a backend to process them. Thinking of this use case, Amazon offers AWS IoT as a serverless service to handle the broadcast of messages and AWS Lambda for serverless processing. Configuring and managing those services is so easy that they are becoming a common choice for IoT systems.
You can set up your code to be executed on a regular, scheduled basis. Instead of running a dedicated machine to execute a code, one per hour, which creates some database reads or small file processing, you can use serverless and save costs.
Actually, that s a great way to introduce new features using serverless into a running solution since scheduled events are usually composed of simple tasks in separated modules.
There is a growing number of applications that are substituting traditional big data tools such as Hadoop and Spark for serverless counterparts. Instead of managing clusters of machines, you can create a big data pipeline, converting your input to data streams and loading chunks of data into concurrent serverless functions.
The benefit of this approach is the reduced management and ease of use. However, as you have a constant processing of data, you can expect higher costs. Also, on AWS, a Lambda function can't run for more than 5 minutes, and this limit may force changes to reduce chunks of data to smaller sizes before processing.
Avoid applications that have the following features:
- CPU-intensive with long running tasks
- Constant and with predictable traffic
- Real-time processing
- Multiplayer-intensive games
Regarding multiplayer games, you can build a serverless backend that handles the communication between players with very low latency through serverless notifications. It enables turn-based and card games, but may not fit so well for first person shooters, for example, which require constant and frequent server-side processing.
In this chapter, you learned about the serverless model and how it is different from other traditional approaches. You already know what the main benefits are and the advantages that it may offer for your next application. Also, you are aware that no technology is a silver bullet. You know what kind of problems you may have with serverless and how to mitigate some of them.
Now that you ve understood the serverless model, we are ready to dive into the tools and services that you can use to build a serverless application. In the next chapter, you will learn which services AWS offers that can be considered as serverless, followed by a brief explanation on how they work and a set of code examples.