A frequently quoted view is that there is no such thing as API security; it’s just an evolution of application security we have been practicing for the last two decades. However, I believe that it is a discrete and important discipline. Join me on the journey into API security.
APIs are the backbone of a modern digital economy, allowing the exchange of critical data and the interconnectivity of different systems. APIs are the fuel that have fired digital innovation for the last decade. Given the critical role of APIs in our digital world, it is vital that they are secure. This chapter sets out the foundational concepts of APIs, particularly in relation to security.
In this chapter, we will examine exactly what is meant by API security and understand the key elements of this exciting and emerging security domain, covering topics such as the following:
The Open Web Application Security Project (OWASP) published its first OWASP API Security Top 10 list in December 2019, and since then, the API security community has grown rapidly, with API security start-ups attracting significant investment and increasing interest from developers and security practitioners alike to learn resources on the topic. Unfortunately, during this period, there has also been a marked rise in the number of security incidents relating to insecure APIs. Recent analysis suggests a 681% rise in API attacks and that nearly one in two organizations has experienced a security incident related to APIs.
In a way, APIs are a victim of their own success – because of their rapid proliferation and the high economic value of the data they protect, they are now the most popular target for attackers.
We’ll now have a look at the so-called API economy and how the near-exponential growth in the number of APIs creates challenges for organizations as they become the favorite vector for attackers.
To fully appreciate the importance of API security, let us first consider the growth of the so-called API economy. Let’s understand a bit more about what is meant by this term – Forbes defines an API economy as “an enabler for turning a business or organization into a platform.” A platform can leverage APIs to do the following:
The API revolution has led to the emergence of API-first businesses such as Twilio and has allowed other organizations to expose their core offerings via APIs (Google Maps is a good example). The disruptive nature of the API economy is best seen in the financial services industry – typically, this has been an industry resistant to innovation due to regulatory and compliance requirements. By using APIs to expose selected core services, banks can embrace new models without disrupting their core IT systems. By adopting open standards that can be certified – such as the Open Banking API – banks can achieve interoperability while ensuring transactional integrity. The online money transfer service Wise uses APIs to provide B2B and B2C services and offers banking-as-a-service (BaaS) to third parties by renting out their APIs.
There are several key benefits to an API economy:
API adoption allows organizations to deliver more value and functionality while simultaneously reducing cost and time to market.
It is difficult to provide an accurate estimation of the scale of an API economy, and even if it was possible, this estimate would soon be invalidated due to the nearly exponential growth of the space. The API community at Nordic APIs (https://nordicapis.com/20-impressive-api-economy-statistics/) has produced a survey on the scale of the API economy; the following are some headline figures:
On the back of a growing API economy, major capital investment has poured into the market for API tool vendors, management platforms, and security tools.
The rapid adoption of APIs brings with it several challenges in addition to the benefits. The first challenge is that of inventory — because APIs can be easily built and deployed and have a finite lifetime, organizations are struggling to keep track of their API inventory, resulting in shadow (hidden) and zombie (outdated) APIs.
The second challenge is that of governance — as APIs proliferate, organizations face challenges with governing the development and deployment process, ensuring that data and privacy requirements are met and that the API life cycle is managed from cradle to grave.
The biggest challenge, however, is that of security. As noted earlier in this section, APIs can reduce an organization’s overall attack surface; however, this comes at the cost of a new security paradigm – APIs are a new attack surface, and the threats are different. In the next section, we’ll explore these security challenges in more detail.
Developers love APIs — nearly all developers work with APIs and nearly all modern architectures are API-centric. While containerization has driven the breakdown of the monolith and the emergence of microservices, it is APIs that form the connecting tissue between these services.
The benefit of APIs to developers are numerous, including the following:
These factors have fueled the API-first paradigm where applications are built in a bottom-up approach, starting with the APIs, then the business logic, and the user interface (UI) last.
While APIs are undoubtedly popular with developers, they are even more popular with attackers. Gartner reports that APIs are the number one attack vector for cybercriminals in 2022, and barely a week goes by without an API breach or vulnerability being disclosed.
There are several key reasons why APIs are a favored attack target:
The relatively recent emergence of APIs as the de facto conduit for application connectivity poses significant challenges to security teams and testers. Much of the existing application security (AppSec) tooling that exists was designed in an era when web applications were the primary asset to be protected. Common security tools such as static application security testing (SAST), dynamic application security testing (DAST), and software composition analysis (SCA) are far less effective in assessing APIs than they are with web or mobile applications.
Traditional perimeter protections such as network firewalls or web application firewalls (WAFs) are ineffective in protecting APIs, since they lack the context of the API interface and the expected request and response traffic. Such tools tend to be high in both false positives and false negatives.
More modern API technologies, such as API management portals (APIMs) and gateways, are essential for the operation of APIs at scale, but while they do provide security features, they do not address all attack vectors.
The key takeaway is that while tools are important as part of a defense strategy, they need to be augmented by solid defensive design and coding techniques — this is the focus of the final section of this book.
It is important to understand why insecure code exists in the first place if we want to address the problem.
Developers are, by nature, creative problem solvers who thrive on a challenge – unfortunately, this leads them to be over-optimistic, which can lead them to take shortcuts and optimizations, or perhaps work to unrealistic delivery schedules. This is so-called happy path coding, where developers do not fully appreciate how their code could fail or be misused by an attacker, sometimes with dire consequences.
Coupled with over-optimism is a sense of over-confidence – developers will assume they fully understand a problem but may be unaware that they are missing some crucial detail or subtlety, which again can have adverse effects. An example is the adoption of a new API framework and not carefully considering the default settings and deploying a vulnerable product.
Developers will often have a misplaced sense that bad things only happen to other people and not them. Despite witnessing examples of well-known breaches, many developers believe they will never fall victim to a similar misfortune. This general phenomenon is known as the schadenfreude effect.
The development process can be stressful with constant pressure to deliver to schedule, and this can result in compromising full implementation in favor of meeting deadlines. For example, this can include the omission of error handling code or data validation with the intent of coming back to implement them in later releases. With time pressures, this rarely happens, and code is often left in an incomplete state.
Often, developers inherit a code base to maintain that may contain significant technical debt or legacy code. Without a full understanding of the system and its complexities and foibles, developers may be disinclined to make changes to the code base in case they break functionality.
Before we can understand how to secure APIs, we need to dive into the building blocks of APIs. This section will cover the somewhat challenging topics of cryptography, hashing and signatures, encoding, and transport layer security. We will not go into a lot of detail, but it is important to grasp these basics.
Public APIs are exposed to the internet and can easily be discovered by adversaries. One of the simplest attacks against an API is a denial-of-service (DoS) attack, in which automation is used to repeatedly and persistently attempt to access an API. Sustained DoS attacks can lead to the exhaustion of server resources, leading to a failure of the API or, most commonly, denying legitimate access to the API.
Brute-force attacks can also be used in account takeover (ATO) attacks, where either a sign-up endpoint or a password reset endpoint are flooded in attempts to guess passwords or hashes, using a dictionary attack (where a list of commonly used passwords is used).
Both types of attacks can be mitigated using rate-limiting technology, which limits repeated and frequent access from a particular IP address to a given endpoint. Rate-limiting applies a timeout window on the transactions and will return a 429 Too Many
Requests
error.
Cryptography is a foundational element in securing data electronically – most simply, it is a mathematical transformation applied to data. Typically, cleartext (unencrypted) is transformed into cyphertext (encrypted), using an algorithm and a key. The cyphertext is no longer recognizable as the original cleartext and cannot be reverse-engineered to reveal the original cleartext without using the inverse transformation (decrypted), using the same algorithm and the key.
The choice of algorithm depends on the application; two broad types of algorithms are used:
A fundamental challenge with cryptography is the exchange (and management) of keys between both parties. To this end, robust key-exchange protocols have been developed to securely exchange keys that prevent an eavesdropper from accessing keys in transit. The Diffie-Hellman exchange is the most used protocol.
Cryptography provides the following benefits:
An important application of cryptography principles relates to ensuring the integrity of messages in transit.
Hashes are the most elementary technique, in which a block of data is passed through an algorithm to produce a digest of the data; typically this is a fixed-length string much shorter than the input data. Common hashing algorithms include SHA2 and MD5. Key properties of hash functions are that they are one-way functions or irreversible (the input cannot be obtained from the digest, and the digests are unique so that no two blocks of data will produce the same digest). Hashes are used to verify the integrity of data.
Hashed Authentication Message Codes (HMACs) are similar to hashes in that they produce a digest that is then encrypted with a symmetric algorithm and passed to the recipient. If the recipient has the correct key, they can decrypt the digest and verify the integrity of the data, and also the authenticity of the sender (via their shared key).
Signatures are the final piece of the puzzle – similar to HMACs, these use an algorithm to encrypt the digest; however, in this case, it is an asymmetric algorithm. The private key is used for encryption, and at the receiver end, the public key of the sender is used to decrypt and verify the integrity. Using robust principles of public key infrastructure (PKI), public keys can be trusted (their ownership can be verified).
The following table summarizes the differences between the three types:
Objective |
Hash |
HMAC |
Signature |
Integrity |
|
|
|
Authenticity |
|
|
|
Non-repudiation |
|
Table 1.1 – A comparison of digest types
The TLS protocol is a transport-level cryptographic protocol to ensure secure communications over a TCP/IP network. An encrypted transport layer is essential for APIs to ensure that attackers are unable to eavesdrop on data or tokens over the network and to ensure that the client can validate the identity of the server (via certificate validation). Certificate management has usually presented challenges to organizations; however, with the emergence of providers such as Let’s Encrypt, certificate deployment and management have become a lot simpler.
The final building block is that of encoding, which involves changing the representation of data for the purposes of storage or transmission. Encoding converts the character set of input data to a format that can be safely stored or transmitted, and decoding converts that data back to its original format.
This concept is best understood by looking at a few common encoding schemes:
<
and >
. If a text block contains these characters, it will change the structure of the rendered HTML, which is undesirable. By encoding these special characters in another format ("<"
and ">"
), they can be safely rendered in an HTML document, where they will be displayed correctly as <
and >
but stored in a different form.%20
.Encoding does not use a key to perform the transformation but, rather, a fixed algorithm, and any content that has been encoded can be decoded to produce exactly the same original content.
Encoding versus encryption
A common misunderstanding is the difference between encoding and encryption. They are two very different topics, solving different problems.
Encoding transforms data from one representation to another using a fixed algorithm. No keys are used, and the encoded data can be trivially converted back to the original format. It does not offer any form of integrity or confidentiality functions.
Encryption performs a transformation of data using a key; the resultant output does not resemble the input at all, and the only way the original data can be obtained is by applying the reverse decryption function using the same key. Encryption does not transform the representation or character set of the data.
Finally, in this section, let’s take a quick look at common data formats used in APIs. For REST APIs, information is transferred in plain text format (although this information may be encoded), either as key-value pairs as request parameters, one or more headers, or as an optional request body. Responses consist of a status and an optional response body.
eXtensible Markup Language (XML) is the original heavyweight format for internet data storage and transmission. The format is designed to be agnostic of data type, separates data from presentation, and is of course extensible, not being reliant on any strict schema definition (unlike HTML, which uses fixed tags and keywords).
Although XML was dominant several years ago, it suffered from some significant drawbacks, namely complexity and large data payloads. These two factors make it difficult to process and parse XML on resource-limited systems. XML is still encountered, although much less so in APIs.
A simple example of XML shows the basic structure of tags and values:
<note> <to>Colin</to> <priority>High</priority> <heading>Reminder</heading> <body>Learn about API security</body> </note>
Javascript Object Notation (JSON) is now the dominant transfer format for data over HTTP, particularly in REST APIs. JSON originated as a lightweight alternative to the more heavyweight XML format, being particularly efficient with transmission bandwidth and client-side processing.
Data is represented by key-value pairs, with integer, null, Boolean, and string data types supported. Keys are delimited with quotes, as are strings. Records can be nested, and array data is supported. Comments are not permitted in JSON data.
A simple example of JSON shows the key-value pair structure:
{ "name": "Colin", "age": 52, "car": null }
YAML Ain’t Markup Language (YAML) is another common internet format, similar to JSON in its design goals. YAML is in fact a superset of JSON, with the addition of some processing features. JSON can be easily converted to YAML, and often, they are used interchangeably, depending on personal preference, particularly for OpenAPI definitions.
The same data from the JSON example can be expressed in YAML as follows:
--- name: Colin age: 52 car:
The final format we need to understand is the OpenAPI Specification (OAS), which is a human-readable (and machine-readable) specification for defining the behavior of an API. The OpenAPI Specification is an open standard run under the auspices of the OpenAPI Initiative. Previously, the standard was known as Swagger (aka version 2) but has now been formalized into an open standard, and currently, version 3.0 is in general use, with version 3.1 due imminently at the time of writing.
An OAS definition can be expressed either as YAML or JSON and comprises several sections, as shown here:
Figure 1.1 – OpenAPI Specification sections
Using an OAS definition at the inception of the API life cycle (referred to as design-first) offers several key benefits, namely the following:
A sample OAS definition is shown in the following snippet. This is an example of a bare-minimum specification of an API and includes the following in the header section:
{ "openapi": "3.0.0", "info": { "version": "1.0.0", "title": "Swagger Petstore", "license": { "name": "MIT" } }, "servers": [ { "url": http://petstore.swagger.io/v1 }], ..
The next section in the OAS definition describes an endpoint, showing details such as the following:
"paths": { "/pets": { "get": { "summary": "List all pets", "operationId": "listPets", "parameters": [ { "name": "limit", "in": "query", "description": "Maximum items (max 100)", "required": false, "schema": { "type": "integer", "format": "int32" } } ], "responses": { "200": { "description": "A paged array of pets", "headers": { "x-next": { "description": "Next page", "schema": { "type": "string } } }, "content": { "application/json": { "schema": { "$ref": "#/components/schemas/Pets" } } } }, ..
At this point, we understand the building blocks of APIs and the associated data formats. It is now time to look at the elements of API security.
API security is a complex topic and comprises many elements — a successful API security initiative should be built upon a solid foundation of a DevOps practice and a balanced AppSec program. Just like a house, the strength of the overall structure is dependent on a solid foundation – without these in place, an API security initiative may prove challenging.
Good security is built on a multi-layer system – this is the defense in-depth approach.
It is important to remember that API security is quite different from what has come before with web application security. This means that using existing tools and practices may be insufficient to produce secure APIs. Dedicated API security solutions must be deployed in addition to traditional AppSec tools to provide the optimum coverage and protection specific to APIs.
The elements of the API security hierarchy are shown here:
Figure 1.2: The elements of API security
Let’s explore each of the layers of API security briefly.
DevOps is a well-established set of practices to facilitate modern software systems, characterized by close relationships between the development and operations teams to improve methodology and practices and leverage the benefits of automation. DevOps is considered a continuous process with continuous improvements across several key domains in the Software Development Lifecycle (SDLC), as shown here:
Figure 1.3: The DevOps cycle
DevOps offers many benefits to the delivery of software, including the following:
From the perspective of API security, the key benefit of DevOps is the ability to build APIs in a deterministic fashion using a standard process. Using standard Continuous Integration / Continuous Delivery (CI/CD) pipelines, API security testing and validation tooling can be injected into the build process to ensure that all deployed APIs have had the specified security checks and controls applied to them. APIs by their nature are well suited to automated testing, and the CI/CD pipeline is the ideal place for this activity.
Static application security testing (SAST), dynamic application security testing (DAST), software composition analysis (SCA), and web application firewalls (WAFs) form the vanguard of traditional application security programs.
The security of any software can be improved by the judicious use of such tools, as follows:
SAST can detect common coding vulnerabilities in API code (such as injection flaws) but will not detect API-specific flaws (such as broken authentication or authorization), since the SAST engine does not have contextual awareness of the underlying API code. Similarly, DAST is able to detect certain API vulnerabilities (such as a lack of rate limiting) but lacks the context to understand the API requests and responses.
WAFs are a mature technology for protecting web applications and offer some protection for APIs as well. They operate in line with traffic utilizing a so-called allow list to block suspected malicious traffic and allowing everything else. They can be configured to operate in monitor mode (passive) or blocking mode (active).
Organizations typically have dedicated security teams tasked with deploying and operating these tools within development teams. These teams should evaluate dedicated API security tools to complement some of the gaps that exist with these tools.
API gateways are the workhorse of the API industry, providing a unified external interface to public clients and traffic routing to the relevant internal API backends after having performed transformation and conversion. Gateways are also responsible for network-level controls such as SSL termination, rate-limiting, IP address restrictions, and load balancing. Gateways can also implement security features such as JWT validation and identity management.
Some of the shortcomings of API gateways include the following:
Typically, API management portals provide a level of API management on top of a gateway, allowing organizations to control their inventory, versioning, life cycle, and end-user experience by providing API catalogs.
Some of the shortcomings of API management platforms include the following:
Both API management portals and gateways are vital components of an API security strategy, but their limitations should be borne in mind as part of the overall strategy.
The growth of API adoption has spawned several dedicated API security platforms, with the specific intent of addressing API security as a first-class citizen.
These platforms take different perspectives of securing APIs, including the following:
Dedicated API security tools are vital to providing the final layer of API security. Now that we understand the elements of API security, let us conclude this chapter by setting API security goals.
Finally, in this chapter, let’s focus on the security goals that should be considered in API security initiatives. Different organizations will have different security priorities based on their business priorities – a financial service organization will favor high levels of security and strict governance, while a social media portal may have lower security requirements and favor feature delivery instead. No two organizations have the same goals.
The term API security has a broad scope, meaning different things to the beholder. IT security has traditionally used the CIA triad to characterize risks to systems. CIA is an acronym for Confidentiality, Integrity, and Availability, and has applications in APIs as follows:
While a useful framework for considering API security, it should be considered in combination with the OWASP API Security Top 10 covered in Chapter 3.
When considering API security, we primarily consider hacks or breaches where an adversary deliberately attacks an API and causes it to misoperate, due to inherent flaws. Such attacks are deliberately focused on using techniques we will explore in the Attacking APIs section.
However, there is another category of API security risk to be considered – namely, the abuse and misuse of APIs. Typically, in this category, we consider automated scripting, bot attacks, scrapers, and nuisance actors. While they do not have a high-risk rating (according to the CIA triad, for instance), they can have detrimental consequences for organizations.
Some typical examples include the following:
Some of these types of abuse cases can be relatively difficult to either defend against (because they appear to be no different from normal users) or to detect.
Data governance is tangential to API security but a key consideration for a holistic API security strategy. APIs are primarily conduits for data transfer between internal systems or organizations and consumers or partners. APIs simplify the ability of developers to expose increasing amounts of data almost at the click of a button. However, with this ease comes an increased risk of inadvertent or unintended data leakage, causing regulatory and compliance concerns.
A solid data governance program is essential to ensure that consumers (typically, API developers in this context) have full awareness of the data sensitivity and classification and apply the relevant controls to limit access, in line with regulatory and compliance concerns.
This is particularly applicable to the financial services and the medical industry, which increasingly face data disclosures via APIs.
Unlike web or mobile applications, APIs present a tremendous opportunity to radically shift the security paradigm. Traditionally, web or mobile security has relied on a negative security model, which means that a known bad actor is blocked, and everything else is allowed. Here, a deny list approach is used.
This approach – while long-established – has a significant disadvantage in that defenders do not know the full extent of all known bad actors. Clever attackers can construct payloads or inputs that appear to be valid inputs passing through the deny list; however, in the context of the application, they are dangerous. Think of the example of SQL Injection (SQLi) attacks where seemingly innocuous input is applied to a database, where it can have catastrophic consequences. The negative security model is characterized by both high false positives and false negatives.
API security turns this model around entirely, relying instead on an allow list that passes only known good actors to the API backend. This is the positive security model, which only allows data and operations specified by the OpenAPI contract to access the API. Anything else is simply blocked (via an API firewall, for example) before reaching the API. This approach offers a massive benefit for security — the instances of both false positives and negatives are greatly reduced. The positive security model has one major drawback, however – it is reliant on a fully formed OpenAPI contract to operate correctly. This may be challenging for organizations not embracing an API design-first approach. However, the positive security model promises to be game-changing for the world of API security.
Finally, let’s conclude with an approach for prioritizing API security initiatives, which can be costly and time-consuming in large organizations. Probably the most frequently asked question is “Where do I start?” – security leaders are often stuck in a quandary when faced with a choice of trying to address their entire API portfolio (at great cost and a higher likelihood of failure), erroneously focusing on less important APIs (and wasting valuable security resources), or in extreme cases, simply not starting at all due to the enormity of the undertaking.
A common-sense approach to prioritizing an API security initiative is to use a risk-based methodology – start with the highest-risk APIs and work through to the lower risks, as budget permits.
Priority is dependent on several parameters, typically the following:
Combining these three factors allows us to gain an approximate risk-based priority:
Figure 1.4 – Prioritizing API security via a risk profile
As a (slightly contrived) example, an unauthenticated API on a public network conveying medical records scores the maximum risk and, hence, becomes a priority.
While this is, at best, an approximate risk rating, it serves well to focus security activities where they will achieve the maximum return on investment.
We have covered a lot in this first chapter. APIs are the lifeblood of a modern application economy and, unfortunately, are a favorite target for attackers due to the high value of data that they convey. We now understand the core building blocks of APIs, how developers are vital in the journey to building secure APIs, and the various elements of a secure API initiative.
In the next chapter, we are going to explore in greater detail exactly how an API is constructed, how they work, and critically, how they are secured.
Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.
If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.
Please Note: Packt eBooks are non-returnable and non-refundable.
Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:
If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:
Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.
You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.
Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.
When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.
For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.