The Role of Cryptography in the Connected World
In this introductory chapter, we try to provide some answers to the following questions:
Why are there so many insecure IT systems?
How can cryptography help to mitigate our security problems?
Our core argument is that the simultaneous growth of connectivity and complexity of IT systems has led to an explosion of the attack surface, and that modern cryptography plays an important role in reducing that attack surface.
After a brief discussion of how the field of cryptography evolved from an exotic field appreciated by a select few to an absolutely critical skill for the design and operation of nearly every modern IT product, we will look at some recent real-world security incidents and attacks in order to illustrate our claim. This will allow you to understand why cryptography matters from a higher strategic perspective.
1.1 Evolution of cryptography
Over the past four decades or so, cryptography has evolved from an exotic field known to a select few into a fundamental skill for the design and operation of modern IT systems. Today, nearly every modern product, from the bank card in your pocket to the server farm running your favorite cloud services, requires some form of cryptography to protect it and its users against cyberattacks. Consequently, it has found its way into mainstream computer science and software engineering.
Figure 1.1: Number of publications at IACR conferences on cryptology over the years
Cryptography and its counterpart cryptanalysis were basically unknown outside of military and intelligence services until the mid-1970s. Cryptography is the practice and study of techniques for secure communication in the presence of adversaries; it deals with the development and application of cryptographic mechanisms. Cryptanalysis is the study of cryptographic mechanisms’ weaknesses, aimed at finding mathematical ways to render these mechanisms ineffective. Taken together, cryptography and cryptanalysis form what’s called cryptology.
In 1967, David Kahn, an American historian, journalist, and writer, published a book titled The Codebreakers – The Story of Secret Writing, which is considered to be the first extensive treatment and a comprehensive report of the history of cryptography and military intelligence from ancient Egypt to modern times. Kahn’s book introduced cryptology to a broader audience. Its content was, however, necessarily restricted to symmetric cryptography. In symmetric cryptography, the sender and receiver of a message share a common secret key and use it for both encrypting and decrypting. The problem of how sender and receiver should exchange the secret in a secure way was considered out of scope.
This changed in 1976, when the seminal paper New Directions in Cryptography by Whitfield Diffie and Martin Hellman appeared in volume IT-22 of IEEE Transactions on Information Theory. In that publication, Diffie and Hellman described a novel method for securely agreeing on a secret key over a public channel based on the so-called discrete logarithm problem. Moreover, they suggested for the first time that the sender and receiver might use different keys for encrypting (the public key) and decrypting (the private key) and thereby invented the field of asymmetric cryptography.
Figure 1.2: From left to right: Ralph Merkle, Martin Hellman, Whitfield Diffie 
While there were scientific works on cryptography dating back to the early 1970s, the publication by Diffie and Hellman is the first publicly available paper in which the use of a private key and a corresponding public key is proposed. This paper is considered to be the start of cryptography in the public domain. In 2002, Diffie and Hellman suggested their algorithm should be called Diffie-Hellman-Merkle key exchange because of Ralph Merkle’s significant contribution to the invention of asymmetric cryptography.
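The key agreement idea described above can be sketched with modular exponentiation. This is a toy illustration only: the prime P, the base G, and the function names are assumptions chosen for readability, and a 32-bit prime offers no real security (practical systems use groups of 2048 bits or more, or elliptic curves).

```python
import secrets

# Toy public parameters (assumptions, not taken from the text).
P = 4294967291  # the largest prime below 2**32, far too small for real use
G = 5           # public base; not checked to be a generator of the group


def keypair():
    """Return (private exponent x, public value G^x mod P)."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private, peer_public):
    """Combine one's own private exponent with the peer's public value."""
    return pow(peer_public, own_private, P)


sender_private, sender_public = keypair()
receiver_private, receiver_public = keypair()

# Only sender_public and receiver_public travel over the public channel.
sender_secret = shared_secret(sender_private, receiver_public)
receiver_secret = shared_secret(receiver_private, sender_public)

# Both sides compute G^(a*b) mod P, so the values agree.
print(sender_secret == receiver_secret)  # prints True
```

An eavesdropper sees P, G, and both public values. Recovering a private exponent from them is an instance of the discrete logarithm problem, which is easy for this toy prime and believed to be infeasible for properly sized groups.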
In 1977, the three MIT mathematicians Ron Rivest, Adi Shamir, and Len Adleman took up the suggestion by Diffie and Hellman and published the first asymmetric encryption algorithm, the RSA algorithm, which is based on yet another well-known mathematical problem, the factoring problem for large integers.
Figure 1.3: From left to right: Adi Shamir, Ron Rivest, Len Adleman 
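To make the link between RSA and factoring concrete, here is a minimal textbook sketch with tiny primes. All concrete numbers and names are illustrative assumptions. The sketch omits padding and uses a modulus that can be factored instantly, so it shows the principle rather than a usable scheme.

```python
# Textbook RSA with toy numbers (assumptions chosen for illustration).
p, q = 61, 53
n = p * q                # public modulus: 3233
phi = (p - 1) * (q - 1)  # 3120, kept secret
e = 17                   # public exponent, coprime to phi
d = pow(e, -1, phi)      # private exponent (requires Python 3.8+)

message = 65
ciphertext = pow(message, e, n)    # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)  # decrypt with the private key (d, n)
print(recovered == message)        # prints True


def factor(modulus):
    """Trial division; feasible only because the modulus is tiny."""
    candidate = 2
    while modulus % candidate:
        candidate += 1
    return candidate, modulus // candidate


# Anyone who can factor n can rebuild the private exponent.
found_p, found_q = factor(n)
attacker_d = pow(e, -1, (found_p - 1) * (found_q - 1))
print(attacker_d == d)             # prints True
```

The second half is the reason the modulus must be large: the private key stays private only as long as factoring n is out of reach.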
The invention of asymmetric cryptography did not make symmetric cryptography obsolete. On the contrary, both fields have complementary strengths and weaknesses and can be efficiently combined in what are today called hybrid cryptosystems. The Transport Layer Security (TLS) protocol is a very good example of a hybrid cryptosystem.
Today, cryptography is a well-known (albeit mostly little understood in depth) topic in the IT community and an integral part of software development. As an example, as of July 2022, the OpenSSL library repository on GitHub contains over 31,500 commits by 686 contributors. Cryptography is also an integral part of numerous computer science and information security curricula, and numerous universities all over the world offer degrees in information security.
Why did this happen, and which factors led to this development and popularized cryptography within a comparably short period of time? To a large extent, this paradigm shift is a result of three—arguably still ongoing—developments in information technology that radically changed the role of cryptography in the modern connected world:
The advent of the internet and the ever increasing need to transfer large amounts of data over untrusted channels, which also fostered the development of TLS
The introduction of connectivity into nearly every new product, from toothbrushes to automobiles
The ever increasing complexity of IT systems, specifically increasing hardware and software complexity
1.2 The advent of TLS and the internet
We’ll now turn to the original theme of this book, TLS and the cryptographic tools it is made of. TLS is a protocol designed to protect data sent over the internet, so we’ll start with a brief look into the early history of the internet.
Despite its origins as a research project financed by the Defense Advanced Research Projects Agency (DARPA), the research agency of the Department of Defense of the United States, most of the main physical components of the internet, such as cables, routers, gateways, and so on, can be (and are) accessed by untrusted third parties. In the early days of the internet, this was not considered a problem, and very few (if any) security measures were introduced into TCP and IP, the internet’s main protocol workhorses, and none of them involved cryptography. However, with more and more people using the internet, and the ever increasing available bandwidth, more and more services kept appearing on the internet, and it was quickly realized that to do real business over the internet, a certain amount of trust was needed that sensitive data such as credit card numbers or passwords did not fall into the wrong hands. Cryptography provides the answer to this problem, because it can guarantee confidentiality (i.e., no one can read the data in transit) and authenticity (i.e., you can verify that you are talking to the right party). TLS and its predecessor SSL are the protocols that implement cryptography on the internet in a secure, usable way.
Starting in 1995, SSL was shipped together with Netscape Navigator to clients. While server-side adoption of SSL was slow at first, by the end of 2021, according to the Internet Security Research Group (ISRG), 83% of web pages loaded by Firefox globally used HTTPS, that is, HTTP secured via TLS.
Figure 1.4: Percentage of web pages loaded by Firefox using HTTPS 
This is a huge success for TLS and the field of cryptography in general, but with it also comes a huge responsibility: we need to constantly monitor whether the algorithms, key lengths, modes of operations, and so on used within TLS are still secure. Moreover, we need to understand how secure algorithms work and how they can interact with each other in a secure way so that we can design secure alternatives if needed.
Maybe we should already stress at this early stage that TLS is not a remedy for all the problems mentioned here. TLS provides channel-based security, meaning that it can only protect data in transit between a client and a server. TLS is very successful in doing so, and how in detail TLS uses cryptography to achieve this goal is the main theme of this book. However, once the data leaves the secure channel, it is up to the endpoints (i.e., client and server) to protect it.
Moreover, cryptography by itself is useless in isolation. To have any practical effect, it has to be integrated into a much larger system. And to ensure that cryptography is effectively protecting that system, there must be no security holes left that would allow an attacker to circumvent its security.
There is a well-known saying among cybersecurity professionals that the security of a system is only as strong as its weakest link. Because there are so many ways to circumvent security – especially in complex systems – cryptography, or rather the cryptographic primitives a system uses, is rarely the weakest link in the chain.
There is, however, one important reason why cryptography is fundamental for the security of information systems, even if there are other security flaws and vulnerabilities. An attacker who is able to break cryptography cannot be detected because a cryptanalytic attack, that is, the breaking of a cryptographic protocol, mechanism or primitive, in most cases leaves no traces of the attack.
If the attacker’s goal is to read the communication, they can simply passively listen to the communication, record the messages and decrypt them later. If the attacker’s goal is to manipulate the target system, they can simply forge arbitrary messages and the system will never be able to distinguish these messages from benign ones sent by legitimate users.
While there are many other sources of insecurity (e.g., software bugs, hardware bugs, and social engineering), the first line of defense is arguably secure communication, which in itself requires a secure channel. And cryptography as a scientific discipline provides the building blocks, methods, protocols, and mechanisms needed to realize secure communication.
1.3 Increasing connectivity
At the same time, connectivity makes it much harder to build secure systems. Similar to Ferguson and Schneier’s argument on security implications of complexity, one can say that there are no connected systems that are secure. Why? Because connecting systems to large, open networks like the internet exposes them to remote attacks. Remote attacks – unlike attacks that require physical access – are much more compelling from the attacker’s perspective because they scale.
1.3.1 Connectivity versus security – larger attack surface
While connectivity enables a multitude of desired features, it also exposes products to remote attacks carried out via the internet. Attacks that require physical access to the target device can only be executed by a limited number of attackers who actually have access to that device, for example, employees of a company in the case of devices in a corporate network. In addition, the need for physical access generally limits the attacker’s window of opportunity.
Connectivity, in contrast, exposes electronic devices and IT systems to remote attacks, leading to a much higher number of potential attackers and threat actors. Moreover, remote attacks – unlike attacks that require physical access to the target – are much more compelling from the attacker’s perspective because they scale.
Another aspect that makes remote attacks practical (and, to a certain extent, rather easy) is the fact that the initial targets are almost always the network-facing interfaces of the devices, which are implemented in software. As we have seen, complex software is almost guaranteed to contain numerous implementation bugs, a number of which can be typically exploited to attack the system. Thus, the trend of increasing software and system complexity inadvertently facilitates remote attacks.
1.3.2 Connectivity versus marginal attack cost
Remote attacks are easy to launch – and hard to defend against – because their marginal cost is essentially zero. After a newly discovered security vulnerability is initially translated into a reliably working exploit, the cost of replicating the attack on an additional 10, 100, or 100,000 devices is essentially the same, namely close to zero.
This is because remote attacks are implemented purely in software, and reproducing software as well as accessing devices over public networks effectively costs close to nothing. So, while businesses need to operate large – and costly – internal security organizations to protect their infrastructure, services, and products against cybersecurity attacks, any script kiddie can try to launch a remote attack on a connected product, online service, or corporate infrastructure essentially for free.
1.3.3 Connectivity versus scaling attacks
To summarize, connectivity exposes devices and IT systems to remote attacks that target network-facing software (and, thus, directly benefit from the continuously increasing software complexity), are very cheap to launch, can be launched by a large number of threat actors, and have zero marginal cost.
In addition, there exists a market for zero-day exploits that allows even script kiddies to launch highly sophisticated remote attacks that infect target systems with advanced malware able to open a remote shell and completely take over the infected device.
1.4 Increasing complexity
While it can be argued that the problem of increasing complexity is not directly mitigated by modern cryptography (in fact, many crypto-related products and standards suffer from this problem themselves), there is no doubt that increasing complexity is in fact a major cause of security problems. We included the complexity problem in our list of crucial factors for the development of cryptography, because cryptography can help limit the damage caused by attacks that were in turn caused by excessive complexity.
Following Moore’s law, a prediction made in 1965 by Gordon Moore, the co-founder of Fairchild Semiconductor and Intel, the number of transistors in an integrated circuit, particularly in a microprocessor, kept doubling roughly every 2 years (see Figure 1.5).
Figure 1.5: Increasing complexity of hardware: Transistors. Data is taken from https://github.com/barentsen/tech-progress-data
Semiconductor manufacturers were able to build ever bigger and ever more complex hardware with ever more features. This went so far that in the late 1990s, the Semiconductor Industry Association set off an alarm in the industry when it warned that productivity gains in Integrated Circuit (IC) manufacturing were growing faster than the capabilities of Electronic Design Automation (EDA) tools used for IC design. Entire companies in the EDA area were successfully built on this premise.
Continuously growing hardware resources paved the way for ever more complex software with ever more functionality. Operating systems became ever more powerful and feature-rich, the number of layers in software stacks kept increasing, and software libraries and frameworks used by programmers became ever more comprehensive. As predicted by a series of software evolution laws formulated by early-day computer scientists Manny Lehman and Les Belady, software exhibited continuing growth and increasing complexity (see also Figure 1.6).
Figure 1.6: Increasing complexity of software: Source Lines of Code (SLOC) in operating systems. Data is taken from https://github.com/barentsen/tech-progress-data
Why should increasing complexity be a problem? According to leading cybersecurity experts Bruce Schneier and Niels Ferguson, “Complexity is the worst enemy of security, and it almost always comes in the form of features or options.”
While it might be argued whether complexity really is the worst enemy of security, it is certainly true that complex systems, whether realized in hardware or software, tend to be error-prone. Schneier and Ferguson even claim that there are no complex systems that are secure.
Complexity negatively affects security in several ways, including the following:
Insufficient testability due to a combinatorial explosion given a large number of features
Unanticipated—and unwanted—behavior that emerges from a complex interplay of individual features
A high number of implementation bugs and, potentially, architectural flaws due to the sheer size of a system
1.4.1 Complexity versus security – features
The following thought experiment illustrates why complexity arising from the number of features or options is a major security risk. Imagine an IT system, say a small web server, whose configuration consists of 30 binary parameters (that is, each parameter has only two possible values, such as on or off). Such a system has more than a billion possible configurations. To guarantee that the system is secure under all configurations, its developers would need to write and run several billion tests: one test for each relevant type of attack (e.g., Denial-of-Service, cross-site scripting, and directory traversal) and each configuration. This is impossible in practice, especially because software changes over time, with new features being added and existing features being refactored. Moreover, real-world IT systems have significantly more than 30 binary parameters. As an example, the NGINX web server has nearly 800 directives for configuring how the NGINX worker processes handle connections.
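The arithmetic behind this thought experiment can be checked in a few lines. The count of three attack types simply takes the three examples named above, and the throughput of 1,000 tests per second is an arbitrary assumption used only to give a feel for the scale.

```python
# Configuration space of a system with 30 on/off parameters.
binary_parameters = 30
configurations = 2 ** binary_parameters

# One test per (attack type, configuration) pair.
attack_types = 3          # DoS, cross-site scripting, directory traversal
tests_needed = configurations * attack_types

# Assumed throughput, purely for illustration.
tests_per_second = 1_000
days_for_one_pass = tests_needed / tests_per_second / 86_400

print(f"{configurations:,} configurations")
print(f"{tests_needed:,} tests")
print(f"{days_for_one_pass:.1f} days for a single full pass")
```

Every additional binary parameter doubles these figures, and the whole pass would have to be repeated after each change to the software.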
1.4.2 Complexity versus security – emergent behavior
A related phenomenon that creates security risks in complex systems is unanticipated emergent behavior. Complex systems tend to have properties that their parts do not have on their own, that is, properties or behaviors that emerge only when the parts interact. Prime examples of security vulnerabilities arising from emergent behavior are time-of-check-to-time-of-use (TOCTOU) attacks exploiting concurrency failures, replay attacks on cryptographic protocols where an attacker reuses an out-of-date message, and side-channel attacks exploiting unintended interplay between micro-architectural features for speculative execution.
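The replay attack mentioned above illustrates how a flaw can emerge from the interplay of parts that are each correct. In the following sketch, the message authentication itself is sound, yet a receiver that does not track freshness accepts the same message twice. The pre-shared key, the message layout, and all names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
import secrets

KEY = secrets.token_bytes(32)  # hypothetical key shared by both parties


def make_message(payload):
    """Return (nonce, payload, tag) with an HMAC-SHA256 tag over both."""
    nonce = secrets.token_bytes(16)  # fixed length, so nonce + payload is unambiguous
    tag = hmac.new(KEY, nonce + payload, hashlib.sha256).digest()
    return nonce, payload, tag


def tag_is_valid(nonce, payload, tag):
    expected = hmac.new(KEY, nonce + payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def accept_without_freshness(nonce, payload, tag):
    """Checks authenticity only; a recorded message stays valid forever."""
    return tag_is_valid(nonce, payload, tag)


class FreshnessCheckingReceiver:
    """Checks authenticity and rejects any nonce it has already seen."""

    def __init__(self):
        self.seen_nonces = set()

    def accept(self, nonce, payload, tag):
        if not tag_is_valid(nonce, payload, tag) or nonce in self.seen_nonces:
            return False
        self.seen_nonces.add(nonce)
        return True


recorded = make_message(b"open the door")
receiver = FreshnessCheckingReceiver()

print(accept_without_freshness(*recorded))  # first delivery is accepted
print(accept_without_freshness(*recorded))  # the replay is accepted as well
print(receiver.accept(*recorded))           # first delivery is accepted
print(receiver.accept(*recorded))           # the replay is rejected
```

Remembering every nonce is the simplest possible freshness check. Real protocols typically use sequence numbers or session-bound values instead, so that the receiver's state stays bounded.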
1.4.3 Complexity versus security – bugs
Currently available software engineering processes, methods, and tools do not guarantee error-free software. Various studies on software quality indicate that, on average, 1,000 lines of source code contain 30-80 bugs. In rare cases, examples of extensively tested software were reported that contain 0.5-3 bugs per 1,000 lines of code.
However, even a rate of 0.5-3 bugs per 1,000 lines of code is far from sufficient for most practical software systems. As an example, the Linux kernel 5.11, released in 2021, has around 30 million lines of code, roughly 14% of which are considered the “core” part (… mm directories). Consequently, even with extensive testing and validation, the Linux 5.11 core code alone would contain approximately 2,100-12,600 bugs.
And this is only the operating system core without any applications. As of July 2022, the popular Apache HTTP server consists of about 1.5 million lines of code. So, even assuming the low rate of 0.5-3 bugs per 1,000 lines of code, adding a web server to the core system would account for another 750-4,500 bugs.
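The estimates above follow directly from the quoted bug rates and code sizes. A short sketch of the calculation, with a hypothetical helper name:

```python
def bug_range(lines_of_code, low_per_kloc=0.5, high_per_kloc=3.0):
    """Expected (low, high) bug counts for the given bugs-per-1,000-lines rates."""
    kloc = lines_of_code / 1_000
    return round(kloc * low_per_kloc), round(kloc * high_per_kloc)


linux_total_lines = 30_000_000
linux_core_lines = linux_total_lines * 14 // 100  # roughly 14% is core code
apache_lines = 1_500_000

linux_core_bugs = bug_range(linux_core_lines)
apache_bugs = bug_range(apache_lines)

print(linux_core_bugs)  # prints (2100, 12600)
print(apache_bugs)      # prints (750, 4500)

# For comparison: the average rate of 30-80 bugs per 1,000 lines.
print(bug_range(linux_core_lines, 30, 80))  # prints (126000, 336000)
```

The last line shows how optimistic the estimate is: it assumes that all of the code is as thoroughly tested as the rare best cases reported.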
Figure 1.7: Increase of Linux kernel size over the years
What is even more concerning is that the rate of bugs doesn’t seem to improve quickly enough over time to cope with the increasing software size. The extensively tested software having 0.5-3 bugs per 1,000 lines of code mentioned above was reported by Myers in 1986. On the other hand, a study performed by Carnegie Mellon University’s CyLab institute in 2004 identified 0.17 bugs per 1,000 lines of code in the Linux 2.6 kernel, a total of 985 bugs, of which 627 were in critical parts of the kernel. Measured against the lower bound of 0.5 bugs per 1,000 lines of code, this is roughly a threefold reduction of the bug rate – over almost 20 years.
Clearly, in that same period of time from 1986 to 2004 the size of typical software has more than doubled. As an example, Linux version 1.0, released in 1994, had about 170,000 lines of code. In comparison, Linux kernel 2.6, which was released in 2003, already had 8.1 million lines of code. This is approximately a 47-fold increase in size within less than a decade.
Figure 1.8: Reported security vulnerabilities per year
1.5 Example attacks
1.5.1 The Mirai botnet
In late 2016, the internet was hit by a series of massive Distributed Denial-of-Service (DDoS) attacks originating from the Mirai botnet, a large collection of infected devices (so-called bots) remote-controlled by attackers.
The early history of the Mirai botnet unfolded as follows: the first bootstrap scan on August 1 lasted about two hours and infected 834 devices. This initial population continued to scan for new members and within 20 hours, another 64,500 devices were added to the botnet. The infection campaign continued in September, when about 300,000 devices were infected, and reached its peak of 600,000 bots by the end of November. Taking 200,000-300,000 infections over roughly two months as a basis, this corresponds to a rate of 2.2-3.4 infected devices per minute or 17.6-27.2 seconds to infect a single device.
Now contrast this with a side-channel or fault attack. Even if we assume that the actual attack – that is, the measurement and processing of the side-channel traces or the injection of a fault – can be carried out in zero time, an attacker would still need time to gain physical access to each target. Now suppose that, on average, the attacker needs one hour to physically access a target (actually, this is a very optimistic assumption from the attacker’s perspective, given that the targets are distributed throughout the globe). In that case, attacking 200,000-300,000 devices would take approximately 23-34 years of round-the-clock work, or roughly 270 to 410 months (as opposed to 2 months in the case of Mirai).
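The comparison between the remote campaign and a hypothetical physical one is a rough order-of-magnitude check. In the sketch below, the campaign length of 61 days stands in for "two months", and the attacker is assumed to work around the clock; both are assumptions, as are the function names.

```python
HOURS_PER_YEAR = 24 * 365
CAMPAIGN_DAYS = 61  # assumption: "two months" taken as 61 days


def remote_rate_per_minute(devices, campaign_days=CAMPAIGN_DAYS):
    """Average number of devices infected per minute during the campaign."""
    return devices / (campaign_days * 24 * 60)


def physical_attack_years(devices, hours_per_device=1.0):
    """Years needed if every device requires physical access, working 24/7."""
    return devices * hours_per_device / HOURS_PER_YEAR


for devices in (200_000, 300_000):
    rate = remote_rate_per_minute(devices)
    years = physical_attack_years(devices)
    print(f"{devices:,} devices: {rate:.1f} per minute remotely "
          f"({60 / rate:.1f} s each), {years:.1f} years physically")
```

Even under assumptions that strongly favor the physical attacker, the two approaches differ by more than two orders of magnitude in elapsed time.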
Moreover, any remote attack starts at a network interface of the target system. So the first (and, oftentimes, the only) thing the attacker interacts with is software. But software is complex by nature.
1.5.2 Operation Aurora
In mid-December 2009, Google discovered a highly sophisticated, targeted attack on their corporate infrastructure that resulted in intellectual property theft. During their investigation, Google discovered that at least 20 other large companies from a wide range of businesses had been targeted in a similar way.
This series of cyberattacks came to be known as Operation Aurora and was attributed to advanced persistent threat (APT) groups based in China. The name was coined by McAfee Labs security researchers based on their discovery that the word Aurora appeared in a file path on the attacker’s machine that was later included in the malicious binaries used in the attack. Typically, such a file path is inserted by the compiler into the binary to indicate where debug symbols and source code can be found on the developer’s machine. McAfee Labs therefore hypothesized that Aurora could be the name the attackers used for the operation.
According to McAfee, the main target of the attack was source code repositories at high-tech, security, and defense contractor companies. If these repositories were modified in a malicious way, the attack could be spread further to their client companies. Operation Aurora can therefore be considered the first major attack on software supply chains.
In response to Aurora, Google shut down its operations in China four months after the incident and migrated away from a purely perimeter-based defense principle. This means devices are not trusted by default anymore, even if they are located within a corporate LAN.
1.5.3 The Jeep hack
At the BlackHat 2015 conference, security researchers Charlie Miller and Chris Valasek demonstrated the first remote attack on an unaltered, factory passenger car. In what later became known as the Jeep hack, the researchers demonstrated how the vehicle’s infotainment system, Uconnect, which has both remote connectivity as well as the capability to communicate with other electronic control units within the vehicle, can be used for remote attacks.
Specifically, while systematically examining the vehicle’s attack surface, the researchers discovered an open D-Bus over IP port on Uconnect. D-Bus is essentially an inter-process communication and remote procedure call mechanism. The D-Bus service accessible via the open port allows anyone connected to the infotainment system to execute arbitrary code in an unauthenticated manner.
Miller and Valasek also discovered that the D-Bus port was bound to all network interfaces on the vulnerable Uconnect infotainment system and was therefore accessible remotely over the Sprint mobile network that Uconnect uses for telematics. By connecting to the Sprint network using a femtocell or simply a regular mobile phone, the researchers were able to send remote commands to the vehicle.
From that entry point, Miller and Valasek attacked a chip in the vehicle’s infotainment system by re-writing its firmware to be able to send arbitrary commands over the vehicle’s internal CAN communication network, effectively giving them the ability to completely take over the vehicle.
What do these examples have in common and how do they relate to cryptography? In a nutshell, these examples illustrate what happens in the absence of appropriate cryptography. In all three cases discussed, there was no mechanism in place to verify that the systems were talking to legitimate users and that the messages received were not manipulated while in transit.
In the Mirai example, anyone with knowledge of the IoT devices’ IP addresses would have been able to access their login page. This information can be easily collected by scanning the public internet with tools such as nmap. So the designers’ assumption that the users would change the default device password to a strong individual one was the only line of defense. What the security engineers should have done instead is to add a cryptographic mechanism to give access to the login procedure only to legitimate users, for example, users in possession of a digital certificate or a private key.
In the case of Operation Aurora, the perimeter defense doctrine used by the affected companies treated every device within the trusted perimeter (typically, within a corporate network) as trustworthy by default. On this premise, every device inside the perimeter had access to all resources and systems within that perimeter.
As a result, anyone able to walk inside a company building or trick an arbitrary employee into clicking on a malicious link and infect their computer with malware would have been able to access all systems within the perimeter.
As a response to Operation Aurora, Google and other companies replaced perimeter defense with a zero trust security model that establishes trust by evaluating it on a per-transaction basis instead of basing trust on the network location (the perimeter). At the core of the zero trust security model is the ability to securely authenticate users and resources in order to prevent unauthorized access to data and services. Secure authentication, in turn, is built upon cryptography.
Finally, in the Jeep hack example, the open D-Bus over IP port allowed anyone connected to the vehicle’s infotainment system to execute arbitrary code in an unauthenticated manner. The possibility to access the vehicle remotely over the Sprint mobile network further increased the range of the attack. The system’s designers apparently assumed that the Sprint mobile network is a secure perimeter. What they should have done instead is to add a cryptographic mechanism to ensure that only legitimate users could log in to the Uconnect system.
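What such a cryptographic gate in front of a login or command interface could look like is sketched below as a simple challenge-response exchange. This is a minimal illustration under stated assumptions, not the mechanism any of the affected vendors used or later adopted: it relies on a hypothetical pre-shared symmetric key and HMAC-SHA256 instead of the certificates or private keys mentioned above, and the class and function names are invented for the example.

```python
import hashlib
import hmac
import secrets


class Device:
    """Grants access only to callers who prove knowledge of the device key."""

    def __init__(self, key):
        self._key = key
        self._pending_challenge = None

    def issue_challenge(self):
        # A fresh random challenge for every attempt prevents replays.
        self._pending_challenge = secrets.token_bytes(32)
        return self._pending_challenge

    def verify(self, response):
        if self._pending_challenge is None:
            return False
        expected = hmac.new(self._key, self._pending_challenge,
                            hashlib.sha256).digest()
        self._pending_challenge = None  # each challenge is valid only once
        return hmac.compare_digest(expected, response)


def respond(key, challenge):
    """Computed by the legitimate user, who holds the same key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


device_key = secrets.token_bytes(32)  # hypothetical per-device secret
device = Device(device_key)

# A legitimate user answers the challenge correctly.
challenge = device.issue_challenge()
good_response = respond(device_key, challenge)
print(device.verify(good_response))  # prints True

# An attacker who merely reaches the port cannot compute a valid response.
challenge = device.issue_challenge()
print(device.verify(respond(secrets.token_bytes(32), challenge)))  # prints False

# Replaying the earlier valid response against a new challenge also fails.
device.issue_challenge()
print(device.verify(good_response))  # prints False
```

With a gate of this kind, being able to reach the network port is no longer sufficient; the caller must also hold a secret. The scheme as sketched does not address how keys are distributed or revoked, which is where certificates and asymmetric cryptography come in.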
In this chapter, we have provided an overview of the recent history of cryptography, starting in the 1970s, and identified some global trends that explain why cryptography has become more and more important over the last few decades, to a point where it is practically around you every time you access the internet or use a connected device. In the next chapter, you will learn about the general goals and objectives you can achieve with the help of cryptography. In particular, you will get to know cryptography’s main protagonists, Alice and Bob, and their ubiquitous opponents, Eve and Mallory.