Embedded Systems Architecture - Second Edition

Embedded Systems Architecture: Design and write software for embedded devices to build safe and connected systems, Second Edition

Embedded Systems – A Pragmatic Approach

Designing and writing software for embedded systems poses a different set of challenges than traditional high-level software development.

This chapter provides an overview of these challenges and introduces the basic components and the platform that will be used as a reference in this book.

In this chapter, we will discuss the following topics:

  • Domain definition
  • General-purpose input/output (GPIO)
  • Interfaces and peripherals
  • Connected systems
  • Introduction to isolation mechanisms
  • The reference platform

Domain definition

Embedded systems are computing devices that perform specific, dedicated tasks with no direct or continued user interaction. Because they serve a wide variety of markets and technologies, these devices come in many shapes, but most of them are small and have a limited amount of resources.

In this book, the concepts and the building blocks of embedded systems will be analyzed through the development of the software components that interact with their resources and peripherals. The first step is to define the scope for the validity of the techniques and the architectural patterns explained in this book, within the broader definition of embedded systems.

Embedded Linux systems

One part of the embedded market relies on devices with enough power and resources to run a variant of the GNU/Linux OS. These systems, often referred to as embedded Linux, are outside the scope of this book, as their development includes different strategies of design and integration of the components. A typical hardware platform that is capable of running a system based on the Linux kernel is equipped with a reasonably large amount of RAM, up to a few gigabytes, and sufficient storage space on board to store all the software components provided in the GNU/Linux distribution.

Additionally, for the Linux memory management to provide separate virtual address spaces to each process on the system, the hardware must be equipped with a memory management unit (MMU), a hardware component that assists the OS in translating the virtual addresses used by each process into physical addresses at runtime.

This class of devices presents different characteristics that are often overkill for building tailored solutions, which can use a much simpler design and reduce the production costs of single units.

Hardware manufacturers and chip designers have researched new techniques to improve the performance of microcontroller-based systems. In the past few decades, they have introduced new generations of platforms that would cut hardware costs, firmware complexity, size, and power consumption to provide a set of features that are most interesting for the embedded market.

Due to their specifications, in some real-life scenarios, embedded systems must be able to execute a series of tasks within a short, measurable, and predictable amount of time. These kinds of systems are called real-time systems and differ from the approach of multi-task computing, which is used in desktops, servers, and mobile phones.

Real-time processing is a goal that is extremely hard, if not impossible, to reach on embedded Linux platforms. The Linux kernel is not designed for hard real-time processing, and even if patches are available to modify the kernel scheduler to help meet these requirements, the results are not comparable to bare-metal, constrained systems that are designed with this purpose in mind.

Some other application domains, such as battery-powered and energy-harvesting devices, can benefit from the low power consumption capabilities of smaller embedded devices and the energy efficiency of the wireless communication technologies often integrated into embedded connected devices. The higher amount of resources and the increased hardware complexity of Linux-based systems often do not scale down enough on energy levels or require effort to meet similar figures in power consumption.

The microcontroller-based systems that we will analyze in this book are 32-bit systems. They can run software as a single-threaded, bare-metal application, or integrate a minimalist real-time OS, an option that is very popular in the industrial manufacturing of the embedded systems we use daily to accomplish specific tasks. These platforms are also increasingly adopted as more generic, multi-purpose development platforms.

Low-end 8-bit microcontrollers

In the past, 8-bit microcontrollers dominated the embedded market. The simplicity of their design makes it possible to write small applications that accomplish a set of predefined tasks, but these devices are too simple, and usually equipped with too few resources, to implement the kind of embedded system discussed in this book. This is especially true now that 32-bit microcontrollers have evolved to cover all the use cases for these devices within the same range of price, size, and power consumption.

Nowadays, 8-bit microcontrollers are mostly relegated to the market of educational platform kits, aimed at introducing hobbyists and newcomers to the basics of software development on electronic devices. 8-bit platforms are not covered in this book because they lack the characteristics that allow advanced system programming, multithreading, and advanced features to be developed to build professional embedded systems.

In the context of this book, the term embedded systems is used to indicate a class of systems running on microcontroller-based hardware architecture, offering constrained resources but allowing real-time systems to be built through features provided by the hardware architecture to implement system programming.

Hardware architecture

The architecture of an embedded system is centered around its microcontroller, also sometimes referred to as the microcontroller unit (MCU). This is typically a single integrated circuit containing the processor, RAM, flash memory, serial receivers and transmitters, and other core components. The market offers many different choices among architectures, vendors, price ranges, features, and integrated resources. These are typically designed to be inexpensive, low-resource, low-energy consuming, self-contained systems on a single integrated circuit, which is the reason why they are often referred to as System-on-Chip (SoC).

Due to the variety of processors, memories, and interfaces that can be integrated, there is no actual reference architecture for microcontrollers. Nevertheless, some architectural elements are common across a wide range of models and brands, and even across different processor architectures.

Some microcontrollers are dedicated to specific applications and expose a particular set of interfaces to communicate to peripherals and the outside world. Others are focused on providing solutions with reduced hardware costs, or with very limited energy consumption.

Nevertheless, the following set of components is built into almost every microcontroller:

  • Microprocessor
  • RAM
  • Flash memory
  • Serial transceivers

Additionally, more and more devices are capable of accessing a network to communicate with other devices and gateways. Some microcontrollers provide interfaces for well-established standards, such as Ethernet or Wi-Fi, or for protocols specifically designed to meet the constraints of embedded systems, such as sub-GHz radio interfaces or a Controller Area Network (CAN) bus, partially or fully implemented within the IC.

All the components must share a bus line with the processor, which is responsible for coordinating the logic. The RAM, flash memory, and control registers of the transceivers are all mapped in the same physical address space:

Figure 1.1 – A simplified block diagram of the components inside a generic microcontroller

The addresses where RAM and Flash Memory are mapped depend on the specific model and are usually provided in the datasheet. A microcontroller can run code in its native machine language; that is, a sequence of instructions conveyed into a binary file that is specific to the architecture it is running on. By default, compilers provide a generic executable file as the output of the compilation and assembly operations, which needs to be converted into a format that can be executed by the target.
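The idea of a single physical address space shared by flash, RAM, and peripheral registers can be sketched in a few lines of C. The base addresses and sizes below are assumptions chosen for illustration only and do not describe any specific device; the real values must always be taken from the datasheet. The hypothetical `region_of` helper simply reports which area a given address belongs to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical memory map: these base addresses and sizes are assumptions
 * for illustration only. Real values come from the device datasheet. */
#define FLASH_BASE   0x08000000u
#define FLASH_SIZE   (256u * 1024u)
#define RAM_BASE     0x20000000u
#define RAM_SIZE     (64u * 1024u)
#define PERIPH_BASE  0x40000000u
#define PERIPH_SIZE  (1024u * 1024u)

enum region { REGION_NONE, REGION_FLASH, REGION_RAM, REGION_PERIPH };

/* True when addr falls inside [base, base + size). */
static int in_range(uint32_t addr, uint32_t base, uint32_t size)
{
    assert(size > 0u);
    return addr >= base && (addr - base) < size;
}

/* Classify a physical address according to the assumed map above. */
enum region region_of(uint32_t addr)
{
    if (in_range(addr, FLASH_BASE, FLASH_SIZE))
        return REGION_FLASH;
    if (in_range(addr, RAM_BASE, RAM_SIZE))
        return REGION_RAM;
    if (in_range(addr, PERIPH_BASE, PERIPH_SIZE))
        return REGION_PERIPH;
    return REGION_NONE;
}
```

With this map, `region_of(0x20000000u)` reports RAM, while an address just past the end of the flash area is not mapped to anything.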

The processor is designed to execute instructions stored in its own specific binary format, directly from RAM as well as from its internal flash memory. The flash is usually mapped starting from position zero in memory or from another well-known address specified in the microcontroller manual. The CPU can fetch and execute code from RAM faster, but the final firmware is stored in the flash memory, which is usually bigger than the RAM on almost all microcontrollers and retains its content across power cycles and reboots.

Compiling a software operating environment for an embedded microcontroller and loading it onto the flash memory requires a host machine, which is a specific set of hardware and software tools. Some knowledge about the target device’s characteristics is also needed to instruct the compiler to organize the symbols within the executable image. For many valid reasons, C is the most popular language in embedded software, although not the only available option. Higher-level languages, such as Rust and C++, can produce embedded code when combined with a specific embedded runtime, or even in some cases by entirely removing the runtime support from the language.


This book will focus entirely on C code because it abstracts less than any other high-level language, thus making it easier to describe the behavior of the underlying hardware while looking at the code.

All modern embedded systems platforms also have at least one mechanism (such as JTAG) for debugging purposes and uploading the software to the flash. When the debugging interface is accessed from the host machine, a debugger can interact with the breakpoint unit in the processor, interrupting and resuming the execution, and can also read and write from any address in memory.

A significant part of embedded programming is communicating with the peripherals through the interfaces that the MCU exposes. Embedded software development requires basic knowledge of electronics, the ability to understand schematics and datasheets, and confidence with measurement tools, such as logic analyzers or oscilloscopes.

Understanding the challenges

Approaching embedded development means keeping the focus on the specifications as well as the hardware restrictions at all times. Embedded software development is a constant challenge that requires focusing on the most efficient way to perform a set of specific tasks while keeping the limited resources available in strong consideration. There are several compromises to deal with, which are uncommon in other environments. Here are some examples:

  • There might not be enough space in the flash to implement a new feature
  • There might not be enough RAM to store complex structures or make copies of large data buffers
  • The processor might not be fast enough to accomplish all the required calculations and data processing in due time
  • Battery-powered and energy-harvesting devices might require lower energy consumption to meet lifetime expectations

Moreover, PC and mobile OSs make large use of the MMU, a component of the processor that allows runtime translations between physical and virtual addresses.

The MMU is a necessary abstraction to implement address space separation among the tasks, as well as between the tasks and the kernel itself. Embedded microcontrollers do not have an MMU, and usually lack the amount of non-volatile memory required to store kernels, applications, and libraries. For this reason, embedded systems are often running in a single task, with the main loop performing all the data processing and communication in a specific order. Some devices can run embedded OSs, which are far less complex than their PC counterparts.

Application developers often see the underlying system as a commodity, while embedded development often means that the entire system has to be implemented from scratch, from the boot procedure up to the application logic. In an embedded environment, the various software components are more closely related to each other because of the lack of more complex abstractions, such as memory separations between the processes and the OS kernel.

A developer approaching embedded systems for the first time might find testing and debugging on some of the systems a bit more intricate than just running the software and reading out the results. This becomes especially true in those systems that have been designed with little or no human interaction interfaces.

A successful approach requires a healthy workflow, which includes well-defined test cases, a list of key performance indicators coming from the analysis of the specifications to identify possibilities of trade-offs, several tools and procedures at hand to perform all the needed measurements, and a well-established and efficient prototyping phase.

In this context, security deserves some special consideration. As usual, when writing code at the system level, it is wise to keep in mind the system-wide consequences of possible faults. Most embedded application code runs with extended privileges on the hardware, and a single task misbehaving can affect the stability and integrity of the entire firmware. As we will see, some platforms offer specific memory protection mechanisms and built-in privilege separation, which are useful for building fail-safe systems, even in the absence of a full OS based on separating process address spaces.


One of the advantages of using microcontrollers designed to build embedded systems is the possibility to run logically separated tasks within separate execution units by time-sharing the resources.

The most popular type of design for embedded software is based on a single loop-based sequential execution model, where modules and components are connected to expose callback interfaces. However, modern microcontrollers offer features and core logic characteristics that can be used by system developers to build a multitasking environment to run logically separated applications.
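A minimal sketch of the loop-based model follows, assuming that each module exposes a single polling callback. The names `struct module`, `main_loop_run`, and `counter_poll` are hypothetical and used only for illustration. A real firmware would loop forever; the `rounds` parameter exists only so that the sketch terminates when it is tried on a host machine.

```c
#include <assert.h>
#include <stddef.h>

/* Each module exposes a callback that the main loop invokes in turn. */
typedef void (*module_cb)(void *ctx);

struct module {
    module_cb poll;   /* callback executed once per loop iteration */
    void *ctx;        /* module-private state passed to the callback */
};

/* Run `rounds` passes over all the modules, in a fixed order.
 * On a real target this would be an endless loop. */
void main_loop_run(const struct module *mods, size_t count, unsigned rounds)
{
    unsigned r;
    size_t i;

    assert(mods != NULL || count == 0);
    for (r = 0; r < rounds; r++) {
        for (i = 0; i < count; i++) {
            if (mods[i].poll != NULL)
                mods[i].poll(mods[i].ctx);
        }
    }
}

/* Hypothetical module: counts how many times it has been polled. */
void counter_poll(void *ctx)
{
    unsigned *n = ctx;
    (*n)++;
}
```

Because every callback runs to completion before the next one starts, a module that blocks delays all the others, which is one of the reasons a multitasking environment becomes attractive in more complex systems.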

These features are particularly handy in the approach to more complex real-time systems, and they help us understand the possibilities of the implementation of safety models based on process isolation and memory segmentation.


“640 KB of memory ought to be enough for everyone”

– Attributed to Bill Gates (co-founder and former director of Microsoft)

This famous statement has been cited many times in the past three decades to underline the progress in technology and the outstanding achievements of the PC industry. While it may sound like a joke to many software engineers, embedded programming still has to be approached with figures of this magnitude in mind, more than 30 years after MS-DOS was initially released.

Although most embedded systems are capable of breaking that limit today, especially due to the availability of external DRAM interfaces, the simplest devices that can be programmed in C may have as little as 4 KB of RAM available to implement the entire system logic. This has to be taken into account when designing an embedded system, by estimating the amount of memory potentially needed for all the operations that the system has to perform, and the buffers that may be used at any time to communicate with peripherals and nearby devices.
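One possible way to keep such an estimate under control is to write the budget down in the code and let the compiler check it. The sketch below assumes a device with 4 KB of RAM; the stack and buffer sizes are invented for illustration and do not come from any real design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed figures, for illustration only. */
#define RAM_TOTAL_BYTES    4096u   /* 4 KB of RAM on the target */
#define STACK_BYTES        1024u   /* space reserved for the stack */
#define UART_RX_BUF_BYTES   256u   /* receive buffer */
#define UART_TX_BUF_BYTES   256u   /* transmit buffer */
#define SENSOR_SAMPLES       64u   /* 16-bit samples kept in memory */

#define BUDGET_BYTES \
    (STACK_BYTES + UART_RX_BUF_BYTES + UART_TX_BUF_BYTES + \
     SENSOR_SAMPLES * sizeof(uint16_t))

/* The build fails if the planned usage no longer fits in RAM. */
_Static_assert(BUDGET_BYTES <= RAM_TOTAL_BYTES, "RAM budget exceeded");

/* Bytes of RAM claimed by the plan above. */
size_t ram_budget_used(void)
{
    return BUDGET_BYTES;
}

/* Bytes of RAM left for everything else. */
size_t ram_budget_free(void)
{
    assert(BUDGET_BYTES <= RAM_TOTAL_BYTES);
    return RAM_TOTAL_BYTES - BUDGET_BYTES;
}
```

With these numbers, the plan uses 1,664 bytes and leaves 2,432 bytes free. Doubling the stack or adding another large buffer would quickly exhaust the remaining space, and the static assertion would stop the build before the problem reaches the target.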

The memory model at the system level is simpler than that of PCs and mobile devices. Memory access is typically done at the physical level, so every pointer in your code holds the physical location of the data it points to. In modern computers, by contrast, the OS, assisted by the MMU, gives each running task a virtual view of memory and translates its virtual addresses into physical ones.

The advantage of the physical-only memory access on those systems that do not have an MMU is the reduced complexity of having to deal with address translations while coding and debugging. On the other hand, some of the features that are implemented by any modern OS, such as process swapping and dynamically resizing address spaces through memory relocation, become cumbersome and sometimes impossible.

Handling memory is particularly important in embedded systems. Programmers who are used to writing application code expect a certain level of protection to be provided by the underlying OS. A virtual address space does not allow memory areas to overlap, and the OS can easily detect unauthorized memory accesses and segmentation violations, so it promptly terminates the process and avoids having the whole system compromised.

On embedded systems, especially when writing bare-metal code, the boundaries of each address pool must be checked manually. Accidentally modifying a few bits in the wrong memory, or even accessing a different area of memory, may result in a fatal, irrevocable error. The entire system may hang, or, in the worst case, become unpredictable. A safe approach is required when handling memory in embedded systems, in particular when dealing with life-critical devices. Identifying memory errors too late in the development process is complex and often requires more resources than forcing yourself to write safe code and protecting the system from a programmer’s mistakes.
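As an illustration of checking boundaries manually, the following sketch wraps a memory area in a small descriptor and refuses any write that would fall outside of it. The `struct pool` type and the `pool_write` function are hypothetical names introduced here, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Descriptor of a memory area owned by one component. */
struct pool {
    uint8_t *base;   /* first byte of the area */
    size_t size;     /* number of bytes in the area */
};

/* Copy len bytes from src to the given offset inside the pool.
 * Returns 0 on success, -1 if the write would cross the pool boundary.
 * The comparison is written so that offset + len cannot overflow. */
int pool_write(struct pool *p, size_t offset, const void *src, size_t len)
{
    assert(p != NULL && p->base != NULL && src != NULL);
    if (offset > p->size || len > p->size - offset)
        return -1;
    memcpy(p->base + offset, src, len);
    return 0;
}
```

A write of four bytes at offset 4 of an 8-byte pool succeeds, while the same write at offset 5 is rejected instead of silently corrupting whatever follows the pool in memory.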

Proper memory-handling techniques will be explained in Chapter 5, Memory Management.

Flash memory

In a server or a personal computer, the executable applications and libraries reside in storage devices. At the beginning of the execution, they are accessed, transformed, possibly uncompressed, and stored in RAM before the execution starts.

The firmware of an embedded device is, in general, one single binary file containing all the software components, which can be transferred to the internal flash memory of the MCU. Since the flash memory is directly mapped to a fixed address in the memory space, the processor is capable of sequentially fetching and executing single instructions from it with no intermediate steps. This mechanism is called execute in place (XIP).

All non-modifiable sections on the firmware do not need to be loaded in memory and are accessible through direct addressing in the memory space. This includes not only the executable instructions but also all the variables that are marked as constant by the compiler. On the other hand, supporting XIP requires a few extra steps when preparing the firmware image to be stored in flash, and the linker needs to be instructed about the different memory-mapped areas on the target.

The internal flash memory that is mapped in the address space of the microcontroller is not accessible for writing. Altering the content of the internal flash can be done only by using block-based access, due to the hardware characteristics of flash memory devices. Before changing the value of a single byte in flash memory, the whole block containing it must be erased and rewritten. The mechanism offered by most manufacturers to access block-based flash memory for writing is known as In-Application Programming (IAP). Some filesystem implementations take care of abstracting write operations on a block-based flash device, by creating a temporary copy of the block where the write operation is performed.
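The erase-before-write constraint can be sketched with a small simulation that runs on a host machine. The array below stands in for the flash, and the sector size is deliberately tiny; both are assumptions for illustration. The model also assumes the behavior commonly found in NOR flash, where erasing sets every bit of a sector to 1 and programming can only clear bits. The function names are hypothetical and do not correspond to any vendor's IAP interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated flash: sizes are unrealistically small on purpose. */
#define SECTOR_SIZE   16u
#define SECTOR_COUNT   4u

static uint8_t flash[SECTOR_COUNT * SECTOR_SIZE];

/* Erasing works on a whole sector and leaves every bit set to 1. */
void flash_erase_sector(size_t sector)
{
    assert(sector < SECTOR_COUNT);
    memset(&flash[sector * SECTOR_SIZE], 0xFF, SECTOR_SIZE);
}

/* Programming can only turn 1 bits into 0 bits (assumed NOR behavior). */
void flash_program(size_t addr, uint8_t value)
{
    assert(addr < sizeof(flash));
    flash[addr] &= value;
}

uint8_t flash_read(size_t addr)
{
    assert(addr < sizeof(flash));
    return flash[addr];
}

/* Change one byte: copy the sector to RAM, patch the copy, erase the
 * sector, then program the whole sector again.
 * Returns 0 on success, -1 if addr is outside the flash. */
int flash_update_byte(size_t addr, uint8_t value)
{
    uint8_t copy[SECTOR_SIZE];
    size_t sector, base, i;

    if (addr >= sizeof(flash))
        return -1;
    sector = addr / SECTOR_SIZE;
    base = sector * SECTOR_SIZE;
    memcpy(copy, &flash[base], SECTOR_SIZE);
    copy[addr - base] = value;
    flash_erase_sector(sector);
    for (i = 0; i < SECTOR_SIZE; i++)
        flash_program(base + i, copy[i]);
    return 0;
}
```

Note that the temporary copy lives in RAM, so on a real device the sector size directly affects how much RAM an update needs.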

While selecting the components for a microcontroller-based solution, it is vital to match the size of the flash memory to the space required by the firmware. The flash is often one of the most expensive components in the MCU, so for deployment on a large scale, choosing an MCU with a smaller flash could be more cost-effective. Developing software with code size in mind is not very usual nowadays within other domains, but it may be required when trying to fit multiple features in such little storage. Finally, compiler optimizations may exist on specific architectures to reduce code size when building the firmware and linking its components.

Additional non-volatile memories that reside outside of the MCU silicon can typically be accessed using specific interfaces, such as the Serial Peripheral Interface. External flash memories use different technologies than internal flash, which is designed to be fast and execute code in place. While being generally more dense and less expensive, external flash memories do not allow direct memory mapping in the physical address space, which makes them unsuitable for storing firmware images. This is because it would be impossible to execute the code fetching the instructions sequentially unless a mechanism is used to load the executable symbols in RAM – read access on these kinds of devices is performed one block at a time. On the other hand, write access may be faster compared to IAP, making these kinds of non-volatile memory devices ideal for storing data that is retrieved at runtime in some designs.

General-purpose input/output (GPIO)

The most basic functionality that can be achieved with any microcontroller is the possibility to control signals on specific pins of the integrated circuit. The microcontroller can turn a digital output on or off, which corresponds to a reference voltage to be applied to the pin when the value assigned to it is 1, and zero volts when the value is 0. In the same way, a pin can be used to detect a 1 or a 0 when the pin is configured as input. The software will read the digital value “1” when the voltage applied to it is higher than a certain threshold.
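In code, controlling a pin usually comes down to setting or clearing one bit in a register. The register layout below (`mode`, `out`, `in`) is an assumption made for this sketch and does not match any particular vendor. On real hardware the structure would be placed at a fixed address taken from the reference manual; here it is an ordinary variable, so the logic can be tried on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block of one GPIO port (32 pins). */
struct gpio_port {
    volatile uint32_t mode;  /* bit n = 1: pin n is an output */
    volatile uint32_t out;   /* bit n: level driven on output pin n */
    volatile uint32_t in;    /* bit n: level sensed on input pin n */
};

/* Configure a pin as a digital output. */
void gpio_set_output(struct gpio_port *p, unsigned pin)
{
    assert(p != 0 && pin < 32u);
    p->mode |= (1u << pin);
}

/* Drive an output pin high (non-zero level) or low (zero). */
void gpio_write(struct gpio_port *p, unsigned pin, int level)
{
    assert(p != 0 && pin < 32u);
    if (level)
        p->out |= (1u << pin);
    else
        p->out &= ~(1u << pin);
}

/* Read the digital value currently sensed on a pin: 1 or 0. */
int gpio_read(const struct gpio_port *p, unsigned pin)
{
    assert(p != 0 && pin < 32u);
    return (int)((p->in >> pin) & 1u);
}
```

The `volatile` qualifier tells the compiler that each access must really happen, which matters when the memory location is a hardware register rather than ordinary RAM.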


Some chips have onboard ADC controllers, which are capable of sensing the voltage that is applied to the pin and sampling it. This is often used to acquire measurements from input peripherals providing a variable voltage as output. The embedded software will be able to read the voltage with a resolution that depends on the converter and on the predefined range.

A DAC controller is the inverse of an ADC controller, transforming a value on a microcontroller register into the corresponding voltage.
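The relationship between register values and voltages can be sketched with two helper functions. They assume an ideal, linear converter whose full-scale code corresponds to the reference voltage, which is a common simplification; the function names, the 12-bit width, and the 3,300 mV reference used in the example are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a raw ADC sample to millivolts, assuming an ideal linear
 * converter with the given resolution and reference voltage. */
uint32_t adc_to_millivolts(uint32_t raw, uint32_t vref_mv, unsigned bits)
{
    uint32_t full_scale;

    assert(bits >= 1u && bits <= 16u);
    assert(vref_mv > 0u && vref_mv <= 65535u);  /* keeps the product in 32 bits */
    full_scale = (1u << bits) - 1u;
    assert(raw <= full_scale);
    return (raw * vref_mv) / full_scale;
}

/* Inverse operation for a DAC: find the register value that produces the
 * requested voltage. Requests above the reference are clamped. */
uint32_t dac_from_millivolts(uint32_t mv, uint32_t vref_mv, unsigned bits)
{
    uint32_t full_scale;

    assert(bits >= 1u && bits <= 16u);
    assert(vref_mv > 0u && vref_mv <= 65535u);
    full_scale = (1u << bits) - 1u;
    if (mv > vref_mv)
        mv = vref_mv;
    return (mv * full_scale) / vref_mv;
}
```

For a 12-bit converter with a 3,300 mV reference, one step corresponds to slightly less than 1 mV, so the integer division loses very little information.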

Timers and PWM

Microcontrollers may offer diverse ways to measure time. Often, there is at least one interface based on a countdown timer that can trigger an interrupt and automatically reset upon expiry.

GPIO pins configured as output can be programmed to output a square wave with a preconfigured frequency and duty cycle. This is called pulse-width modulation (PWM) and has several uses, from controlling output peripherals to dimming an LED or even playing an audible sound through a speaker.
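The frequency and duty cycle of a PWM signal are typically derived from a timer. The sketch below assumes a simple model in which a counter runs from 0 to the period minus one and the output is high while the counter is below a compare value. The function names and this exact timer behavior are assumptions; real timers offer several modes and polarities.

```c
#include <assert.h>
#include <stdint.h>

/* Number of timer ticks in one PWM period. */
uint32_t pwm_period_ticks(uint32_t timer_hz, uint32_t pwm_hz)
{
    assert(pwm_hz > 0u && pwm_hz <= timer_hz);
    return timer_hz / pwm_hz;
}

/* Compare value for a duty cycle given as a percentage (clamped to 100). */
uint32_t pwm_compare(uint32_t period_ticks, unsigned duty_percent)
{
    if (duty_percent > 100u)
        duty_percent = 100u;
    return (uint32_t)(((uint64_t)period_ticks * duty_percent) / 100u);
}

/* Output level for a given counter value: high while below compare. */
int pwm_level(uint32_t counter, uint32_t compare)
{
    return counter < compare;
}
```

With a 1 MHz timer and a 1 kHz PWM frequency, the period is 1,000 ticks, and a 25% duty cycle keeps the output high for the first 250 ticks of each period.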

More details about GPIO, interrupt timers, and watchdogs will be explored in Chapter 6, General-Purpose Peripherals.

Interfaces and peripherals

To communicate with peripherals and other microcontrollers, several de facto standards are well established in the embedded world. Some of the external pins of the microcontroller can be programmed to carry out communication with external peripherals using specific protocols. A few of the common interfaces available on most architectures are as follows:

  • Asynchronous UART-based serial communication
  • Serial Peripheral Interface (SPI) bus
  • Inter-integrated circuit (I2C) bus
  • Universal Serial Bus (USB)

Let’s review each in detail.

Asynchronous UART-based serial communication

Asynchronous communication is provided by the Universal Asynchronous Receiver-Transmitter (UART). These kinds of interfaces, commonly known as serial ports, are called asynchronous because they do not need to share a clock signal to synchronize the sender and the receiver, but rather work on predefined clock rates that can be aligned while the communication is ongoing. Microcontrollers may contain multiple UARTs that can be attached to a specific set of pins upon request. A UART provides a full-duplex channel through two independent wires, connecting the RX pin of each endpoint to the TX pin on the opposite side.

To understand each other, the systems at the two endpoints must set up the UART using the same parameters. This includes the framing of the bytes on the wire and the bit rate (baud rate). All of these parameters have to be known in advance by both endpoints to correctly establish a communication channel. Despite being simpler than the other types of serial communication, UART-based serial communication is still widely used in electronic devices, particularly as an interface for modems and GPS receivers. Furthermore, using TTL-to-USB serial converters, it is easy to connect a UART to a console on the host machine, which is often handy for providing log messages.
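To make the notion of framing more concrete, the following sketch builds and decodes a frame in the widely used 8N1 configuration: one start bit, eight data bits sent least significant bit first, no parity, and one stop bit. The choice of 8N1 and the function names are assumptions for illustration; on a real microcontroller this work is done by the UART hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UART_8N1_BITS 10u   /* start + 8 data + stop */

/* Lay out one byte as the sequence of line levels seen on the wire. */
void uart_frame_8n1(uint8_t byte, uint8_t bits[UART_8N1_BITS])
{
    unsigned i;

    assert(bits != NULL);
    bits[0] = 0u;                                    /* start bit: low */
    for (i = 0; i < 8u; i++)
        bits[1u + i] = (uint8_t)((byte >> i) & 1u);  /* data, LSB first */
    bits[9] = 1u;                                    /* stop bit: high */
}

/* Decode a frame. Returns 0 on success, -1 on a framing error
 * (wrong start or stop bit). */
int uart_unframe_8n1(const uint8_t bits[UART_8N1_BITS], uint8_t *byte)
{
    unsigned i;
    uint8_t value = 0u;

    assert(bits != NULL && byte != NULL);
    if (bits[0] != 0u || bits[9] != 1u)
        return -1;
    for (i = 0; i < 8u; i++)
        value |= (uint8_t)((bits[1u + i] & 1u) << i);
    *byte = value;
    return 0;
}

/* Time needed to send one framed byte, in microseconds (rounded down). */
uint32_t uart_byte_time_us(uint32_t baud)
{
    assert(baud > 0u);
    return (UART_8N1_BITS * 1000000u) / baud;
}
```

Since every byte costs ten bit times on the wire, a link configured at 9,600 baud carries at most 960 bytes per second. If the two endpoints disagree on the rate, the receiver samples the line at the wrong moments and typically reports framing errors like the one detected by `uart_unframe_8n1`.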


SPI

A different approach to classic UART-based communication is SPI. Introduced in the late 1980s, this technology aimed to replace asynchronous serial communication toward peripherals by introducing several improvements:

  • Serial clock line to synchronize the endpoints
  • Master-slave protocol
  • One-to-many communication over the same three-wire bus

The master device, usually the microcontroller, shares the bus with one or more slaves. To trigger the communication, a separate slave select (SS) signal is used to address each slave connected to the bus. The bus uses two independent signals for data transfer, one per direction, and a shared clock line that synchronizes the two ends of the communication. Due to the clock line being generated by the master, the data transfer is more reliable, making it possible to achieve higher bitrates than ordinary UART. One of the keys to the continued success of SPI over multiple generations of microcontrollers is the low complexity required for the design of slaves, which can be as simple as a single shift register. SPI is commonly used in sensor devices, LCDs, flash memory controllers, and network interfaces.
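The remark that a slave can be as simple as a single shift register can be illustrated with a small simulation. Each clock pulse moves one bit from the master to the slave and one bit in the opposite direction, so after eight pulses the two registers have exchanged their contents. The sketch assumes 8-bit transfers sent most significant bit first, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One master and one selected slave, each modeled as a shift register. */
struct spi_pair {
    uint8_t master_sr;
    uint8_t slave_sr;
};

/* One clock pulse: both sides shift out their top bit and shift in
 * the bit presented by the other side. */
void spi_clock_once(struct spi_pair *s)
{
    uint8_t mosi, miso;

    assert(s != NULL);
    mosi = (uint8_t)((s->master_sr >> 7) & 1u);  /* master out, slave in */
    miso = (uint8_t)((s->slave_sr >> 7) & 1u);   /* master in, slave out */
    s->master_sr = (uint8_t)((s->master_sr << 1) | miso);
    s->slave_sr = (uint8_t)((s->slave_sr << 1) | mosi);
}

/* Full-duplex transfer of one byte: returns what the slave sent back. */
uint8_t spi_transfer(struct spi_pair *s, uint8_t tx)
{
    unsigned i;

    assert(s != NULL);
    s->master_sr = tx;
    for (i = 0; i < 8u; i++)
        spi_clock_once(s);
    return s->master_sr;
}
```

This also shows why SPI is inherently full-duplex: the master cannot receive a byte without sending one, which is why drivers often transmit a dummy value when they only want to read.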


I2C

I2C is slightly more complex, and that is because it is designed with a different purpose in mind: interconnecting multiple microcontrollers, as well as multiple slave devices, on the same two-wire bus. The two signals are serial clock (SCL) and serial data (SDA). Unlike SPI or UART, the bus is half-duplex, as the two directions of the flow share the same signal. Thanks to a 7-bit slave-addressing mechanism incorporated in the protocol, it does not require additional signals dedicated to selecting the slaves. Multiple masters are allowed on the same line, provided that all the masters in the system follow the arbitration logic in the case of bus contention.
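Two of the mechanisms mentioned above can be sketched in C. The first function builds the byte that opens a transfer: the 7-bit slave address followed by a read/write bit. The second models arbitration between two masters that start transmitting at the same time, assuming the usual open-drain behavior in which a 0 on the line overrides a 1, so a master that sends 1 but senses 0 backs off. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* First byte of a transfer: 7-bit address, then 1 for read, 0 for write. */
uint8_t i2c_address_byte(uint8_t addr7, unsigned read)
{
    assert(addr7 < 0x80u);
    return (uint8_t)((addr7 << 1) | (read ? 1u : 0u));
}

/* Arbitration between two masters sending bytes a and b simultaneously,
 * most significant bit first. Returns 0 if the first master keeps the
 * bus, 1 if the second does, -1 if the bytes are identical and no
 * master has lost yet. */
int i2c_arbitrate(uint8_t a, uint8_t b)
{
    int bit;

    for (bit = 7; bit >= 0; bit--) {
        unsigned a_bit = (a >> bit) & 1u;
        unsigned b_bit = (b >> bit) & 1u;
        unsigned bus = a_bit & b_bit;   /* any 0 pulls the line low */

        if (a_bit != bus)
            return 1;   /* first master sent 1, read 0: it backs off */
        if (b_bit != bus)
            return 0;   /* second master backs off */
    }
    return -1;
}
```

A write to the slave at address 0x50 therefore starts with the byte 0xA0, and a read with 0xA1. In this model, the master transmitting the numerically lower byte wins the arbitration without the data on the bus ever being corrupted.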


USB

The USB protocol, originally designed to replace UART and include many protocols in the same hardware connector, is very popular in personal computers, portable devices, and a huge number of peripherals.

This protocol works in host-device mode, with one side of communication, the device, exposing services that can be used by the controller, on the host side. USB transceivers present in many microcontrollers can work in both modes. By implementing the upper layer of the USB standards, different types of devices can be emulated by the microcontroller, such as serial ports, storage devices, and point-to-point Ethernet interfaces, creating microcontroller-based USB devices that can be connected to a host system.

If the transceiver supports host mode, the embedded system can act as a USB host, and devices can be connected to it. In this case, the system should implement device drivers and applications to access the functionality provided by the device.

When both modes are implemented on the same USB controller, the transceiver works in on-the-go (OTG) mode, and selecting and configuring the desired mode can be done at runtime.

A more extended introduction to some of the most common protocols used for communicating with peripherals and neighboring systems will be provided in Chapter 7, Local Bus Interfaces.

Connected systems

An increasing number of embedded devices designed for different markets are now capable of network communication with their peers in the surrounding area or with gateways routing their traffic to a broader network or the internet. The term Internet of Things (IoT) has been used to describe the networks where those embedded devices can communicate using internet protocols.

This means that IoT devices can be addressed within the network in the same way as more complex systems, such as PCs or mobile devices, and most importantly, they use the transport layer protocols typical of internet communications to exchange data. TCP/IP is a suite of protocols standardized by the IETF, and it is the fabric of the infrastructure for the internet and other self-contained, local area networks.

The Internet Protocol (IP) provides network connectivity, but on the condition that the underlying link provides packet-based communication and mechanisms to control and regulate access to the physical media. Fortunately, many network interfaces meet these requirements. Alternative protocol families, which are not compatible with TCP/IP, are still in use in several distributed embedded systems, but a clear advantage of using the TCP/IP standard on the target is that, in the case of communication with non-embedded systems, there is no need for a translation mechanism to route the frames outside the scope of the LAN.

Besides the types of links that are widely used in non-embedded systems, such as Ethernet or wireless LAN, embedded systems can benefit from a wide choice of technologies that are specifically designed for the requirements introduced by IoT. New standards have been researched and put into effect to provide efficient communication for constrained devices, defining communication models to cope with specific resource usage limits and energy efficiency requirements.

Recently, new link technologies have been developed in the direction of lower bitrates and power consumption for wide-area network communication. These protocols are designed to provide narrow-band, long-range communication. The frame is too small to fit IP packets, so these technologies are mostly employed to transmit small payloads, such as periodic sensor data, or device configuration parameters if a bidirectional channel is available, and they require some form of gateway to translate the communication so that it can travel across the internet.
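When only a handful of bytes fit in a frame, sensor readings are usually packed into a compact binary layout rather than sent as text. The sketch below shows one possible layout; the fields, their units, the 5-byte size, and the big-endian byte order are all assumptions made for illustration and are not taken from any specific link technology.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SENSOR_PAYLOAD_BYTES 5u

/* Hypothetical periodic report produced by a sensor node. */
struct sensor_report {
    int16_t temp_centi_c;   /* temperature in hundredths of a degree C */
    uint8_t humidity_pct;   /* relative humidity, 0 to 100 */
    uint16_t battery_mv;    /* battery voltage in millivolts */
};

/* Serialize the report into a fixed 5-byte, big-endian payload.
 * Returns the number of bytes written. */
size_t sensor_pack(const struct sensor_report *r,
                   uint8_t out[SENSOR_PAYLOAD_BYTES])
{
    uint16_t t;

    assert(r != NULL && out != NULL);
    t = (uint16_t)r->temp_centi_c;
    out[0] = (uint8_t)(t >> 8);
    out[1] = (uint8_t)(t & 0xFFu);
    out[2] = r->humidity_pct;
    out[3] = (uint8_t)(r->battery_mv >> 8);
    out[4] = (uint8_t)(r->battery_mv & 0xFFu);
    return SENSOR_PAYLOAD_BYTES;
}

/* Rebuild the report from a payload, as a gateway would do. */
void sensor_unpack(const uint8_t in[SENSOR_PAYLOAD_BYTES],
                   struct sensor_report *r)
{
    uint16_t t;

    assert(in != NULL && r != NULL);
    t = (uint16_t)(((uint16_t)in[0] << 8) | in[1]);
    /* Converting back to a signed value relies on two's complement
     * wrap-around, which is how common compilers behave. */
    r->temp_centi_c = (int16_t)t;
    r->humidity_pct = in[2];
    r->battery_mv = (uint16_t)(((uint16_t)in[3] << 8) | in[4]);
}
```

Fixing the byte order explicitly, instead of copying the structure as it lies in memory, keeps the payload independent of the endianness and padding rules of the devices at either end.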

The interaction with the cloud services, however, requires, in most cases, connecting all the nodes in the network, and implementing the same technologies used by the servers and the IT infrastructure directly in the host. Enabling TCP/IP communication on an embedded device is not always straightforward. Even though there are several open source implementations available, system TCP/IP code is complex, big in size, and often has memory requirements that may be difficult to meet.

The same observation applies to the Secure Sockets Layer (SSL)/Transport Layer Security (TLS) library, which adds confidentiality and authentication between the two communication endpoints. Choosing the right microcontroller for the task is, again, crucial, and if the system has to be connected to the internet and support secure socket communication, then the flash and RAM requirements have to be updated in the design phase to ensure integration with third-party libraries.

Challenges of distributed systems

Designing distributed embedded systems, especially those that are based on wireless link technologies, adds a set of interesting challenges.

Some of these challenges are related to the following aspects:

  • Selecting the correct technologies and protocols
  • Limitations on bitrate, packet size, and media access
  • Availability of the nodes
  • Single points of failure in the topology
  • Configuring the routes
  • Authenticating the hosts involved
  • The confidentiality of the communication over the media
  • The impact of buffering on network speed, latency, and RAM usage
  • The complexity of implementing the protocol stacks

Chapter 9, Distributed Systems and IoT Architecture, analyzes some of the link-layer technologies implemented in embedded systems to provide remote communication, and shows how TCP/IP communication fits into the design of distributed systems that are integrated with IoT services.

Introduction to isolation mechanisms

Some newer microcontrollers include support for isolation between trusted and non-trusted software running onboard. This mechanism is based on a CPU extension, available only on some specific architectures, which usually relies on a sort of physical separation inside the CPU itself between the two modes of execution. All the code running from a non-trusted zone in the system will have a restricted view of the RAM, devices, and peripherals, which must be dynamically configured by the trusted counterpart in advance.

Software running from the trusted area can also provide features that are not directly accessible from the non-trusted world, through special function calls that cross the secure/non-secure boundary.
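The idea of a service exposed across the boundary can be modeled on a host machine, without any hardware support. In the sketch below, the key is visible only inside the "trusted" code, and the caller receives only the result of an operation that uses it. The names gateway_tag and toy_tag, the key value, and the tag algorithm are all hypothetical, and the tag is not cryptographically secure. With GCC or Clang on ARMv8-M, real entry points of this kind are typically declared with the cmse_nonsecure_entry function attribute, which is not used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Trusted" side: the key is private to this translation unit. */
static const uint8_t secret_key[4] = { 0xA5, 0x5A, 0x3C, 0xC3 };

#define GATEWAY_MAX_LEN 64u

/* Internal helper: callers inside the trusted code guarantee valid input.
 * Toy 8-bit tag, standing in for a real keyed operation. */
static uint8_t toy_tag(const uint8_t *msg, size_t len)
{
    uint8_t tag = 0;
    size_t i;

    assert(msg != NULL && len > 0);
    for (i = 0; i < len; i++) {
        tag = (uint8_t)((tag << 1) | (tag >> 7));   /* rotate left by one */
        tag ^= (uint8_t)(msg[i] ^ secret_key[i % sizeof(secret_key)]);
    }
    return tag;
}

/* Hypothetical entry point offered to non-trusted code. Arguments coming
 * from the other side are validated before use.
 * Returns 0 on success, -1 on invalid arguments. */
int gateway_tag(const uint8_t *msg, size_t len, uint8_t *tag_out)
{
    if (msg == NULL || tag_out == NULL || len == 0 || len > GATEWAY_MAX_LEN)
        return -1;
    *tag_out = toy_tag(msg, len);
    return 0;
}
```

The design point is that validation happens once, at the boundary, so that code behind it can rely on its inputs.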

Chapter 11, Trusted Execution Environment, explores the technology behind Trusted Execution Environments (TEEs), as well as the software components involved in real embedded systems to provide a safe environment to run non-trusted modules and components.

The reference platform

The preferred design strategy for embedded CPU cores is reduced instruction set computer (RISC). Among all the RISC CPU architectures, several reference designs are used as guidelines by silicon manufacturers to produce the core logic to integrate into the microcontroller. Each reference design differs from the others in several characteristics of the CPU implementation. Each reference design includes one or more families of microprocessors integrated into embedded systems, which share the following characteristics:

  • Word size used for registers and addresses (8-bit, 16-bit, 32-bit, or 64-bit)
  • Instruction set
  • Register configurations
  • Endianness
  • Extended CPU features (interrupt controller, FPU, MMU)
  • Caching strategies
  • Pipeline design
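Of the characteristics listed above, endianness is the one that most often surfaces in portable code, for instance when a multi-byte value is exchanged with a device or sent over a network. The following sketch shows one common way to detect the byte order of the machine running the code and to store values in a fixed order. The helper names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the least significant byte is stored at the lowest address. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Store a 32-bit value in big-endian order, whatever the host byte order. */
void store_be32(uint8_t *dst, uint32_t v)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/* Load a 32-bit big-endian value. */
uint32_t load_be32(const uint8_t *src)
{
    assert(src != NULL);
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8) | (uint32_t)src[3];
}
```

Shifting bytes explicitly, as in store_be32, avoids depending on the host byte order at all, which is usually preferable to detecting it at runtime.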

Choosing a reference platform for your embedded system depends on your project needs. Smaller, less feature-rich processors are generally better suited to low energy consumption, come in smaller packages, and are less expensive. Higher-end systems, on the other hand, come with a bigger set of resources, and some of them have dedicated hardware to cope with challenging calculations (such as a floating-point unit, or an Advanced Encryption Standard (AES) hardware module to offload symmetric encryption operations). 8-bit and 16-bit core designs are slowly giving way to 32-bit architectures, but some successful designs remain relatively popular in some niche markets and among hobbyists.

ARM reference design

ARM is the most ubiquitous reference design supplier in the embedded market, with more than 10 billion ARM-based microcontrollers produced for embedded applications. One of the most interesting core designs in the embedded industry is the ARM Cortex-M family, which includes a range of models scaling from cost-effective and energy-efficient cores to high-performance cores specifically designed for multimedia microcontrollers. Despite spanning three different architecture versions (ARMv6-M, ARMv7-M, and ARMv8-M), all Cortex-M CPUs share the same programming interface, which improves portability across microcontrollers in the same families.

Most of the examples in this book will be based on this family of CPUs. Though most of the concepts expressed will apply to other core designs as well, picking a reference platform now opens the door to a more complete analysis of the interactions with the underlying hardware. In particular, some of the examples in this book use specific assembly instructions from the ARMv7-M instruction set, which is implemented in some Cortex-M CPU cores.

The Cortex-M microprocessor

The main characteristics of the 32-bit cores in the Cortex-M family are as follows:

  • 16 general-purpose CPU registers
  • Thumb-only instruction set (16-bit encodings, plus 32-bit Thumb-2 encodings on most cores) for code density optimizations
  • A built-in Nested Vectored Interrupt Controller (NVIC) with 8 to 16 priority levels
  • ARMv6-M (M0, M0+), ARMv7-M (M3, M4, M7), or ARMv8-M (M23, M33) architecture
  • Optional 8-region memory protection unit (MPU)
  • Optional TEE isolation mechanism (ARM TrustZone-M)
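One detail behind the NVIC priority levels in the list above is worth a short sketch. Each interrupt has an 8-bit priority field, but only the most significant bits are implemented, so a logical level has to be shifted into place before it is written. The code below assumes four implemented bits (16 levels), which is a vendor-specific choice, and the function names are hypothetical. CMSIS headers expose the real value for a given MCU as __NVIC_PRIO_BITS.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 implemented priority bits (16 levels). The real number is
 * chosen by the silicon vendor. */
#define PRIO_BITS 4u

/* Convert a logical priority level into the value of the 8-bit field. */
uint8_t prio_to_reg(uint8_t level)
{
    assert(level < (1u << PRIO_BITS));
    return (uint8_t)(level << (8u - PRIO_BITS));
}

/* Recover the logical level; unimplemented low bits are discarded. */
uint8_t reg_to_prio(uint8_t reg)
{
    return (uint8_t)(reg >> (8u - PRIO_BITS));
}
```

Under this assumption, level 5 is stored as 0x50, and any value written to the low four bits has no effect on the resulting priority.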

The total memory address space is 4 GB. The beginning of the internal RAM is typically mapped at the fixed address of 0x20000000. The mapping of the internal flash, as well as the other peripherals, depends on the silicon manufacturer. However, the highest 512 MB of the address space (0xE0000000 to 0xFFFFFFFF) is reserved for system use. This region contains the System Control Block (SCB), which groups together several configuration parameters and diagnostics that can be accessed by the software at any time to directly interact with the core.
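The address ranges mentioned above can be written down as constants and checked with a few lines of code. This is a sketch that runs on any host: the macro and function names are assumptions, the SRAM size is a parameter because it depends on the MCU, and the SCB address shown is the one used by CMSIS headers.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BASE          0x20000000u  /* typical start of internal RAM */
#define SYSTEM_REGION_BASE 0xE0000000u  /* start of the top 512 MB */
#define SYSTEM_REGION_END  0xFFFFFFFFu
#define SCB_BASE           0xE000ED00u  /* SCB location used by CMSIS headers */

/* Size of the reserved region at the top of the address space, in bytes. */
uint32_t system_region_size(void)
{
    return SYSTEM_REGION_END - SYSTEM_REGION_BASE + 1u;
}

/* Returns 1 if addr falls in the reserved top region. */
int addr_in_system_region(uint32_t addr)
{
    return addr >= SYSTEM_REGION_BASE;
}

/* Returns 1 if addr falls in internal RAM of the given size. */
int addr_in_sram(uint32_t addr, uint32_t sram_size)
{
    assert(sram_size > 0);
    return addr >= SRAM_BASE && (addr - SRAM_BASE) < sram_size;
}
```

The size works out to 0x20000000 bytes, which is exactly 512 MB, and the SCB address falls inside that region.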

Asynchronous communication with peripherals and other hardware components can be triggered through interrupt lines. The processor can receive and recognize several different digital input signals and react to them promptly, interrupting the execution of the software and temporarily jumping to a specific location in the memory. Cortex-M supports up to 240 interrupt lines on the high-end cores of the family.
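With up to 240 lines, the enable bits cannot fit in a single 32-bit register. The NVIC spreads them over an array of 32-bit set-enable registers (ISER), one bit per line, so software has to compute which register and which bit correspond to a given interrupt number. The sketch below performs only that computation on the host; the irq_slot structure and the function name are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IRQ_LINES 240u

/* Position of an interrupt line within the array of enable registers. */
struct irq_slot {
    uint32_t reg_index;  /* which 32-bit register holds the bit */
    uint32_t bit_mask;   /* mask selecting the bit inside that register */
};

/* Each register covers 32 lines: divide by 32 for the index, and use the
 * remainder as the bit position. */
struct irq_slot irq_enable_slot(uint32_t irqn)
{
    struct irq_slot slot;

    assert(irqn < MAX_IRQ_LINES);
    slot.reg_index = irqn >> 5;
    slot.bit_mask = (uint32_t)1u << (irqn & 31u);
    return slot;
}
```

For example, line 37 maps to bit 5 of the second register (index 1), and line 239 maps to bit 15 of the eighth register (index 7).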

The interrupt vector table, located at the beginning of the software image in flash, contains the addresses of the interrupt routines that execute automatically on specific events. Thanks to the NVIC, interrupt lines can be assigned priorities so that when a higher-priority interrupt occurs while the routine for a lower-priority interrupt is executing, the current routine is temporarily suspended to allow the higher-priority interrupt to be serviced. This keeps interrupt latency minimal for the signal lines that the system must react to as fast as possible.
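The two mechanisms described above, table lookup and priority-based preemption, can be modeled in plain C. This is a simplified host-side sketch, not startup code: on a real Cortex-M the first table entry holds the initial stack pointer and the second the reset handler, which the model leaves out. The handler names and the four-entry table are assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*isr_fn)(void);

/* Counters updated by the example handlers. */
static unsigned timer_ticks;
static unsigned uart_events;

static void isr_default(void) { /* unexpected interrupt: ignored here */ }
static void isr_timer(void)   { timer_ticks++; }
static void isr_uart(void)    { uart_events++; }

#define NUM_VECTORS 4u

/* Simplified model of a vector table: one handler address per line. */
static const isr_fn vector_table[NUM_VECTORS] = {
    isr_default, isr_timer, isr_uart, isr_default
};

/* What the hardware does on an interrupt: look up the handler and call it. */
void dispatch_irq(unsigned line)
{
    assert(line < NUM_VECTORS);
    vector_table[line]();
}

/* On Cortex-M, a numerically lower priority value is more urgent. A pending
 * interrupt preempts the active one only if it is strictly more urgent. */
int irq_preempts(uint8_t pending_prio, uint8_t active_prio)
{
    return pending_prio < active_prio;
}
```

An interrupt with the same priority value as the one being serviced does not preempt it; it waits until the active routine returns.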

At any time, the software on the target runs in one of two privilege modes: unprivileged or privileged. The CPU has built-in support for privilege separation between system and application software, even providing two different registers for the two separate stack pointers. In Chapter 10, Parallel Tasks and Scheduling, we will examine in more detail how to properly implement privilege separation, as well as how to enforce memory separation within an OS when running application code on the target. On cores that include it, the TrustZone-M isolation mechanism additionally separates a secure world from a non-secure world. This is, for example, used to hide secrets such as private keys from direct access from the non-secure world. In Chapter 11, Trusted Execution Environment, we will learn how to use this mechanism to run code with a different level of trust on the target.
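On Cortex-M, the privilege level of thread-mode code and the stack pointer in use are both selected through the CONTROL register: bit 0 (nPRIV) is set for unprivileged execution, and bit 1 (SPSEL) selects the Process Stack Pointer (PSP) instead of the Main Stack Pointer (MSP). The sketch below only encodes and decodes those two bits on the host; other bits of the register are not modeled, and the function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0)  /* 1 = thread mode runs unprivileged */
#define CONTROL_SPSEL (1u << 1)  /* 1 = thread mode uses the PSP */

/* Build a CONTROL value from the two settings (each must be 0 or 1). */
uint32_t control_make(int unprivileged, int use_psp)
{
    assert(unprivileged == 0 || unprivileged == 1);
    assert(use_psp == 0 || use_psp == 1);
    return (unprivileged ? CONTROL_NPRIV : 0u) |
           (use_psp ? CONTROL_SPSEL : 0u);
}

/* Returns 1 if thread-mode code runs privileged. */
int thread_is_privileged(uint32_t control)
{
    return (control & CONTROL_NPRIV) == 0;
}

/* Returns 1 if thread-mode code uses the process stack pointer. */
int thread_uses_psp(uint32_t control)
{
    return (control & CONTROL_SPSEL) != 0;
}
```

A typical arrangement is for the kernel to keep the MSP and privileged mode for itself, and to run application threads unprivileged on the PSP, which corresponds to a CONTROL value of 0x3 in this model.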

A Cortex-M core is present in many microcontrollers, from different silicon vendors. Software tools are similar for all the platforms, but each MCU has a different configuration to take into account. Convergence libraries are available to hide manufacturer-specific details and improve portability across different models and brands. Manufacturers provide reference kits and all the documentation required to get started, which are intended to be used for evaluation during the design phase, and may also be useful for developing prototypes at a later stage. Some of these evaluation boards are equipped with sensors, multimedia electronics, or other peripherals that extend the functionality of the microcontroller. Some even include preconfigured, third-party “middleware” libraries such as TCP/IP communication stacks, TLS and cryptography libraries, simple filesystems and other accessory components, and modules that can be quickly and easily added to a software project.


Summary

When approaching embedded software requirements, before anything else, you must have a good understanding of the hardware platform and its components. By describing the architecture of modern microcontrollers, this chapter pointed out some of the peculiarities of embedded devices and how developers should efficiently rethink their approach to meeting requirements and solving problems, while at the same time taking into account the features and the limits of the target platform.

In the next chapter, we will analyze the tools and procedures typically used in embedded development, including command-line toolchains and integrated development environments (IDEs). We will see how to organize the workflow and how to effectively prevent, locate, and fix bugs.


Key benefits

  • Identify and overcome challenges in embedded environments
  • Understand and implement the steps required to increase the security of IoT solutions
  • Build safety-critical and memory-safe parallel and distributed embedded systems


Embedded Systems Architecture begins with a bird’s-eye view of embedded development and how it differs from the other systems that you may be familiar with. This book will help you get the hang of the inner workings of various components in real-world systems. You’ll start by setting up a development environment and then move on to the core system architectural concepts, exploring system designs, boot-up mechanisms, and memory management. As you progress through the topics, you’ll explore the programming interface and device drivers to establish communication via TCP/IP and take measures to increase the security of IoT solutions. Finally, you’ll be introduced to multithreaded operating systems through the development of a scheduler and the use of hardware-assisted trusted execution mechanisms. With the help of this book, you will gain the confidence to work with embedded systems at an architectural level and become familiar with various aspects of embedded software development on microcontrollers, such as memory management, multithreading, and RTOS, with an approach oriented to memory isolation.

What you will learn

  • Participate in the design and definition phase of an embedded product
  • Get to grips with writing code for ARM Cortex-M microcontrollers
  • Build an embedded development lab and optimize the workflow
  • Secure embedded systems with TLS
  • Demystify the architecture behind the communication interfaces
  • Understand the design and development patterns for connected and distributed devices in the IoT
  • Master multitasking parallel execution patterns and real-time operating systems
  • Become familiar with Trusted Execution Environment (TEE)

Product Details

Publication date: Jan 13, 2023
Length: 342 pages
Edition: 2nd Edition
Language: English
ISBN-13: 9781803239545

Table of Contents

  • Preface
  • Part 1 – Introduction to Embedded Systems Development
  • Chapter 1: Embedded Systems – A Pragmatic Approach
  • Chapter 2: Work Environment and Workflow Optimization
  • Part 2 – Core System Architecture
  • Chapter 3: Architectural Patterns
  • Chapter 4: The Boot-Up Procedure
  • Chapter 5: Memory Management
  • Part 3 – Device Drivers and Communication Interfaces
  • Chapter 6: General-Purpose Peripherals
  • Chapter 7: Local Bus Interfaces
  • Chapter 8: Power Management and Energy Saving
  • Chapter 9: Distributed Systems and IoT Architecture
  • Part 4 – Multithreading
  • Chapter 10: Parallel Tasks and Scheduling
  • Chapter 11: Trusted Execution Environment
  • Index
  • Other Books You May Enjoy
