FPGA Programming for Beginners

Chapter 1: Introduction to FPGA Architectures and Xilinx Vivado

In the following chapter, we will be exploring Field Programmable Gate Arrays (FPGAs) and the underlying technology that creates them. This underlying technology allows companies such as Xilinx to produce a reprogrammable chip from an Application Specific Integrated Circuit (ASIC) process. We'll then learn how to use an FPGA for a simple task. Whether you want to accelerate mathematically complex operations such as machine learning or artificial intelligence, or simply want to do some projects for fun, such as retro computing or reproducing obsolete video game machines (https://github.com/MiSTer-devel/Main_MiSTer/wiki), this book will jumpstart your journey. There couldn't be a better time to get into this field than the present, even if only as a hobby. Development boards are cheap and plentiful, and vendors have started making their tools available for free for their low cost, smaller parts.

In this book, we are going to build some example designs to introduce you to FPGA development, culminating in a project that can drive a VGA monitor.

By the end of this chapter, you should have a good understanding of an FPGA and its components.

In this chapter, we are going to cover the following main topics:

What is an ASIC?
How does a company create an FPGA?
What makes up an FPGA?
How can we use the Xilinx Vivado tools to design, test, and implement an FPGA

What is an ASIC?

ASICs are the fundamental building blocks of modern electronics – your laptop or PC, TV, cell phone, digital watch, almost everything you use on a day-to-day basis. It is also the fundamental building block upon which the FPGA we will be looking at is built from. In short, an ASIC is a custom-built chip designed using the same language and methods we will be introducing in this book.

FPGAs came about as the technology to create ASICs followed Moore's law (Gordon E. Moore, Cramming more components onto integrated circuits, Electronics, Volume 38, Number 8 (https://newsroom.intel.com/wp-content/uploads/sites/11/2018/05/moores-law-electronics.pdf) – the idea that the number of transistors on a chip doubles every 2 years. This has allowed for both very cheap electronics in the case of mass-produced items containing ASICs and led to the proliferation of lower cost FPGAs.

Why an ASIC or FPGA?

ASICs can be an inexpensive part when manufactured in high volumes. You can buy a disposable calculator, a flash drive for pennies per gigabyte, an inexpensive cell phone; they are all powered by at least one ASIC. ASICs are also sometimes a necessity when speed is of the utmost importance, or the amount of logic that is needed is very large. However, in these cases, they are typically only used when cost is not a factor.

We can break down the costs of developing a product based on an ASIC or FPGA into Non-Recurring Engineering (NRE), the one-time cost to develop a chip, and the piece price for every chip, excluding NRE. Ed Sperling states the following in CEO Outlook: It Gets Much Harder From Here, Semiconductor Engineering, June 3, 2019, https://semiengineering.com/ceo-outlook-the-easy-stuff-is-over/:

"The NRE for a 7nm chip is $25 million to $30 million, including mask set and labor."

These costs include more than just the mask sets, the blueprint for the ASIC if you will, that is used to deposit the materials on the silicon wafers that build the chip. It's also the teams of design, implementation, and verification engineers that can number into the hundreds. Usually factored into ASIC costs are re-spins, which are bug fixes. These are a factor because large, complex devices struggle with first-time success.

Compare this to an FPGA. Fairly complex chips can be developed by a single person, or small teams. Most of the NRE has been shouldered by the FPGA vendor in the design of the FPGA chips, which are known good quantities. What little NRE is left is for tools and engineering. Re-spins are free, except for time, since the chip can be reprogrammed without million-dollar mask sets.

The trade-off is the per part cost. High volume ASICs with low complexity, like the one inside a pocket calculator or a digital watch, can cost pennies. CPUs can run into the hundreds or thousands of dollars. FPGAs, even the most inexpensive Spartan-7, start at a few dollars and the largest and fastest can stretch into the tens of thousands of dollars.

Another factor is tool costs. As we will see later in this chapter, Xilinx provides the Vivado tool suite for free in the form of a webpack for the smaller parts. This speeds adoption, where the barrier to entry is now a computer and a development board. Even the cost of developing more expensive parts is only a few thousand dollars if you need to purchase a professional copy of Vivado. ASIC tools can run into the millions of dollars and require years of training since the risk of failure is so high. As we will see in our projects, we'll make mistakes, sometimes to demonstrate a concept, but the cost to fix it will only be a few minutes of time, mostly to understand why it failed:

Figure 1.1 – Simple ASIC versus FPGA flow

The flow for an ASIC or FPGA is essentially the same. ASIC flows tend to be more linear, in that you have one chance to make a working part. With an FPGA, things such as simulation can become an option, although strongly suggested for complex designs. One difference is that the lab debug stage can also act as a form of simulation by using ChipScope, or similar on-chip debugging techniques, to monitor internal signals for debugging. The main difference is that each iteration through the steps costs only time in an FPGA flow. In this situation, any changes to a fabricated ASIC design requires some number of new mask sets that can run into the millions of dollars.

We've briefly looked at what an ASIC is and why we might choose an ASIC or an FPGA for a given application. Now, let's take a look at how an FPGA is created using an ASIC process.

How does a company create a programmable device using an ASIC process?

The basis of any ASIC technology is the transistor, with the largest devices holding up to a billion. There are multiple ASIC processes that have been developed over the years, and they all rely on 0s and 1s, in other words, on or off transistors. These on or off transistors can be thought of as Booleans, true or false values.

The basis of Boolean algebra was developed by George Bool in 1847. The fundamentals of Boolean algebra make up the basis of the logic gates upon which all digital logic is formed. The code that we will be writing will be at a fairly high level, but it is important to understand the basics and it will give us a good springboard for a first project.

Fundamental logic gates

There are four basic logic gates. We typically write the truth tables for the gates to understand their functionality. A truth table shows us what the output is for every set of inputs to a circuit. Refer to the following example involving a NOT gate.

Important note

In this section, we are primarily discussing only the logical functions. There are equivalent bitwise functions that will be introduced. Logical functions are typically used in if statements, bitwise functions in logic operations.

Assign statement

In SystemVerilog, we can use an assign statement to take whatever value is on the right-hand side of the equal sign to the left-hand side. Its usage is as follows:

assign out = in;

in can be another signal, function, or operation on multiple signals. out can be any valid signal declared prior to the assign statement.

Comments

SystemVerilog provides two ways of commenting. The first is using a double slash, //. This type of comment runs until the end of the line it's located on. The second type of comment is a block comment. Both are shown here:

// Everything here is a comment./* I can span    Multiple    Lines */

if statement

SystemVerilog provides a way of testing conditions via the if statement. The basic syntax is as follows:

if (condition) event

We will discuss if statements in more detail in Chapter 2, Combinational Logic.

Logical NOT (!)

The output of the NOT gate produces the inverse of the signal going in. The function in SystemVerilog can be written as follows:

assign out = !in; // logical or boolean operator

The associated truth table is as follows:

Figure 1.2 – NOT gate representation

The NOT gate is one of the most common operators we will be using:

if (!empty) ...

Often, we need to test a signal before performing an operation. For example, if we are using a First in First Out (FIFO) storage to smooth out sporadic data or for crossing clock domains, we'll need to test whether there is data available before popping it out for use. FIFOs have flags used for flow control, the two most common being full and empty. We can do this by testing the empty flag, as shown previously.

We will go into greater depth in later chapters on how to design a FIFO as well as use one.

Logical AND (&&), bitwise AND (&)

Often, we will want to test whether one or more conditions are active at the same time. To do this, we will be using the AND gate.

The function in SystemVerilog can be written as follows:

assign out = in1 && in0; // logical or boolean operator

The associated truth table is as follows:

Figure 1.3 – AND gate representation

Continuing our FIFO example, you might be popping from one FIFO and pushing into another:

if (!src_fifo_empty && !dst_fifo_full) ...

In this case, you want to make sure that both the source FIFO has data (is not empty) and that the destination is not full. We can accomplish this by testing it via the if statement.

Logical OR (||), bitwise OR (|)

Another common occurrence is to check whether any one signal out of a group is set to perform an operation.

The function in SystemVerilog can be written as follows:

assign out = in1 || and in0; // logical or boolean operator

The associated truth table is as follows:

Figure 1.4 – OR gate representation

Next, we will look at the exclusive OR function.

XOR (^)

The exclusive OR function checks whether either one of two inputs is set, but not both. The function in SystemVerilog can be written as follows:

assign out = in1 ^ and in0; // logical or boolean operator

The associated truth table is as follows:

Figure 1.5 – XOR gate representation

This function is used in building adders, parity, and error correcting and checking codes. In the next section, we'll look at how an adder is built using the preceding gates.

More complex operations

We've looked at the basic components in the previous sections that make up every digital design. Here we'll look at an example of how we can put together multiple logic gates to perform work. For this we will introduce the concept of a full adder. A full adder takes three inputs, A, B, and carry in, and produces two outputs, sum and carry out. Let's look at the truth table:

Figure 1.6 – Full adder

The SystemVerilog code for the full adder written as Boolean logic would be as follows:

assign Sum = A ^ B ^ Cin;
assign Cout = A & B | (A^B) & Cin;

You'll notice that we are using the bitwise operators for AND (&) and OR (|) since we are operating on bits. From this straightforward, yet important example, you can see that real-world functionality can be built from these basic building blocks. In fact, all the circuits in an ASIC or FPGA are built this way, but luckily you don't need to dive into this level of detail unless you really want to thanks to the proliferation of High-Level Design Languages (HDLs), such as SystemVerilog.

Introducing FPGAs

A gate array in ASIC terms is a sea of gates with some number of mask steps that can be configured for a given application. This allows for a more inexpensive product since the company designing the ASIC only needs to pay for the masks necessary for configuring. The FPGA takes this one step further by providing the programmability of the fabric as part of the device. This does result in an increased cost as you are paying for interconnect you are not using and the storage devices necessary to configure the FPGA fabric, but allows for some cost reductions as these parts become standard devices that can be mass produced.

If we look at the functions in the previous section through the adder example, we can see one commonality; they can all be produced using a truth table. This becomes key in FPGA development. We can regard these truth tables as Read Only Memory (ROM) representations of the functions. In fact, we can regard them as Programmable ROMs (PROMs) in the case of building up our FPGA.

Let's take the example of the fundamental logic functions. We can reproduce any of them by utilizing a 2-input lookup table, which could look something like this:

Figure 1.7 – Two input LUT examples

This is an oversimplified example, but what we have are four storage elements, in this case flip-flops, but in the case of an actual FPGA, more likely a much simpler structure utilizing far fewer transistors. The storage elements are connected to one another such that their configuration can be loaded. By attaching other Lookup Tables (LUTs) to the chain, multiple LUTs can be configured at startup, or in the case of partial reconfiguration, during normal operation. By adding a flip-flop, we can see the final structure of the LUT take shape.

The power in the simplicity of the structure is the ability to replicate this design many times over. In the case of modern FPGAs, they are built of many tiles or columns of logic such as this, allowing a much simpler piece to be designed, implemented, and verified, and then replicated to produce the large gate count devices available. This allows for a range of lower cost devices with fewer columns of resources to larger devices with many more, some even using Stacked Silicon Interconnects (SSI), which allows multiple ASIC dies to be attached together via an interconnect substrate.

In 1985, Xilinx introduced the XC2064, what we would consider the first FPGA utilizing an array of 64 3-input LUTs with one flip-flop. The breakthrough with this design was that it was modular and had good interconnect resources. This entire part would be approximately equivalent to 1 Combination Logic Block (CLB) in the Artix-7 we will be targeting.

At the heart of an FPGA is the programmable fabric. The fabric consists of LUTs with associated flip-flops making up slices and ultimately CLBs. These blocks are all connected using a rich topology of routing channels, allowing for almost limitless configuration. FPGAs also contain many other resources that we will explore over the course of this book, block RAMs, Serial-Deserial (SERDES) cores, DSP elements, and many types of programmable I/O.

Exploring the Xilinx Artix-7 and 7 series devices

The FPGAs we will be looking at in this book are the Artix-7 series of devices. These devices are the highest performance per watt of the Xilinx 7 series devices. For a reasonable price, they feature a large amount of relatively high-performance logic to implement your designs. The FPGA components we will introduce here are common in the Spartan (low end), Kintex (mid-range) and Virtex (high end) parts in the 7 series.

Combinational logic blocks

ASICs are made up of logic gates based upon libraries provided by ASIC foundries, such as TSMC or Tower. These libraries can contain everything from AND, OR, and NOT gates to more complicated math cells and storage elements. When developing an FPGA, you will be targeting the same Boolean logic equations as you would in an ASIC. We will be using a very similar flow. However, the synthesis process will target the CLBs of the FPGA:

Figure 1.8 – Xilinx UG474 7 series FPGAs CLB users' guide figure 1-1 (used with permission)

A CLB consists of a pair of slices, each of which contains four 6-input LUTs and their eight flip-flops. Vivado (or optionally a third-party synthesis tool such as Synopsys Synplify) compiles the SystemVerilog code and maps it to these CLB elements. To fully explore the details of the CLB, I would suggest reading Xilinx UG474, 7 Series FPGAs CLB users' guide (https://www.xilinx.com/support/documentation/user_guides/ug474_7Series_CLB.pdf). At a high level, each LUT allows a degree of flexibility such that any Boolean function with 6 inputs can be implemented or two arbitrarily defined 5-input functions if they share common inputs. There is also dedicated high speed carry logic for arithmetic functions, which will be discussed in later chapters.

The slices come in two formats, SLICEL (logic) and SLICEM (memory). SLICEM is a superset of SLICEL. SLICEM adds the ability to configure the SLICE into a distributed RAM or shift register. There are approximately three times the number of SLICELs as SLICEMs. The following table for the two suggested development boards for this book shows the breakdown:

Although it is theoretically possible to instantiate and force the functionality of lower-level components, such as slices or LUTs, this is beyond the scope of this book, and a feature not widely used. We will be targeting CLB usage through Vivado synthesis of the HDL that we write.

Storage

Aside from the SLICEMs that make up the CLBs that can be used as memories or shift registers, FPGAs contain Block RAMs (BRAM) that are larger storage elements. The 7 series parts all have 36 Kb BRAM that can be split into two 18 Kb BRAMs. The following table shows the BRAM available in the parts on the recommended development boards:

BRAMs can be configured as follows:

True dual port memories – Two read/write ports.
Simple dual port memories – 1 read/1 write. In this case, a 36 Kb BRAM can be up to 72 bits wide and an 18 Kb BRAM up to 36 bits wide.
A single port.

Contents of BRAMs can be loaded at initialization and configured via a file or initial block in the code. This can be useful for implementing ROMs or start up conditions.

BRAMs in 7 series devices also contain logic to implement FIFOs. This saves CLB resources and reduces synthesis overhead and potential timing problems in a design. We will go over FIFOs in a later chapter.

All 36 Kb BRAMs have dedicated Error Correction Code (ECC) functions. As this is something more related to high reliability applications, such as medical-, automotive-, or space-based, something we will not go into detail on in this book.

Clocking

7 series devices implement a rich clocking methodology, which can be explored in detail in UG472 7 Series FPGAs clocking resources user guide (https://www.xilinx.com/support/documentation/user_guides/ug472_7Series_Clocking.pdf). For most purposes, our discussion in the PLL section will give you everything you need to know; however, the referenced document will delve into far more detail.

I/Os

For the most part, we will limit ourselves to the I/Os supported by the two targeted development boards. In general, the 7 series devices handle a variety of interfaces from 3.3v CMOS/TTL to LVDS and memory interface types. The boards we are using will dictate the I/Os defined in our project files. For more information on all the supported types, you can reference the UG471 7 Series FPGAs SelectIO resources user guide.

DSP48E1

FPGAs have a large footprint in Digital Signal Processing (DSP) applications that use a lot of multipliers and, more specifically, Multiply Accumulate (MAC) functions. One of the first innovations in FPGAs was to include hard multipliers followed by DSP blocks that could implement MAC functions:

Figure 1.9 – Xilinx UG479 7 series DSP48E1 users' guide figure 1-1 (used with permission)

One of the most expensive operations in an FPGA is arithmetic. In an ASIC, the largest and slowest operation is typically a multiplication operation, and the smaller or faster operation is an add operation. For this reason, for many years, FPGA manufacturers have been implementing hard arithmetic cores in their fabric. This makes the opposite true in an FPGA, where the slower operation is typically an adder, especially as the widths get larger. The reason for this is that the multiply has been hardened into a complex, pipelined operation. We will explore the DSP operator more in later chapters. The UG479 7 Series DSP48E1 user guide (https://www.xilinx.com/support/documentation/user_guides/ug479_7Series_DSP48E1.pdf) is a good reference if you are interested in delving into the details.

Evaluation boards

There is no shortage of FPGA evaluation boards available for us to purchase. One company that makes very affordable boards is Digilent. There are several nice features that their boards tend to include, but one of the best is that they have a USB to UART controller built in that Xilinx Vivado recognizes as a programming cable. This makes configuring the device painless. The recommended boards also have the added advantage of being powered over this same USB cable.

Nexys A7 100T (or 50T)

The Nexys A7 is the recommended board for this book. It has all the devices we'll target over the course of the book:

Figure 1.11 – Digilent Nexys A7 board

The board features are as follows:

Artix-7 XC7A100T or 50T
450+MHz operation
128 MB DDR2
Serial Flash
Built-in USB UART for downloading images and ChipScope debugging
MicroSD card reader
10/100 Ethernet PHY
PWM audio output/microphone input
Temperature sensor
3 axis accelerometer
16 switches
16 LEDs
5 pushbuttons
Two 3 color LED
Two 4-digit 7-segment displays
USB host device support
Five PMOD (one XADC)

Let's take a look at the breakdown of the two devices the Nexys board can be ordered with:

One benefit to choosing the XC7A100T is the additional RAM. Especially when starting out you may find yourself relying on chip debugging using ChipScope and the additional RAM will allow for additional storage for wider busses or longer capture times. We'll discuss ChipScope in a later chapter.

Basys 3

An alternative evaluation board is the Basys 3:

Figure 1.12 – Digilent Basys 3 board

This board has the same pushbuttons, LEDs, and switches, but only half the number of seven segment displays. We'll be developing code that can run on either board using these features. It does lack the DDR2 RAM, so it will limit using this for a framebuffer as we will introduce in a later chapter. It is also missing the temperature sensor, microphone, and audio, which we'll look at regarding serial interfaces. PMOD boards can be purchased that have this functionality, however, to overcome this limitation.

The board features are as follows:

Artix-7 XC7A35T
450+Mhz operation
128MB DDR2
Serial Flash
Built-in USB UART for downloading images and ChipScope debugging
MicroSD card reader
10/100 Ethernet PHY
PWM audio output/microphone input
16 switches
16 LEDs
5 pushbuttons
Two 3-color LEDs
Single 4-digit, 7-segment displays
USB host device support
Four PMODs (one dual purpose supporting XADC)

Let's now take a look at the breakdown of the Basys 3 board:

Important note

The Basys 3 board lacks the DDR 2 memory, accelerometers, and audio capabilities, which will be addressed in later chapters. PMODs are available for everything apart from the DDR2. I would recommend the Nexys A7 over the Basys if possible.

We've just taken a look at the boards we are planning on using for the book. Now we need to take a look at the Xilinx tool, Vivado, which will be what we use to design, simulate, implement, and debug our FPGA designs.

Introducing Vivado

Once you have selected a board, the best way to get to know it is to work through an example design.

Vivado is the Xilinx tool we will be using to implement, test, download, and debug our designs. It can be run as a command-line tool in non-project mode, or in project mode using the GUI. For our purposes, we will be using project mode via the GUI; however, we will go through non-project mode as an introduction in the Appendix.

Vivado installation

Xilinx makes Vivado freely available in the form of a webpack for smaller devices. The webpack contains all the features of the full version with a limitation based on device support. It is available for either Windows or Linux. This book will show screenshots for the Linux version; however, everything is tested on both, so you will be fine with using either.

Important note

The Xilinx webpack forces tool feedback information to Xilinx. The paid version allows this to be disabled.

Perform the following steps to effect installation:

Create an account at https://www.xilinx.com/.
Visit https://www.xilinx.com/support/download.html.
Download the Xilinx Unified Installer. For this book, we'll be using version 2020.1.

On Windows, run the .exe file.

On Linux, use the following commands:

chmod +x Xilinx_Unified_2020.1_0602_1208_Lin64.bin; ./Xilinx_Unified_2020.1_0602_1208_Lin64.bin

Enter your account information for the installation.
When prompted, you can install either Vitis or Vivado. We will not be using Vitis, but it includes Vivado, so if you are adventurous and want to try Vitis out, feel free to install this as well.
When prompted for the devices, you only need the 7 Series.
Pick an installation location or use the default option.

Once you've completed these steps, get a cup of coffee… take a nap… write a book. It is going to take a while.

Directory structure

With Vivado installed, we can now walk through a very simple project to introduce you to Vivado and to make sure everything is set up correctly. The directory structure I like to use looks like the following:

Figure 1.13 – Directory structure

Items in bold are directories. For our first example design, we do not have a lot of code. We will end up creating only three files: the HDL source code, the testbench, and a constraints file.

Inside the hdl directory, we'll create a simple design, logic_ex.sv, to run through Vivado:

Logic_ex.sv

`timescale 1ns/10ps
module logic_ex
  (
   input  wire  [1:0]    SW,
   output logic [3:0]    LED
   );
  assign LED[0]  = !SW[0];
  assign LED[1]  = SW[1] && SW[0];
  assign LED[2]  = SW[1] || SW[0];
  assign LED[3]  = SW[1] ^ SW[0];
  endmodule // logic_ex

First, we'll define the timescale that we will be operating at in the simulator. 1ns/10ps was pretty standard years ago and for what we'll be doing, it will work fine. If you get involved using high-speed transceivers, you may encounter even smaller timescales, such as 1ps/1fs.

Tip

Each module should reside in its own file and the file should be named the same as the module. This can make life easier when using some tools, such as commercial simulators or even custom scripting.

The syntax for defining the timescale is as follows:

`timescale <time unit>/<time precision>

time unit defines the value and unit of delays. time precision specifies the rounding precision. This value can usually be overridden in the simulator and these settings have no effect on synthesis. When using `timescale, it is best to set it in all files:

We define a port list with one input, SW, which is a 2-bit value that we will connect to the two right-most switches on the board. We also define one output named LED, which are four bits that represent the four LEDs above the four right-most switches:

tb.sv

`timescale 1ns/ 100ps;
module tb;
  logic [1:0] SW;
  logic [3:0] LED;
  logic_ex u_logic_ex (.*);
  //logic_ex u_logic_ex (.SW, .LED);
  //logic_ex u_logic_ex (.SW(switch_sig), .LED(led_sig));
  //logic_ex u_logic_ex (.*, .LED(led_sig));

Here we declare a top-level module called tb. Note that the top-level testbench module should not have any ports. We also declare two logic types that we will hook up to the hello world module.

Here, we instantiate logic_ex as an instance, u_logic_ex. There are multiple ways of connecting ports. In the uncommented example, we are using .*, which will connect all ports with the same name as a defined signal in the instantiating module.

The second example (commented out) uses .<name> of the port you wish to connect. It requires the port name to already be defined.

Finally, if there is a signal with a different named signal, we could use the third example, which allows port renaming. It is possible to mix .* with renamed ports, as shown in the final example.

A testbench typically has two distinct parts, the stimulus generator and stimulus checker:

  // Stimulus
  initial begin
    $printtimescale(tb);
    SW = '0;
    for (int i = 0; i < 4; i++) begin
      $display("Setting switches to %2b", i[1:0]);
      SW = i[1:0];
      #100;
    end
    $display("PASS: logic_ex test PASSED!");
    $stop;
  end

The stimulus block is simple because the design we are testing is simple. We can nest it completely in an initial block. When the simulator starts up, the initial block runs serially. First it will print the timescale used in tb.sv. Then, SW input into the logic_ex module is set to 0. Using a '0 in the assignment to SW tells the tool to set all bits to 0. There is also an equivalent '1, which sets all bits to 1 or 'z, which would set all bits to z. Verilog sizing rules say that assigning SW = 0 is equivalent to SW = 32'b0, which would result in a sizing warning. To limit warnings, using '0, '1, or 'z is preferable.

Important note

SystemVerilog is an HDL and this is an important distinction. An HDL must be able to model parallel operations since many or all the slices in an FPGA will be running in parallel all the time. SW = '0; is a blocking assignment. So, the assignment is made before moving on. We will discuss blocking versus non-blocking when we discuss clocked processes.

The stimulus block then loops four times via the for loop. SystemVerilog has the capability of declaring the loop variable within the for loop, in this case i. It is highly advisable to declare it this way to avoid multiple driven net warnings if you are using the same signal in multiple for loops.

Within the for loop, we print out the current setting of the switches using the system task $display. Since we want to display only the 2 bits we are incrementing without leading 0s, we specify 2%b. We then set the value of SW to the lower two bits of i. Although we don't need to, we add in a delay of 100ns by using #100.

We are also using $stop, which will terminate the simulation run when reached.

Important note

We know that the delay is in ns because of the timescale we define in the test.

We also declare a checker block. In any good testbench, the checker block should be self-checking. This means that at the end of the test, we should be able to print whether the test passed or failed, and, if it failed, why. This also means that writing a testbench can often be as involved or even more involved than writing the code for the FPGA implementation. This is beyond the scope of this book, however. All commercial simulators, including the Vivado simulator, also support Universal Verification Methodology, which is a set of SystemVerilog classes and functions specifically for testing HDL designs:

  always @(SW, LED) begin
    if (!SW[0] !== LED[0]) begin
      $display("FAIL: NOT Gate mismatch");
      $stop;
    end
    if (&SW[1:0] !== LED[1]) begin
      $display("FAIL: AND Gate mismatch");
      $stop;
    end
    if (|SW[1:0] !== LED[2]) begin
      $display("FAIL: OR Gate mismatch");
      $stop;
    end
    if (^SW[1:0] !== LED[3]) begin
      $display("FAIL: XOR Gate mismatch");
      $stop;
    end
  end
endmodule

Conversely to the stimulus generation, we want this block to react to events from our design. We accomplish this by using an always block, which is sensitive just to changes on the SW inputs and LED outputs of the design. This is a simple case where we are matching each LED to the corresponding SW values run through their respective expected logic gate. We do this by using!==, which is not equals, but takes x's into account in case there is a bug in the design. We will see more complex testbenches in later chapters.

We are also using the reduction operators, &, |, and ^, which are applied to the two bits of SW. &SW[1:0] is equivalent to SW[0] & SW[1].

Running the example

You will want to copy the files for this book from GitHub at this point or clone the repository.

Loading the design

Let's load the design into Vivado:

Under Windows, locate the Vivado installation and double-click on the Vivado icon. Under Linux, the procedure is as follows:
```
Source <Vivado Install>/settings.sh (or.csh)
Vivado
```
Perform steps 2 and 3 the first time you run Vivado.
Open Xhub Stores:
Figure 1.14 – Xhub Stores
The Xilinx Xhub Stores are a convenient way of adding scripts, board files, and example designs to your Vivado installation.
Install the board files for the example projects.
Select the Boards tab, and then navigate to the Digilent Artix A7 100T or 35T and the Basys 3. You'll notice that there are quite a few commercial boards that easily make their files available for installation:
Figure 1.15 – Adding the Digilent boards
Select the open project and navigate to CH1/build/logic_ex/logic_ex.prj for the Nexys A7 board, or CH1/build/logic_ex/logic_ex_basys3.prj for the Basys 3 board, as shown in the following screenshot:

Figure 1.16 – Open Project window

Once open, you'll see the following:

Figure 1.17 – Vivado main screen for the logic_ex project

The Vivado project window gives us easy access to the design flow and all the information relating to the design. On the left-hand side, we see Flow Navigator. This gives us all the steps we will use to test and build our FPGA image. Currently, PROJECT MANAGER is highlighted. This gives us easy access to the sources in the design and the project summary. The project summary should be empty since we have loaded the project for the first time. On future loads of the project, it will display the information from the previous run.

Important note

To give you a jumpstart, all the projects in this book come complete with pre-set-up project files. Please see the appendix for instructions on setting up the first project in both project mode and non-project mode. This will guide you for setting up your own projects in the future.

Let's explore the sources in the design:

Figure 1.18 – Design sources

Here we can see our design, logic_ex.sv. We also have a set of constraints and we can see the testbench, tb.sv, instantiating logic_ex.sv under simulation sources. You can double-click on any of the files and explore them in the context-sensitive editor built into Vivado. The project is currently set up to reference the files in their current location within the directory structure, so the file can be edited with whatever your favorite editor is.

Looking at Project Summary, we can see the project is currently targeting the Arty A7-100 board.

Running a simulation

First, let's run the Vivado simulator to check the validity of our design.

To do this, click Run Simulation | Run Behavioral Simulation under PROJECT MANAGER. You will see that there are some other options available that are grayed out. These options allow you to run post synthesis or post implementation with or without timing. Behavioral simulation is relatively quick and will accurately represent the function of your design if the code is written properly. I would recommend not running post synthesis or implementation simulation unless you are debugging a board failure and need to accurately test the implemented version of the design as you'll find that the simulations will slow down dramatically.

Running the behavioral simulation will elaborate the design, the first step in the overall flow. The simulation view will take over the Vivado main screen:

Figure 1.19 – Simulation view

The Scope screen gives us access to the objects within a given module. In this case, within the testbench (tb), we can see two signals, SW[1:0] and LED[3:0]. I've added them to the waves and expanded the view:

Figure 1.20 – Wave view

The wave view allows us to look at the signals in the design and how they are behaving as the simulation progresses. This will be the most widely used feature of the simulator when debugging problems. We can see the SW signal incrementing due to the for loop in the testbench. Correspondingly, we see the LED values change. The current display is in hex, but it is possible to change it to binary or, by clicking on the > symbol to the right of the signal, to display the individual bits of the signal. Also notice that each change in the signals corresponds to a 100ns time advance. This is due to the #100 we are using to advance time and the timescale setting.

The final window is the most important for a self-checking testbench:

Figure 1.21 – Tcl Console

The Tcl console will display all the outputs from $display, or assertions in the design. In this case, we can see the output from our $printtimescale(tb) function as 1ns/ 100ps. We also see the values that the switches are set to and can see within the waves the same values. Finally, we see PASS: logic_ex test PASSED!, giving us the result of the test. I would encourage you to experiment with the testbench. Change the operators or invert them to verify that the test fails if you do. This exercise will give you confidence that the testbench is functioning correctly.

The goal of verification is not to ensure the design passes; it is to try to make it fail. This is a simple case, so it is not really possible, but make sure that you test unexpected situations to make sure your design is robust.

Tip

It is advisable to adopt a convention in how you indicate tests passing and failing. This test is simple. However, a much more robust test suite for an actual design may have random stimulus and many targeted tests. Adopting a convention such as displaying the words PASS and FAIL allows for easy post-processing of test results.

Implementation

Now that we have confidence that the design works as intended, it is time to build it and test it on the board.

First, let's look at the .xdc file. Click back on Project Manager in Flow Navigator, and then expand the constraints and double-click on the xdc file.

The following lines should be uncommented out for the Arty A7-100T to set the configuration voltages:

set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]

set_property is the tcl command, which will set a given design property used by Vivado. In the preceding command, we are setting CFGBVS and CONFIG_VOLTAGE to the values required by the Artix-7 FPGA.

The following code block sets up the switch and LED locations (placed together for convenience):

set_property -dict { PACKAGE_PIN J15   IOSTANDARD LVCMOS33 } [get_ports { SW[0] }]; #IO_L24N_T3_RS0_15 Sch=sw[0]
set_property -dict { PACKAGE_PIN L16   IOSTANDARD LVCMOS33 } [get_ports { SW[1] }]; #IO_L3N_T0_DQS_EMCCLK_14 Sch=sw[1]
set_property -dict { PACKAGE_PIN H17   IOSTANDARD LVCMOS33 } [get_ports { LED[0] }]; #IO_L18P_T2_A24_15 Sch=led[0]
set_property -dict { PACKAGE_PIN K15   IOSTANDARD LVCMOS33 } [get_ports { LED[1] }]; #IO_L24P_T3_RS1_15 Sch=led[1]
set_property -dict { PACKAGE_PIN J13   IOSTANDARD LVCMOS33 } [get_ports { LED[2] }]; #IO_L17N_T2_A25_15 Sch=led[2]
set_property -dict { PACKAGE_PIN N14   IOSTANDARD LVCMOS33 } [get_ports { LED[3] }]; #IO_L8P_T1_D11_14 Sch=led[3]

The set_property commands create a tcl dictionary (-dict) containing PACKAGE_PIN and IOSTANDARD for each port on the design. We use the get_port TCL command to return a port on the design. # is a comment in the TCL.

The pin locations and I/O standards are defined by the board manufacturer. They have used 3.3 V I/Os and the pins are as specified.

The steps to generate a bitstream are as follows:

Synthesis: Map SystemVerilog to an intermediate logic format for optimizing.
Implementation: Place the design, optimize the place results, and route the design.
Generate bitstream: Generate the physical file to download to the board.

These can be run individually. You might take this route if you need to look at the intermediate results to see how the area or timing is coming out, or if you are designing a custom board and need to do pin planning. In our case, we can click directly on Generate Bitstream and allow it to run all the steps automatically for us. Allow it to use the defaults. When complete, open the implemented results:

Figure 1.22 – Project Summary

Here we can see the summary of our implementation. We are using 2 LUTs and 6 I/Os (SW + LED). There is no timing since this design is purely combinational, otherwise we'd see more information regarding timing numbers.

If we click the Device tab, we can get a picture of how the device is being used:

Figure 1.23 – Device view

Here we can see the little white dot midway down the left-hand side. This represents where the LUTs are being placed.

Program the board

You have made it to the end of the chapter and now it's time to see the board in action:

Make sure it is plugged in and turned on.
Now, click on Open hardware manager, the last option under Flow Navigator. The hardware manager view will open in the main window.
Click Open target | Autoconnect.
Now, select the program device. The bitstream should be selected automatically. The lights will go out on the board for a few seconds and then, if the left two switches are down, you will be greeted with this:
Figure 1.24 – Board bringup
Flip the switches, and go through 00, 01, 10, 11, where 0 is down, 1 is up. Do the lights match the simulation? Do they match what you think they should be? Do you occasionally see one flicker as the switches are flipped? The last question will be answered in Chapter 3, Counting Button Presses.

Congratulations! You've completed your first project on an FPGA board. You've taken the first step on this journey and reconfigured the hardware in the FPGA to do some simple tasks. As we go through the book, the tasks will become more complex and more interesting and soon you'll be able to build upon this to create your own projects.

Filter reviews by

All

Feefo verified reviews

Amazon verified reviews

Stephen Eddy Mar 05, 2021

I've been a software developer in various languages for over 30 years, but FPGA development is unlike software development at a fundamental level, and it has taken me quite a while to come to terms with the difference. There are very few books on SystemVerilog that are aimed at a beginner to the subject, and even fewer that take the trouble to cover both the tooling used for development as well as the actual Verilog code needed for different logic function. The vast majority are aimed at electrical engineers with a good understanding of logic already, that want to transfer that knowledge to FPGA development.This book is different. It covers the basics of FPGA development including sequential logic, combinational logic and state machines along with more complex real world applications such as interfacing with external memory, integrating sensor data and generating VGA signals. It also covers how to use the various Xilinx IP's rather than reinventing the wheel for more complex functions and how to use AXI interfaces to communicate between different modules in a standardised way. It even covers maths operations and the unique features of SystemVerilog that are often overlooked in FPGA texts.This is an excellent introduction to a complex subject that I would recommend to anybody with an interest in the subject.

Amazon Verified review

Alan Apr 01, 2021

I highly recommend this book for beginner and intermediate verilog programmers and learners.I am a computer engineer by training, but have spent most of my career writing software. Recently I became more interested in building circuit boards, soldering, and filling in my hardware knowledge. A couple of years ago I came across the MiSTer FPGA project, and decided that I wanted to be able to learn verilog to help fix things and contribute new cores. For a software engineer, getting good at verilog can have a steep learning curve. Hardware is inherently parallel, and understanding the flow of verilog is different then traditional programming languages, even though the syntax is similar.FPGA Programming for Beginners is a great place to start, or improve your verilog skills. When you write your first design it is often a struggle to get it working. This book is very good at describing hands on, step by step how to simulate and debug the project on hardware. It will take you through the xilinx software and teach you how.As a beginner you often end up with a design that works intermittently. Either in simulation, but not on an FPGA board, or one one board, but not another. This book has practical tips and sections on how to catch many of the recurring problems in FPGA design.This book is not only filled with basics, but describes important details that are often "gotchas" when building real projects. For example, timing constraints are explained clearly and strategies to fix them are described. Also, the explanation of synchronizing data across clock domains was the first one I read that made sense to me. The state machine chapter taught me tricks to be able to debug state machines on the board much more easily. It also clearly described the different design patterns of state machines in verilog. Another chapter explains how to infer different kinds of ram on different FPGA boards.Overall this isn't a book that just describes verilog syntax. The book is filled with answers to real world problems and questions you would see in an interview for a job in the industry writing verilog.

Michael Collins Mar 15, 2021

As a professional programmer, the book was an excellent introduction into the world of digital logic design with FPGAs. It does well covering the basics and getting into deeper topics. I did some FPGA work in college and have taken a week long course or two during my career and this book, hands down, was more informative on the topic. Looking forward to learning more about FPGA and systemverilog in the future. Hopefully the author follows up with a book on more advanced topics. Would definitely recommend this book for beginners and even people intermediate skill levels.

Dan May 14, 2021

This is not "Verilog for Dummies" or a $100 text book. Its a book by someone who knows a lot about the subject trying to help you understand it better. It got me (retired programmer looking for new thrills) past the initial hurdles of using the software on an actual board. The prose is not always clear and I have noted a few typos and general sloppiness (book calls it Project 1 but then you end up loading the example named project_2). The author tries to include useful techniques (e.g. parameters) in the midst of an example for combinational logic which would be very helpful if it didn't confuse me. Ignorance tends to make me creative at misunderstanding things. At this price point I think this book is a good value, but I'm going to buy the $100 text book and get a better grounding in Verilog before I return to it.

Old-and-Wise Jan 13, 2024

I'm studying EE and consider myself still a beginner in FPGA even though I've passed courses in digital design and computer architecture. So I bought this book (the Kindle edition) in hope of learning how to program FPGAs. I haven't finished the book yet but I have to say I'm quite disappointed, because this book is NOT for beginners; it's really hard to understand and follow.A quick word on the Kindle version first: it's really POORLY formatted and make the difficult materials even harder to fathom. I should have bought the paperback edition. Luckily, the publisher lets you get a free PDF if you purchase either the paper or Kindle version; just fill out a form on their website. So now I've imported the (free, official) PDF into the Books app on my iPad, which makes reading and studying so much easier on the eye (if still not on the mind), plus I can use the Apple Pencil to mark up and annotate, which I do aplenty in hope of understanding all the materials in this book.The problems with the book are several fold. First, the author is really not good at explaining things, and his writing does not flow well or even seem logical at many places. There're also a lot of errors including misspelled commands in the text! What irks me the most is he often refers to files or chapters/sections that either don't exist or are misnamed, which adds to the agony of following along the projects. For example, he mentions an appendix in Chapter 1 about running Vivado from command line but there's no such appendix in either the Kindle edition or the PDF version. Or, is the command ROUTE or ROUTING?? Lots of what seem careless mistakes, and it did not go through technical reviews or a rigorous editing or even proofreading process.Code segments are usually not explained clearly and it takes me up to a dozen re-reads and tons of annotations and cross-checking on the web to understand what he's doing in each project, or even a small part of a project. He would do better by printing out each code file (even if in a small font like in Harris and Harris) and go through the code section by section, rather than talking about one section but you are expected to understand both how it works and how it places in the context of the module (he codes one module per SV file).Anyway, there're tons of problems with this book being presented as a beginning's guide. If they're going to do a 2nd edition, I hope the author and the editor (if tehre is one) will take some time and effort to make the presentation actually understandable by a beginner -- go find a beginner and ask them to study the draft first!

FPGA Programming for Beginners: Bring your ideas to life by creating hardware designs and electronic circuits with SystemVerilog

What do you get with Print?

Contact Details

Shipping Address

Billing Address

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Contact Details

Shipping Address

Billing Address

Product Details

Packt Subscriptions

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access