In this book, we are going to learn about RSpec in depth. But first, we need to lay some foundations. This chapter will introduce some important information that will prepare us for our exploration of RSpec.
First, we'll discuss the exciting promise of automated tests. We'll also discuss some of the pitfalls and challenges that are common when writing tests for real-world apps.
Next, we'll introduce the concept of testability, which will stay with us throughout this book. We'll then go over the technical assumptions made in the book.
We'll then start writing some simple unit tests with RSpec and explore the basic concepts of unit and test. We'll also start thinking about the usefulness of our tests and compare the cost of testing with its benefits.
Finally, we'll learn about two popular software methodologies: test-driven development (TDD) and behavior-driven development (BDD).
When I first learned about automated tests for software, it felt as if a door to a new world had opened up. Automated tests offered the promise of scientific precision and engineering rigor in software development, a process which I thought was limited by its nature to guesswork and trial and error.
This initial euphoria lasted less than a year. The practical experience of creating and maintaining tests for real-world applications gave me many reasons to doubt the promise of automated tests. Tests took a lot of effort to write and update. They often failed even though the code worked. Or the tests passed even when the code did not work. In either scenario, much effort was devoted to testing, without much benefit.
As the number of tests grew, they took longer and longer to run. To make them run faster, more effort had to be devoted to optimizing their performance or developing fancy ways of running them (on multiple cores or in the cloud, for example).
But even with many, many tests and more lines of test code than actual application code, many important features had no tests. This was rarely due to negligence; more often, it was due to the difficulty of testing many aspects of real-world software.
Finally, bugs still popped up and tests were often written as part of the bug fix to ensure these bugs would not happen again. Just as generals always prepare for the last war, tests were ensuring the last bug didn't happen without helping prevent the next bug.
After several years of facing these challenges, and addressing them with various strategies, I realized that, for most developers, automated tests had become a dogma, and tests were primarily written for their own sake.
To benefit from automated tests, I believe one must consider the cost of testing. In other words, the effort of writing the test must be worth the benefits it offers. What I have learned is that the benefit of tests is rarely to directly prevent bugs, but rather to contribute to improved code quality and organization, which, in turn, will lead to more reliable software. Put another way, although automated tests are closely tied to quality assurance, their focus should be on quality, not assurance. This is just common sense if you think about it. How can we give assurance with automated (or manual) tests that a real-world piece of software, composed of thousands of lines of code, will not have bugs? How can we predict every possible use case, and how every line of code will behave?
Another issue is how to write tests. A number of challenges arise when testing complex applications in the real world. Should you use fixtures or mocks to test models? How should you deal with Rack middleware in controller tests? How should you test code that interacts with external APIs and services? This book offers the essentials required to solve problems like these with RSpec, the popular Ruby testing library.
The goal of this book is to help you effectively leverage RSpec's many features to test and improve your code. Although we will limit ourselves to the most pertinent options, I encourage you to consult the official RSpec documentation (http://rspec.info/documentation/) to learn more about all the possible options. You should find it easy to build upon the examples here to develop a custom solution that exactly meets your own needs and preferences.
A fundamental concept that unites the chapters of this book is testability. When code is testable, we have confidence in its architecture and implementation. We can test it thoroughly with ease. Bugs are quickly detected and easily fixed. The first step to improving testability in an application is to establish a natural feedback loop between application code and test code, using signals from testing to improve application code. The energy devoted to writing complex tests for untestable code should be channeled into making the code more testable, allowing simpler tests to be written. With this feedback loop and focus on testability, tests contribute to code quality and application reliability.
Testability is not a binary quality. When looking at a given software system, we should ask, "How testable is this?", rather than trying to categorize it as testable or not testable. This requires judgment and common sense. As our features and priorities evolve, so must our criteria for testability. For example, let's consider a new web application with a small number of users, which has all kinds of automated tests for important features but none for testing performance under high load. This system can be considered to have high testability as long as we have few users and performance is not yet a concern. Once the web application becomes very popular and we need to serve millions of requests a day, we would have to change our judgment to say that the system now has very low testability. What use are all the tests that aren't related to performance if none of our users can reach our website because we cannot serve requests fast enough?
Testability should be achieved with efficiency. We need to figure out which features to test and not spend too much effort on tests that don't offer much value. As with testability, efficiency is not static and we must adjust the criteria for it as software evolves.
We can define testability as the degree to which a system can be verified to work as expected. At the smallest level, closest to the individual lines of code that make up our software, we are concerned with whether functions return the values we expect. At higher levels of abstraction, we are concerned with behaviors such as error handling, performance, and the correctness of entire end-to-end features. Let's keep in mind that testability includes manual tests as well. Manual testing is a normal part of development and quality assurance. If an aspect of a software system cannot be tested manually, it is very likely that it will be quite difficult to test it using automated tools as well.
Often, developers struggle to automate manual tests for a system with low testability. This common pitfall leads to high-cost, low-value tests and a system whose architecture and organization are not improved by the testing efforts. Our focus in this book will be on improving testability using automated tests written with RSpec. We will make both manual and automated tests better, with less effort required to create and maintain our tests. Both the architecture and organization of our system will benefit. By diverting some of our testing energy to improving the testability of the code, we will be engaged in a positive feedback loop, whereby our effort devoted to testing provides a meaningful benefit without excessive cost.
This book assumes that the reader is comfortable reading and writing Ruby code. Familiarity with RSpec is strongly recommended, though a total beginner to RSpec should find it possible to understand most of the recipes with the help of the online RSpec documentation. Each code example has been tested and works. I have used the latest stable versions available at the time of writing: Ruby 2.3.0 with RSpec 3.4.0.
RSpec 3 uses a different syntax from RSpec 2. Version 2.11 introduced a new syntax for assertions while 2.14 introduced a new syntax for doubles and expectations. RSpec 3.0 introduced a number of new features and changes as well. I have used the new syntax and features throughout the book:
    require 'rspec'

    describe 'new RSpec syntax' do
      it "uses the new assertion syntax" do
        # new                        # deprecated
        expect(1 + 1).to eq(2)       # (1 + 1).should == 2
      end

      context "mocks and expectations" do
        let(:obj) do
          # new                      # deprecated
          double('foo')              # obj = mock('foo')
        end

        it "uses the new allow syntax for mocks" do
          # new                        # deprecated
          allow(obj).to receive(:bar)  # obj.stub(:bar)
        end

        it "uses the new expect syntax for expectations" do
          # new                        # deprecated
          expect(obj).to receive(:baz) # obj.should_receive(:baz)

          obj.baz
        end
      end
    end
Let's get started writing our first RSpec spec file before we delve deeper into the concepts of the unit and the assertion. First, let's try an empty file. What will happen if we create an empty file called empty.rb and try to run it as an RSpec spec file? On a POSIX (Portable Operating System Interface) based operating system, such as Linux or OS X, we could do the following:
    touch empty.rb
    rspec empty.rb
    echo $?

We can see that RSpec correctly reports that there are no examples in the file. However, we also notice that RSpec reports that there are zero failures, which is, strictly speaking, correct. Finally, the last line shows the exit status of the rspec empty.rb command. An exit status of zero (0) indicates success on POSIX systems, which means that our empty test succeeded.
This seems a bit odd. There isn't a bug in RSpec, and we haven't made any typos. It's important to keep this simplest of cases in the back of our minds, even as we start building very complex specs. This empty test is useless and doesn't serve any purpose.
Let's move on to an actual spec file now. We'll create a file called hello_world.rb and put the following content in it:

    require 'rspec'

    describe 'hello world' do
      it 'returns true' do
        expect('hello world').to eq('hello world')
      end
    end
Before we run this, let's have a look at what's in the file. Let's start from the inside out. The expect method declares an assertion, which is then specified with the to method together with the eq method. There are a number of matchers in RSpec, the most common of which is eq, which matches equality. Going out one layer, we see the it method, which is how we declare an example in RSpec. Finally, the describe method allows us to group one or more examples. We need to have at least one describe block, and we can nest describe blocks when we need more than one level of grouping.
Now we'll run the spec and see what we get back:
    rspec hello_world.rb

The spec passed again, and we see RSpec correctly detected that there was a single example in the file. The single dot on the first line of output looks odd when running a single spec, but it is a useful progress indicator when running a large number of specs, as there is one green dot for every passing spec and one red F for every failing spec.
Now, let's add a failing spec to see what the output looks like. We'll create a new file called hello_and_bye.rb with the following content:

    require 'rspec'

    describe 'hello and bye' do
      it 'returns true' do
        expect('hello').to eq('hello')
      end

      it 'fails' do
        expect('bye').to eq('hello')
      end
    end

Then we'll run the rspec command on it:
    rspec hello_and_bye.rb
    echo $?

This time we see that RSpec reports the failure, along with an explanation. We also notice that the exit status is no longer 0, but 1, which indicates failure. Any automated tools, such as continuous integration servers, would rely on that exit status to decide if our tests passed or failed, then react accordingly.
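As a rough sketch of how such a tool might react, the following Ruby script runs a command and branches on its exit status. The exit_status_of helper is hypothetical, and the child Ruby processes are stand-ins for a real test run; in practice the command would be something like rspec hello_and_bye.rb.

```ruby
require 'rbconfig'

# Hypothetical helper: run a command and return its exit status.
def exit_status_of(*command)
  system(*command)   # true on status 0, false on non-zero, nil if it could not run
  $?.exitstatus
end

# Stand-ins for a test run: child Ruby processes that exit with a chosen status.
passing = exit_status_of(RbConfig.ruby, '-e', 'exit 0')
failing = exit_status_of(RbConfig.ruby, '-e', 'exit 1')

[passing, failing].each do |status|
  if status.zero?
    puts "exit status #{status}: mark the build as passing"
  else
    puts "exit status #{status}: mark the build as failing"
  end
end
```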
Now that we've seen some very rudimentary examples, let's remind ourselves of that first spec, the empty file. Is either hello_world.rb or hello_and_bye.rb any better than the empty file? Like the empty file, neither of these small spec files tests anything. We haven't even loaded any of our own code to test. But we've had to spend some effort to write the specs and haven't gotten anything in return. What's worse is that hello_and_bye.rb is failing, so we have to put in a little effort to fix it if we want our test suite to pass. Is there a point to fixing that failure?
These questions may seem absurd. However, developers writing tests will face such problems all the time. The question is, should we even write a test? The answer is not clear. The empty file represents that situation when we skip writing a test. The other two files represent cases where we've written useless tests, and where we have to spend time fixing a useless test in order to keep our test suite passing.
As we delve into RSpec, we will write specs that are very complex. Nevertheless, the fundamental issue will be exactly the same one that we faced with the empty file, hello_world.rb, and hello_and_bye.rb. We have to write tests that are useful and avoid wasting energy on writing and maintaining tests that don't serve a good purpose. The situation will be more nuanced, a matter of degrees of usefulness. But, in short, we should always consider the option of not writing a test at all!
What is a unit of code? A unit is an isolated collection of code. A unit can be tested without loading or running the entire application. Usually, it is just a function. It is easy to determine what a unit is when dealing with code that is well organized into discrete and encapsulated modules. On the other hand, when code is splintered into ill-defined chunks that have cross-dependencies, it is difficult to isolate a logical unit.
What is a test? A test is code whose purpose is to verify other code. A single test case (often referred to as an example in the RSpec community) consists of a set of inputs, one or more function calls, and an assertion about the expected output. A test case either passes or fails.
What is a unit test? It is an assertion about a unit of code that can be verified deterministically. There is an interdependency between the unit and the test, just as there is an interdependency between application code and test code. Finding the right unit and writing the right test go hand in hand, just as writing good application code and writing good test code go hand in hand. All of these activities occur as part of the same process, often at the same time.
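These definitions can be illustrated without any testing library at all. In the sketch below, normalize_postal_code is a hypothetical unit invented for this illustration, and run_test_case is a hand-rolled test case made of an input, a function call, and an assertion about the output.

```ruby
# The unit: a small, isolated function (hypothetical, for illustration only).
def normalize_postal_code(raw)
  raw.to_s.strip.upcase.delete(' ')
end

# A hand-rolled test case: input, function call, assertion on the output.
# It either passes or fails, and the same input always gives the same result.
def run_test_case(input, expected)
  actual = normalize_postal_code(input)
  actual == expected ? :pass : :fail
end

puts run_test_case(' k1a 0b1 ', 'K1A0B1')  # expectation agrees with the unit
puts run_test_case('90210', '9021')        # expectation disagrees with the unit
```

RSpec's it, expect, and eq play the same roles, with better reporting and far less boilerplate.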
Let's take the example of a simple piece of code that validates addresses. We could embed this code inside a User model that manages a record in a database for a user, like so:

    class User
      ...
      def save
        if self.address.street      =~ VALID_STREET_ADDRESS_REGEX &&
           self.address.postal_code =~ VALID_POSTAL_CODE_REGEX    &&
           CITIES.include?(self.address.city)                     &&
           REGIONS.include?(self.address.region)                  &&
           COUNTRIES.include?(self.address.country)
          DB_CONNECTION.write(self)
          true
        else
          raise InvalidRecord.new("Invalid address!")
        end
      end
      ...
    end
Writing unit tests for the preceding code would be a challenge, because the code is not modular. The separate concern of validating the address is intertwined with the concern of persisting the record to the database. We don't have a separate way to only test the address validation part of the code, so our tests would have to connect to a database and manage a record, or mock the database connection. We would also find it very difficult to test for different kinds of error, since the code does not report the exact validation error.
In this case, writing a test case for the single User#save method is difficult. We need to refactor it into several different functions. Some of these can then be grouped together into a separate module with its own tests. Finally, we will arrive at a set of discrete, logical units of code, with clear, simple tests.
So what would a good unit look like? Let's look at an improved version of the User#save method:

    class User
      def valid_address?
        self.address.street      =~ VALID_STREET_ADDRESS_REGEX &&
        self.address.postal_code =~ VALID_POSTAL_CODE_REGEX    &&
        CITIES.include?(self.address.city)                     &&
        REGIONS.include?(self.address.region)                  &&
        COUNTRIES.include?(self.address.country)
      end

      def persist_to_db
        DB_CONNECTION.write(self)
      end

      def save
        if valid_address?
          persist_to_db
          true
        else
          false
        end
      end

      def save!
        # raise needs parentheses when used as the right-hand side of ||
        self.save || raise(InvalidRecord.new("Invalid address!"))
      rescue InvalidRecord
        raise # let validation failures propagate unchanged
      rescue => e
        raise FailedToSave.new("Error saving address: #{e.inspect}")
      end
      ...
    end
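With validation separated from persistence, valid_address? can be exercised without a real database. The following is only a minimal sketch under stated assumptions: the constant values, the Address struct, and the FakeConnection class are all invented for illustration, and the User class is trimmed to the methods under discussion.

```ruby
# Hypothetical stand-ins for the constants the User class relies on.
VALID_STREET_ADDRESS_REGEX = /\A\d+ [A-Za-z ]+\z/
VALID_POSTAL_CODE_REGEX    = /\A\d{5}\z/
CITIES    = ['Anytown']
REGIONS   = ['CA']
COUNTRIES = ['USA']

# Hypothetical in-memory replacement for the database connection.
class FakeConnection
  attr_reader :written

  def initialize
    @written = []
  end

  def write(record)
    @written << record
  end
end

DB_CONNECTION = FakeConnection.new

Address = Struct.new(:street, :postal_code, :city, :region, :country)

class User
  attr_reader :address

  def initialize(address)
    @address = address
  end

  def valid_address?
    address.street      =~ VALID_STREET_ADDRESS_REGEX &&
    address.postal_code =~ VALID_POSTAL_CODE_REGEX    &&
    CITIES.include?(address.city)                     &&
    REGIONS.include?(address.region)                  &&
    COUNTRIES.include?(address.country)
  end

  def persist_to_db
    DB_CONNECTION.write(self)
  end

  def save
    valid_address? ? (persist_to_db; true) : false
  end
end

good = User.new(Address.new('123 Any Street', '90210', 'Anytown', 'CA', 'USA'))
bad  = User.new(Address.new('123 Any Street', '90210', 'Nowhere', 'CA', 'USA'))

# Validation can be checked on its own, with no persistence involved.
puts "good address valid? #{good.valid_address? ? 'yes' : 'no'}"
puts "bad address valid?  #{bad.valid_address? ? 'yes' : 'no'}"

# Persistence can be checked separately through the fake connection.
good.save
bad.save
puts "records written: #{DB_CONNECTION.written.size}"
```

Each concern now has its own small entry point, which is what makes simple, focused tests possible.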
Therefore, we write unit tests for two distinct reasons: first, to automatically test our code for correct behavior, and second, to guide the organization of our code into logical units.
Automated testing has evolved to include many categories of tests (for example, functional, integration, request, acceptance, and end-to-end). Sophisticated development methodologies have also emerged that are premised on automated verification, the most popular of which are TDD and BDD. The foundation for all of this is still the simple unit test. Code with good unit tests is good code that works. You can build on such a foundation with more complex tests. You can base your development workflow on such a foundation.
However, you are unlikely to get much benefit from complex tests or sophisticated development methodologies if you don't build on a foundation of good unit tests. Further, the same factors that contribute to good unit tests also contribute, at a higher level of abstraction, to good complex tests. Whether we are testing a single function or a complex system composed of separate services, the fundamental questions are the same. Is the assertion clear and verifiable? Is the test case logically coherent? Are the inputs and outputs precisely specified? Are error cases considered? Is the test readable and maintainable? Does the test often provide false positives (the test passes even though the system does not behave correctly) or false negatives (the test fails even though the system works correctly)? Is the test providing value, or is it more trouble than it's worth?
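To illustrate the false positive case, here is a small sketch with a deliberately buggy unit. The valid_postal_code? function and both checks are hypothetical; the point is only that a weak assertion can pass while the code is wrong, and that a more precisely specified one exposes the problem.

```ruby
# A deliberately buggy unit (hypothetical): it is meant to accept only
# five-digit codes, but it merely checks the length.
def valid_postal_code?(code)
  code.to_s.length <= 5
end

# Weak test: checks a single input that happens to work,
# so it passes even though the unit is wrong (a false positive).
weak_test_passes = (valid_postal_code?('90210') == true)

# Stronger test: also pins down an input that must be rejected.
strong_test_passes = (valid_postal_code?('90210') == true) &&
                     (valid_postal_code?('abc') == false)

puts "weak test passes:   #{weak_test_passes}"
puts "strong test passes: #{strong_test_passes}"
```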
In summary, testing begins and ends with the unit test.
We have discussed a lot of theory; now, let's start applying it. We'll write a few specs for the AddressValidator module defined below:

    module AddressValidator
      FIELD_NAMES = [:street, :city, :region, :postal_code, :country]
      VALID_VALUE = /^[A-Za-z0-9\.\# ]+$/

      class << self
        # True only when every field is present, non-empty, and well formed.
        def valid?(o)
          normalized = parse(o)
          FIELD_NAMES.all? do |k|
            v = normalized[k]
            !v.nil? && v != "" && valid_part?(v)
          end
        end

        # Names of the required fields that are absent from the address.
        def missing_parts(o)
          normalized = parse(o)
          FIELD_NAMES - normalized.keys
        end

        private

        # Accepts a comma-separated String or a Hash keyed by field name.
        def parse(o)
          if (o.is_a?(String))
            values = o.split(",").map(&:strip)
            Hash[ FIELD_NAMES.zip(values) ]
          elsif (o.is_a?(Hash))
            o
          else
            raise "Don't know how to parse #{o.class}"
          end
        end

        def valid_part?(value)
          value =~ VALID_VALUE
        end
      end
    end
We'll store the code above in a file called address_validator.rb. Let's start with a couple of simple tests in this chapter. In the next chapter, we'll explore a few different ways to expand and improve these tests, but for now we'll just focus on getting up and running with our first real RSpec tests.
We'll put the following code in a file called address_validator_spec.rb in the same folder as address_validator.rb:

    require 'rspec'
    require_relative 'address_validator'

    describe AddressValidator do
      it "returns false for incomplete address" do
        address = { street: "123 Any Street", city: "Anytown" }
        expect(
          AddressValidator.valid?(address)
        ).to eq(false)
      end

      it "missing_parts returns an array of missing required parts" do
        address = { street: "123 Any Street", city: "Anytown" }
        expect(
          AddressValidator.missing_parts(address)
        ).to eq([:region, :postal_code, :country])
      end
    end
Now, let's run RSpec (make sure you have it installed already!) like this:

That's it. We used a couple of options to format the output, which is self-explanatory. We'll dig deeper into how to run specs with various options in future chapters. For now, we've accomplished our goal of running RSpec for a couple of unit tests.
Now is a good time to reflect on the concepts of testability and the unit of code. How testable is our AddressValidator module? Do you see any potential problems? What about the units we've tested? Are they isolated and modular? Do you see any places where we could do better? Take some time to review the code and think about these questions before moving on to the next section.
It seems to make sense to write your code first and then to test it, as we did in our AddressValidator example above. Many people follow this approach. However, many others follow a process called TDD, where the tests are written first. Why do this? Let's take a brief aside before answering the question.
If you look at RSpec's official documentation, you will find that instead of the word test, the word example is used to describe the individual assertions to be found within the it block. Although it may appear less natural than test, in some ways example is more accurate. Automated tests rarely provide conclusive proof that a software system, or even just one of its functions, works. Most often, they contain a few test cases, which are nothing but examples of the code in action. Moreover, one of the main benefits of an automated assertion is to document the way the code behaves. Whereas test suggests a proof of correctness, example just suggests an instance of the code in action.
Coming back to the question of why someone would write their test before their code, we can apply the concept of the example. A methodical software engineer could benefit from documenting the code about to be written with some examples. Rather than adding these as comments in the code, the documentation can be written in the form of automated tests, or assertions. This way, as the code is being written, the tests can be run to give some feedback about how close, or how far, the code is to performing as initially expected.
If we refer to RSpec's home page, there is a link provided (https://relishapp.com/rspec), where we can read the following description:
RSpec is a Behaviour-Driven Development tool for Ruby programmers. BDD is an approach to software development that combines Test-Driven Development, Domain Driven Design, and Acceptance Test-Driven Planning. RSpec helps you do the TDD part of that equation, focusing on the documentation and design aspects of TDD.
We see that TDD is mentioned, but the first sentence identifies RSpec with BDD. Although a definition is given, it refers to three other methodologies, leaving us perhaps with only a vague impression of some fancy approach to software development. So what is BDD really?
BDD is an extension of the concepts of TDD to the complete functioning of a software system. Indeed, according to some proponents, BDD is a method for operating an entire organization!
Whereas TDD is concerned with tests and code, BDD is concerned with behaviors and benefits. BDD attempts to express the behavior of a system in plain, human language and justify the benefits that the behavior provides. TDD is written in code and does not attempt to justify the value of any part of the system. The loftiest vision of BDD is a methodology by which all features are specified and justified in clear human language, which can automatically be executed to verify that the system works as expected. Some other names sometimes used to refer to this lofty vision of BDD are Specification by Example and executable documentation.
If we look at our AddressValidator example, mentioned previously, we have an example of TDD. If we were to create a BDD-oriented specification for it, we might start with something like this:

    Feature: Address Validation
      As a postal customer,
      In order to ensure my packages are delivered,
      I want to validate addresses

      Scenario: Invalid address
        Given I enter "Seoul, USA"
        When I validate the address
        Then I should see the error message, "City and Country do not match"
This is the beginning of a Cucumber example. We won't go into Cucumber any further in this book, but it should be noted that RSpec is a closely related tool, and many of the developers who contribute to RSpec also contribute to Cucumber.
In the real world, the dividing line between TDD and BDD is not that clear. For most practical purposes, the only difference between TDD and BDD is in the style of the syntax used for expressions.
TDD leans more toward programmatic syntax, such as:

    assert_equal(5, x)

BDD, however, would use a syntax closer to human language, like RSpec's:

    expect(x).to eq(5)
For the purposes of this book, we will strike a practical balance between TDD and BDD. Just by using RSpec, we are getting a hefty dose of BDD in our syntax. But we can still choose to structure our tests to follow the structure of our code (for example, having a single test for every function), which amounts to writing unit tests. We can also choose to structure our tests according to high-level features, which is closer to BDD and to integration tests. In fact, we need to do a bit of both of these kinds of tests, as well as some tests that fall in between, which are sometimes called functional tests.
In this chapter, we have introduced the potential benefits and costs of automated testing, with a focus on the concept of testability, which we defined as the degree to which a system can be verified to work as expected. We learned about the importance of maintaining a positive balance between the benefits of testing and the cost of creating and maintaining tests. We then wrote a couple of simple unit tests and ran them with RSpec. Finally, we looked at different approaches to automated testing, from unit tests to TDD and BDD.