In the broad landscape of data analysis, Excel is a trusted tool that simplifies organizing, calculating, and presenting information. Its intuitive interface and widespread usage have cemented its position as a staple in the business world. However, as the volume and complexity of data continue to grow, Excel’s capabilities may start to feel constrained. It is precisely at this point that the worlds of Excel, R, and Python converge. Extending Excel with R and Python shows how these programming languages work alongside Excel, expanding what it can do and helping you handle data challenges more easily. In this book, we will delve into how to integrate Excel with R and Python so that you can extract valuable insights, automate processes, and get more out of your data analysis.
Microsoft Excel came to market in 1985 and has remained a popular spreadsheet software choice. It succeeded Multiplan, Microsoft’s earlier spreadsheet program. Microsoft Excel and databases in general share some similarities in terms of organizing and managing data, although they serve different purposes. Excel is a spreadsheet program that allows users to store and manipulate data in a tabular format. It consists of rows and columns, where each cell can contain text, numbers, or formulas. Similarly, a database is a structured collection of data stored in tables, consisting of rows and columns.
Both Excel and databases provide a way to store and retrieve data. In Excel, you can enter data, perform calculations, and create charts and graphs. Similarly, databases store and manage large amounts of structured data and enable querying, sorting, and filtering. Excel and databases also support the concept of relationships. In Excel, you can link cells or ranges across different sheets, creating connections between data. Databases use relationships to link tables based on common fields, allowing you to retrieve related data from multiple tables.
This chapter aims to familiarize you with reading Excel files into the R environment and performing some manipulation on them. Specifically, in this chapter, we’re going to cover the following main topics:
At the time of writing, we are using the following:
For this chapter, you will need to install the following packages:
readxl
openxlsx
xlsx
To run the Python code in this chapter, we will be using the following:
pandas
openpyxl
The iris.xlsx Excel file, available in this book’s GitHub repository
While setting up a Python environment is outside the scope of this book, this is easy to do. The necessary packages can be installed by running the following commands:
python -m pip install pandas==2.0.1
python -m pip install openpyxl==3.1.2
Note that these commands have to be run from a terminal and not from within a Python script.
This book’s GitHub repository also contains a requirements.txt file that you can use to install all dependencies. You can do this by running the following command:
python -m pip install -r requirements.txt
This command needs to be run in the folder where requirements.txt resides, or a full path to the requirements.txt file has to be included. It installs all the packages that will be used in this chapter so that you don’t have to install them one by one. It also helps ensure that the whole dependency tree (including the dependencies of the dependencies) will be the same as what this book’s authors have used.
Alternatively, when using Jupyter Notebooks, you can use the following magic commands:
%pip install pandas==2.0.1
%pip install openpyxl==3.1.2
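After installing, it can be useful to confirm which versions are actually present in the active environment. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and is not part of this book’s code.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the versions of the two packages used in this chapter
for name in ("pandas", "openpyxl"):
    print(name, installed_version(name))
```

If either line prints None, the corresponding package is not installed in the environment that is running the code.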
All of the code examples in this book are available in a GitHub repository at https://github.com/PacktPublishing/Extending-Excel-with-Python-and-R. Each chapter has its own folder, with the current one being Chapter01.
Note
Technical requirements for Python throughout the book are compiled in the requirements.txt file, available in the GitHub repository at https://github.com/PacktPublishing/Extending-Excel-with-Python-and-R/blob/main/requirements.txt. Installing these dependencies will streamline your coding experience and ensure smooth progression through the book. Be sure to install them all before diving into the exercises.
There are several packages available both on CRAN and on GitHub that allow for reading and manipulation of Excel files. In this section, we are specifically going to focus on three packages for reading Excel files: readxl, openxlsx, and xlsx. These three packages all have their own functions to read Excel files. These functions are as follows:
readxl::read_excel()
openxlsx::read.xlsx()
xlsx::read.xlsx()
Each function has a set of parameters and conventions to follow. Since readxl is part of the tidyverse collection of packages, it follows its conventions and returns a tibble object upon reading the file. If you do not know what a tibble is, it is a modern version of R’s data.frame, a sort of spreadsheet in the R environment. It is the building block of most analyses. Moving on to openxlsx and xlsx, they both return a base R data.frame object, with the latter also able to return a list object. This matters for manipulating an actual Excel file because, to manipulate something in R, the data must first be in the R environment; you cannot work with the file’s contents until the data is read in. These packages have different functions for manipulating Excel files or reading data in ways that allow for further analysis and/or manipulation. It is important to note that xlsx requires Java to be installed.
As we transition from our exploration of R packages for Excel manipulation, we’ll turn our attention to the crucial task of effectively reading Excel files into R, thereby unlocking even more possibilities for data analysis and manipulation.
In this section, we are going to read data from Excel with a few different R libraries. We need to do this before we can even consider performing any type of manipulation or analysis on the data contained in the sheets of the Excel files.
As mentioned in the Technical requirements section, we are going to be using the readxl, openxlsx, and xlsx packages to read data into R.
In this section, we are going to install the necessary libraries, if you do not yet have them, and load them. We are going to use the openxlsx, xlsx, and readxl libraries. To install and load them, run the following code block:
pkgs <- c("openxlsx", "xlsx", "readxl")
install.packages(pkgs, dependencies = TRUE)
lapply(pkgs, library, character.only = TRUE)
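The lapply() function in R applies a function to each element of a list, vector, or data.frame and returns the results as a list. Its two main arguments are X and FUN, where X is the object to iterate over and FUN is the function that is applied to each element of X. Here, it calls library() on each package name, with character.only = TRUE passed along so that the names are treated as character strings.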
Now that the libraries have been installed, we can get to work. To do this, we are going to read a spreadsheet built from the Iris dataset that is built into base R. We are going to read the file with three different libraries, and then we are going to create a custom function to work with the readxl library that will read all the sheets of an Excel file. We will call this the read_excel_sheets() function.
Let’s start reading the files. The first library we will use to open an Excel file is openxlsx. To create the Excel file we are working with, you can run the script called ch1_create_iris_dataset.R in the chapter1 folder of this book’s GitHub repository. Refer to the following screenshot to see how to read the file into R.
You will notice a variable called f_pat. This is the path to where the Iris dataset was saved as an Excel file – for example, C:/User/UserName/Documents/iris_data.xlsx:
Figure 1.1 – Using the openxlsx package to read the Excel file
The preceding screenshot shows how to read an Excel file. This example assumes that you have used the ch1_create_iris_dataset.R file to create the example Excel file. In reality, you can read in any Excel file that you would like or need.
Now, we will perform the same type of operation, but this time with the xlsx library. Refer to the following screenshot, which uses the same methodology as with the openxlsx package:
Figure 1.2 – Using the xlsx library and the read.xlsx() function to open the Excel file we’ve created
Finally, we will use the readxl library, which is part of the tidyverse:
Figure 1.3 – Using the readxl library and the read_excel() function to read the Excel file into memory
In this section, we learned how to read in an Excel file with a few different packages. While these packages can do more than simply read in an Excel file, that is what we needed to focus on in this section. You should now be familiar with how to use the readxl::read_excel(), xlsx::read.xlsx(), and openxlsx::read.xlsx() functions.
Building upon our expertise in reading Excel files into R, we’ll now embark on the next phase of our journey: unraveling the secrets of efficiently extracting data from multiple sheets within an Excel file.
In Excel, we often encounter workbooks that have multiple sheets in them. These could be stats for different months of the year, table data that follows a specific format month over month, or some other period. The point is that we may want to read all the sheets in a file for one reason or another, and rather than calling the read function from a particular package separately for each sheet, we can use R to loop through the sheets with purrr.
Let’s build a customized function. To do this, we are going to load the readxl library. If you already have it loaded, this is not necessary. If it is installed and you do not wish to load the library into memory, you can instead call the excel_sheets() function as readxl::excel_sheets():
Figure 1.4 – Creating a function to read all the sheets in an Excel file at once – read_excel_sheets()
The new code can be broken down as follows:
read_excel_sheets <- function(filename, single_tbl) {
This line defines a function called read_excel_sheets that takes two arguments: filename (the name of the Excel file to be read) and single_tbl (a logical value indicating whether the function should return a single table or a list of tables).
Next, we have the following line:
sheets <- readxl::excel_sheets(filename)
This line uses the readxl package to extract the names of all the sheets in the Excel file specified by filename. The sheet names are stored in the sheets variable.
Here’s the next line:
if (single_tbl) {
This line starts an if statement that checks the value of the single_tbl argument.
Now, we have the following:
x <- purrr::map_df(sheets, read_excel, path = filename)
If single_tbl is TRUE, this line uses the purrr package’s map_df function to iterate over each sheet name in sheets and read the corresponding sheet using the read_excel function from the readxl package. The resulting data frames are combined into a single table, which is assigned to the x variable.
Now, we have the following line:
} else {
This line indicates the start of the else block of the if statement. If single_tbl is FALSE, the code in this block will be executed.
Here’s the next line:
x <- purrr::map(sheets, ~ readxl::read_excel(filename, sheet = .x))
In this line, the purrr package’s map function is used to iterate over each sheet name in sheets. For each sheet, the read_excel function from the readxl package is called to read the corresponding sheet from the Excel file specified by filename. The resulting data frames are stored in a list assigned to the x variable.
Now, we have the following:
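x <- purrr::set_names(x, sheets)
This line uses the set_names function from the purrr package to set the names of the elements in the x list to the sheet names in sheets. The result is assigned back to x because set_names returns a new object rather than modifying x in place.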
Finally, we have the following line:
x
This line returns the value of x from the function, which will be either a single table (data.frame) if single_tbl is TRUE, or a list of tables (data.frame) if single_tbl is FALSE.
In summary, the read_excel_sheets function takes an Excel filename and a logical value indicating whether to return a single table or a list of tables. It uses the readxl package to extract the sheet names from the Excel file, and then reads the corresponding sheets either into a single table (if single_tbl is TRUE) or into a list of tables (if single_tbl is FALSE). The resulting data is returned as the output of the function. To see how this works, let’s look at the following example.
We have a spreadsheet that has four tabs in it – one for each species in the famous Iris dataset and then one sheet called iris, which is the full dataset.
As shown in Figure 1.5, the read_excel_sheets() function has read all four sheets of the Excel file. We can also see that the function has imported the sheets as a list object and has named each item in the list after the name of the corresponding tab in the Excel file. It is also important to note that, when the sheets are combined into a single table (single_tbl = TRUE), they should all have the same column names and structure for this to work:
Figure 1.5 – Excel file read by read_excel_sheets()
In this section, we learned how to write a function that will read all of the sheets in any Excel file. This function will also return them as a named list, where the names are the names of the tabs in the file itself.
Now that we have learned how to read Excel sheets in R, in the next section, we will cover Python, where we will revisit the same concepts but from the perspective of the Python language.
In this section, we will explore how to read Excel spreadsheets using Python. One of the key aspects of working with Excel files in Python is having the right set of packages that provide the necessary functionality. In this section, we will discuss some commonly used Python packages for Excel manipulation and highlight their advantages and considerations.
When it comes to interacting with Excel files in Python, several packages offer a range of features and capabilities. These packages allow you to extract data from Excel files, manipulate the data, and write it back to Excel files. Let’s take a look at some popular Python packages for Excel manipulation.
pandas is a powerful data manipulation library that can read Excel files using the read_excel function. The advantage of using pandas is that it provides a DataFrame object, which allows you to manipulate the data in a tabular form. This makes it easy to perform data analysis and manipulation. pandas excels in handling large datasets efficiently and provides flexible options for data filtering, transformation, and aggregation.
openpyxl is a widely used library specifically designed for working with Excel files. It provides a comprehensive set of features for reading and writing Excel spreadsheets, including support for various Excel file formats and compatibility with different versions of Excel. In addition, openpyxl allows fine-grained control over the structure and content of Excel files, enabling tasks such as accessing individual cells, creating new worksheets, and applying formatting.
xlrd and xlwt are older libraries that are still in use for reading and writing Excel files, particularly with legacy formats such as .xls. xlrd enables reading data from Excel files, while xlwt facilitates writing data to Excel files. These libraries are lightweight and straightforward to use, but they lack some of the advanced features provided by pandas and openpyxl.
When choosing a Python package for Excel manipulation, it’s essential to consider the specific requirements of your project. Here are a few factors to keep in mind:
Packages such as pandas, which have optimized algorithms, can offer significant performance advantages.
Some packages, such as pandas, have a more extensive range of functionality, but they may require additional time and effort to master.
Each package offers unique features and has its strengths and weaknesses, allowing you to read Excel spreadsheets effectively in Python. For example, if you need to read and manipulate large amounts of data, pandas may be the better choice. However, if you need fine-grained control over the Excel file, openpyxl will likely fit your needs better.
Consider the specific requirements of your project, such as data size, functionality, and compatibility, to choose the most suitable package for your needs. In the following sections, we will delve deeper into how to utilize these packages to read and extract data from Excel files using Python.
When working with Excel files in Python, it’s common to need to open a specific sheet and read the data into Python for further analysis. This can be achieved using popular libraries such as pandas and openpyxl, as discussed in the previous section.
You can most likely use other Python and package versions, but the code in this section has not been tested with anything other than what we’ve stated here.
pandas is a powerful data manipulation library that simplifies the process of working with structured data, including Excel spreadsheets. To read an Excel sheet using pandas, you can use the read_excel function. Let’s consider an example of using the iris_data.xlsx file with a sheet named setosa:
import pandas as pd

# Read the Excel file
df = pd.read_excel('iris_data.xlsx', sheet_name='setosa')
# Display the first few rows of the DataFrame
print(df.head())
You will need to run this code with the Python working directory set to the location of the Excel file, or provide the full path to the file in the read_excel() command:
Figure 1.6 – Using the pandas package to read the Excel file
In the preceding code snippet, we imported the pandas library and utilized the read_excel function to read the setosa sheet from the iris_data.xlsx file. The resulting data is stored in a pandas DataFrame, which provides a tabular representation of the data. By calling head() on the DataFrame, we displayed the first few rows of the data, giving us a quick preview.
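The working-directory requirement can be avoided by building the path explicitly. The sketch below is illustrative only: it first writes a tiny two-sheet workbook to a temporary folder (the sample rows and the sample_path name are assumptions made for the example, not the book’s data) and then reads one sheet back by its full path.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Build a small stand-in workbook so the example is self-contained
sample_path = Path(tempfile.mkdtemp()) / "iris_data.xlsx"
setosa = pd.DataFrame({"Sepal.Length": [5.1, 4.9], "Species": ["setosa", "setosa"]})
versicolor = pd.DataFrame({"Sepal.Length": [7.0, 6.4], "Species": ["versicolor", "versicolor"]})
with pd.ExcelWriter(sample_path, engine="openpyxl") as writer:
    setosa.to_excel(writer, sheet_name="setosa", index=False)
    versicolor.to_excel(writer, sheet_name="versicolor", index=False)

# Read a single sheet by full path instead of relying on the working directory
df = pd.read_excel(sample_path, sheet_name="setosa")
print(df.head())
```

With a real file, only the last two lines are needed, with sample_path replaced by the location of your workbook.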
openpyxl is a powerful library for working with Excel files, offering more granular control over individual cells and sheets. To open an Excel sheet and access its data using openpyxl, we can utilize the load_workbook function. Please note that openpyxl cannot handle .xls files, only the more modern .xlsx and .xlsm versions.
Let’s consider an example of using the iris_data.xlsx file with a sheet named versicolor:
import openpyxl
import pandas as pd

# Load the workbook
wb = openpyxl.load_workbook('iris_data.xlsx')
# Select the sheet
sheet = wb['versicolor']
# Extract the values (including header)
sheet_data_raw = sheet.values
# Separate the headers into a variable
header = next(sheet_data_raw)[0:]
# Create a DataFrame based on the second and subsequent lines of data with the header as column names
sheet_data = pd.DataFrame(sheet_data_raw, columns=header)
print(sheet_data.head())
The preceding code results in the following output:
Figure 1.7 – Using the openpyxl package to read the Excel file
In this code snippet, we import the openpyxl library and load the workbook by passing the iris_data.xlsx filename to its load_workbook function. Next, we select the desired sheet by accessing it using its name – in this case, this is versicolor. Once we’ve done this, we read the raw data using the values property of the loaded sheet object. This is a generator and can be consumed via a for loop or by converting it into a list or a DataFrame, for example. In this example, we have converted it into a pandas DataFrame because it is the format that is the most comfortable to work with later.
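Because values is a generator, it can be consumed only once, which is why the header is pulled off with next() before the rest is handed to pandas. Here is a minimal sketch of that behavior, using an in-memory workbook with made-up rows rather than the book’s file:

```python
import openpyxl
import pandas as pd

# Create an in-memory workbook with a header row and two data rows (illustrative data)
wb = openpyxl.Workbook()
sheet = wb.active
sheet.title = "versicolor"
sheet.append(["Sepal.Length", "Species"])
sheet.append([7.0, "versicolor"])
sheet.append([6.4, "versicolor"])

rows = sheet.values                              # generator yielding one tuple per row
header = next(rows)                              # consumes the first row (the column names)
sheet_data = pd.DataFrame(rows, columns=header)  # consumes the remaining rows

# The generator is now exhausted, so a second pass yields nothing
leftover = list(rows)
print(sheet_data)
print(leftover)
```

If the data is needed more than once, read sheet.values again or keep the resulting DataFrame rather than the generator.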
Both pandas and openpyxl offer valuable features for working with Excel files in Python. While pandas simplifies data manipulation with its DataFrame structure, openpyxl provides more fine-grained control over individual cells and sheets. Depending on your specific requirements, you can choose the library that best suits your needs.
By mastering the techniques of opening Excel sheets and reading data into Python, you will be able to extract valuable insights from your Excel data, perform various data transformations, and prepare it for further analysis or visualization. These skills are essential for anyone seeking to leverage the power of Python and Excel in their data-driven workflows.
In many Excel files, it’s common to have multiple sheets containing different sets of data. Being able to read in multiple sheets and consolidate the data into a single data structure can be highly valuable for analysis and processing. In this section, we will explore how to achieve this using the openpyxl library and a custom function.
When working with complex Excel files, it’s not uncommon to encounter scenarios where related data is spread across different sheets. For example, you may have one sheet for sales data, another for customer information, and yet another for product inventory. By reading in multiple sheets and consolidating the data, you can gain a holistic view and perform a comprehensive analysis.
Let’s start by examining the basic steps involved in reading in multiple sheets:
We load the Excel workbook using the load_workbook function provided by openpyxl.
We get the names of the sheets in the workbook using the sheetnames attribute. This allows us to identify the sheets we want to read.
We read the data from each sheet. openpyxl provides methods such as iter_rows or iter_cols to traverse the cells of each sheet and retrieve the desired data.
We store the data in a consolidated structure such as a pandas DataFrame or a Python list. As we read the data from each sheet, we concatenate or merge it into the consolidated data structure.
openpyxl is a powerful library that allows us to interact with Excel files using Python. It provides a wide range of functionalities, including accessing and manipulating multiple sheets. Before we dive into the details, let’s take a moment to understand why openpyxl is a popular choice for this task.
One of the primary advantages of openpyxl is its ability to handle various Excel file formats, such as .xlsx and .xlsm. This flexibility allows us to work with different versions of Excel files without compatibility issues. Additionally, openpyxl provides a straightforward and intuitive interface to access sheet data, making it easier for us to retrieve the desired information.
To begin reading in multiple sheets, we need to load the Excel workbook using the load_workbook function provided by openpyxl. This function takes the file path as input and returns a workbook object that represents the entire Excel file.
Once we have loaded the workbook, we can retrieve the names of all the sheets using the sheetnames attribute. This gives us a list of sheet names present in the Excel file. We can then iterate over these sheet names to read the data from each sheet individually.
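The loading and sheet-listing steps might look like the following sketch. It saves a throwaway workbook to a temporary folder first, so the file name and sheet layout are placeholders and not the book’s data.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook

# Write a throwaway workbook with three (empty) sheets
path = Path(tempfile.mkdtemp()) / "example.xlsx"
wb_out = Workbook()
wb_out.active.title = "setosa"
wb_out.create_sheet("versicolor")
wb_out.create_sheet("virginica")
wb_out.save(str(path))

# Load it back and list the sheet names in workbook order
workbook = load_workbook(str(path))
sheet_names = workbook.sheetnames
print(sheet_names)

# Each name can be used to fetch the corresponding worksheet object
for name in sheet_names:
    worksheet = workbook[name]
    print(name, worksheet.title)
```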
openpyxl provides various methods to access the data within a sheet. Two commonly used methods are iter_rows and iter_cols. These methods allow us to iterate over the rows or columns of a sheet and retrieve the cell values.
Let’s have a look at how iter_rows can be used:
# Select the sheet by name (wb is the workbook loaded earlier)
sheet = wb['versicolor']
# Iterate over rows and print raw values
for row in sheet.iter_rows(min_row=1, max_row=5, values_only=True):
    print(row)
Similarly, iter_cols can be used like this:
# Iterate over columns and print raw values
for column in sheet.iter_cols(min_col=1, max_col=5, values_only=True):
    print(column)
When using iter_rows or iter_cols, we can specify what is returned for each cell. With values_only=True, we get the values stored in the cells; without it, we get cell objects, which also carry any formatting applied to the cells, such as date formatting or number formatting.
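In openpyxl, the values_only flag controls whether iter_rows and iter_cols yield plain Python values or Cell objects. The short sketch below contrasts the two on a tiny in-memory sheet; the rows are invented for illustration.

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
sheet.append(["Sepal.Length", "Species"])
sheet.append([5.1, "setosa"])

# values_only=True yields tuples of plain Python values
plain_rows = list(sheet.iter_rows(min_row=1, max_row=2, values_only=True))

# Without values_only, each row is a tuple of Cell objects
cell_rows = list(sheet.iter_rows(min_row=1, max_row=2))
first_data_cell = cell_rows[1][0]

print(plain_rows)
# A Cell exposes its position, its value, and its number format
print(first_data_cell.coordinate, first_data_cell.value, first_data_cell.number_format)
```

Plain values are usually enough when the data is headed for a DataFrame; Cell objects are the better fit when formatting or cell positions matter.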
By iterating over the rows or columns of a sheet, we can retrieve the desired data and store it in a suitable data structure. One popular choice is to use pandas DataFrames, which provide a tabular representation of the data and offer convenient methods for data manipulation and analysis.
An alternative solution is using the values attribute of the sheet object. This provides a generator that yields the values of the sheet row by row (much like iter_rows does with values_only=True). While generators cannot be indexed directly to access the data, they can be used in for loops to iterate over each row. The pandas library’s DataFrame constructor also allows direct conversion from a suitable generator object to a DataFrame.
As we read the data from each sheet, we can store it in a list or dictionary, depending on our needs. Once we have retrieved the data from all the sheets, we can combine it into a single consolidated data structure. This step is crucial for further analysis and processing.
To combine the data, we can use pandas DataFrames. By creating an individual DataFrame for each sheet’s data and then concatenating or merging them into a single DataFrame, we can obtain a comprehensive dataset that includes all the information from multiple sheets.
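The concatenation step can be sketched with pandas alone. In the example below, the two small DataFrames stand in for data read from two sheets, and the added sheet column is an assumption made for illustration so that each row’s origin stays visible after combining.

```python
import pandas as pd

# One DataFrame per sheet, keyed by sheet name (illustrative values)
frames = {
    "setosa": pd.DataFrame({"Sepal.Length": [5.1, 4.9]}),
    "versicolor": pd.DataFrame({"Sepal.Length": [7.0, 6.4]}),
}

# Tag each frame with its source sheet, then stack them with a fresh index
tagged = [df.assign(sheet=name) for name, df in frames.items()]
combined = pd.concat(tagged, ignore_index=True)
print(combined)
```

Passing ignore_index=True discards the per-sheet row numbers so that the combined DataFrame has one continuous index.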
To simplify the process of reading in multiple sheets and consolidating the data, we can create custom functions tailored to our specific requirements. These functions encapsulate the necessary steps and allow us to reuse the code efficiently.
In our example, we define a function called read_multiple_sheets that takes the file path as input. Inside the function, we load the workbook using load_workbook and iterate over the sheet names retrieved with the sheetnames attribute.
For each sheet, we access it using the workbook object and retrieve the data using the custom read_single_sheet function. We then store the retrieved data in a list. Finally, we combine the data from all the sheets into a single pandas DataFrame using the appropriate concatenation method from pandas.
By using these custom functions, we can easily read in multiple sheets from an Excel file and obtain a consolidated dataset that’s ready for analysis. The function provides a reusable and efficient solution, saving us time and effort in dealing with complex Excel files.
The provided example is a starting point that you can customize based on your specific requirements. Here are a few considerations for customizing the code:
For example, you can read only selected columns by using the iter_cols method instead of the values attribute and using a filtered list in a for loop, or by filtering the resulting pandas DataFrame object(s).
Remember, the goal is to tailor the code to suit your unique needs and ensure it aligns with your data processing requirements.
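As one illustration of such a customization, the sketch below keeps only selected columns by iterating with iter_cols. The sample rows and the wanted set of column names are hypothetical; with a real workbook, the sheet would come from load_workbook instead.

```python
import pandas as pd
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
sheet.append(["Sepal.Length", "Sepal.Width", "Species"])
sheet.append([5.1, 3.5, "setosa"])
sheet.append([4.9, 3.0, "setosa"])

wanted = {"Sepal.Length", "Species"}  # hypothetical selection of columns to keep

# iter_cols yields one tuple per column; its first element is the header value
selected = {}
for column in sheet.iter_cols(values_only=True):
    header, *values = column
    if header in wanted:
        selected[header] = values

df = pd.DataFrame(selected)
print(df)
```

Filtering while reading avoids building columns that are discarded later, whereas filtering the finished DataFrame is usually simpler to write.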
By leveraging the power of openpyxl and creating custom functions, you can efficiently read in multiple sheets from Excel files, consolidate the data, and prepare it for further analysis. This capability enables you to unlock valuable insights from complex Excel files and leverage the full potential of your data.
Now, let’s dive into an example that demonstrates this process:
from openpyxl import load_workbook
import pandas as pd


def read_single_sheet(workbook, sheet_name):
    # Load the sheet from the workbook
    sheet = workbook[sheet_name]
    # Read out the raw data including headers
    sheet_data_raw = sheet.values
    # Separate the headers into a variable
    columns = next(sheet_data_raw)[0:]
    # Create a DataFrame based on the second and subsequent lines of data with the header as column names and return it
    return pd.DataFrame(sheet_data_raw, columns=columns)


def read_multiple_sheets(file_path):
    # Load the workbook
    workbook = load_workbook(file_path)
    # Get a list of all sheet names in the workbook
    sheet_names = workbook.sheetnames
    # Cycle through the sheet names, load the data for each and concatenate them into a single DataFrame
    return pd.concat([read_single_sheet(workbook=workbook, sheet_name=sheet_name) for sheet_name in sheet_names], ignore_index=True)


# Define the file path
file_path = 'iris_data.xlsx'  # adjust the path as needed
# Read the data from multiple sheets
consolidated_data = read_multiple_sheets(file_path)
# Display the consolidated data
print(consolidated_data.head())
Let’s have a look at the results:
Figure 1.8 – Using the openpyxl package to read in the Excel file
In the preceding code, we define two functions:
read_single_sheet: This reads the data from a single sheet into a pandas DataFrame
read_multiple_sheets: This reads and concatenates the data from all sheets in the workbook
Within the read_multiple_sheets function, we load the workbook using load_workbook and iterate over the sheet names. For each sheet, we retrieve the data using the read_single_sheet helper function, which reads the data from a sheet and creates a pandas DataFrame for that sheet’s data, with the header row used as column names. Finally, we use pd.concat to combine all the DataFrames into a single consolidated DataFrame.
By utilizing these custom functions, we can easily read in multiple sheets from an Excel file and obtain a consolidated dataset. This allows us to perform various data manipulations, analyses, or visualizations on the combined data.
Understanding how to handle multiple sheets efficiently enhances our ability to work with complex Excel files and extract valuable insights from diverse datasets.
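For comparison, pandas can also read every sheet in one call: passing sheet_name=None to read_excel returns a dictionary of DataFrames keyed by sheet name. This is an alternative to the openpyxl-based functions, not the approach used in this chapter, and the sample workbook below is an assumption made so that the example is self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Write a small two-sheet workbook to a temporary folder (illustrative data)
path = Path(tempfile.mkdtemp()) / "iris_sample.xlsx"
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"Sepal.Length": [5.1, 4.9]}).to_excel(writer, sheet_name="setosa", index=False)
    pd.DataFrame({"Sepal.Length": [7.0, 6.4]}).to_excel(writer, sheet_name="versicolor", index=False)

# sheet_name=None reads all sheets into a dict: {sheet name: DataFrame}
all_sheets = pd.read_excel(path, sheet_name=None)

# Stack the per-sheet DataFrames into one consolidated DataFrame
consolidated = pd.concat(list(all_sheets.values()), ignore_index=True)
print(list(all_sheets))
print(consolidated)
```

This shortcut trades the cell-level control of openpyxl for brevity, so it fits best when every sheet is a plain table with a header row.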
In this chapter, we explored the process of importing data from Excel spreadsheets into our programming environments. For R users, we delved into the functionalities of libraries such as readxl, xlsx, and openxlsx, providing efficient solutions for extracting and manipulating data. We also introduced a custom function, read_excel_sheets, to streamline the process of extracting data from multiple sheets within Excel files. On the Python side, we discussed the essential pandas and openpyxl packages for Excel manipulation, demonstrating their features through practical examples. At this point, you should have a solid understanding of these tools and their capabilities for efficient Excel manipulation and data analysis.
In the next chapter, we will learn how to write the results to Excel.