Jupyter is a tool that allows data scientists to record their complete analysis process, much in the same way other scientists use a lab notebook to record tests, progress, results, and conclusions.
The Jupyter product was originally developed as part of the IPython project. The IPython project was used to provide interactive online access to Python. Over time it became useful to interact with other data analysis tools, such as R, in the same manner. With this split from Python, the tool grew into its current manifestation of Jupyter. IPython is still an active tool that's available for use. The name Jupyter itself is derived from the combination of Julia, Python, and R.
Jupyter is available as a web application from a number of places. It can also be used locally over a wide variety of installations. In this book, we will explore using Jupyter on a Mac and a Windows PC, and over the Internet with other providers.
In this chapter, we will cover the following topics:
First look at Jupyter
Installing Jupyter on Windows
Installing Jupyter on Mac
Notebook structure
Notebook workflow
Basic notebook operations
Security in Jupyter
Configuration options for Jupyter
Here is a sample opening page when using Jupyter (this screenshot is on a Windows machine):

You should get yourself acquainted with the environment. The Jupyter user interface has a number of components:
Product title, Jupyter, in the top left (as expected). The logo and the title name are clickable and will return you to the Jupyter Notebook home page.
There are three tabs displayed: Files, Running, and Clusters:
The Files tab shows the list of files in the current directory of the page (described later on in this section).
The Running tab presents another screen of the currently running processes and notebooks. The drop-down lists for Terminals and Notebooks are populated with their running members:
The Clusters tab presents another screen to display the list of clusters available. This topic is covered in a later chapter:
In the top right corner of the screen are three buttons: Upload, New (menu), and a Refresh button.
The Upload button is used to add files to the notebook space. You may also just drag and drop as you would when handling files. Similarly, you can drag and drop notebooks into specific folders as well.
The menu with New at the top presents a further menu of Text File, Folder, Terminals Unavailable, Notebooks, and Python 2:
The Text File option is used to add a text file to the current directory. Jupyter will open a new browser window for you running a text editor. The text entered is automatically saved and will be displayed in your notebook's Files display:
The Folder option creates a new folder with the name Untitled Folder. Remember, all of the file/folder names are editable:
The Terminals Unavailable option is disabled for Windows. On a Mac, the option allows you to start a terminal session in the browser.
The Notebooks option will be activated when additional notebooks are available in your environment.
The Python 2 option is used to begin a Python 2 session interactively in your notebook. The interface looks like the following screenshot. You have full file editing capabilities for your script, including saving as a new file. You also have a complete working IDE for your Python script:
The refresh button is used to update the display. It's not really necessary as the display is reactive to any changes in the underlying file structure.
At the top of the Files tab's item list is a checkbox, a drop-down menu, and a home button:
The checkbox is used to toggle all the checkboxes in the Items list
The drop-down menu presents a list of the choices available, Folders, All Notebooks, Running, and Files, as shown in the following screenshot:
The Folders selection will select all the folders in the display and present a count of the folders in the small box
The All Notebooks selection will change the count to the number of notebooks and provide you with three options:
Duplicate the current notebook
Shut down the current notebook
Trash the current notebook
You can see them in the following screenshot:
The Running selection will select any running scripts and update the count to the number selected
The Files selection will select all of the files in the notebook display and update the count accordingly
The home button brings you back to the home screen of the notebook.
On the left-hand side of every item is a checkbox, an icon, and the item's name:

The checkbox is used to build a set of files to operate upon.
The icon is indicative of the type of item. In this case, all of the items are folders.
The name of the item corresponds to the name of the object. In this case, the filenames are as used on the disk.
Jupyter requires Python to be installed (it is based on the Python language). There are a couple of tools that will automate the installation of Jupyter (and optionally Python) from a GUI. Here, we show how to install using Anaconda, which is a Python tool for distributing software. You first have to install Anaconda. It is available for Windows and Mac environments. Download the executable from https://www.continuum.io/ (the company that produces Anaconda) and run it to install Anaconda. The software provides a regular installation setup process, as shown in the following screenshot:

The installation process goes through the regular steps of making you agree to the distribution rights license:

The standard Windows installation allows you to decide whether all users on the machine can run the new software or not. If you are sharing a machine with different levels of users, then you can decide the appropriate action:

After clicking on Next, it will ask for a destination for the software to reside (I almost always keep the default paths):

And, most importantly, make sure that the Python installed with Anaconda is the Python you use going forward (by placing it in the execution path). Remember, Anaconda is itself a Python tool, so this is important.
Once Anaconda is installed, you need to run a command-line instruction to install Jupyter. The command is as follows:
conda install jupyter
This will invoke a process to download all the necessary components for Jupyter onto your PC. Your output should look something like this:
C:\Users\Dan>conda install jupyter
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ....
Solving package specifications: .........
# packages in environment at C:\Users\Dan\Anaconda2:
#
jupyter 1.0.0 py27_2
Note
Additional lines will be present for an install. I have abbreviated the output.
You now have Jupyter installed on your machine. You can start the process using the following command:
C:\Users\Dan>jupyter notebook
This command starts a Jupyter Notebook server on your machine. Once the server is started, a browser instance will be opened at the starting point of the notebook. You should see logging statements similar to the following on your machine as the server starts:
[I 16:21:59.144 NotebookApp] Writing notebook server cookie secret to C:\Users\Dan\AppData\Roaming\jupyter\runtime\notebook_cookie_secret
[I 16:21:59.846 NotebookApp] Serving notebooks from local directory: C:\Users\Dan
[I 16:21:59.846 NotebookApp] 0 active kernels
[I 16:21:59.846 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 16:21:59.862 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
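If you want to confirm from a script that the server in the log is accepting connections, a minimal sketch along the following lines may help. It only checks that something is listening on the host and port taken from the URL in the log (http://localhost:8888/); the function name server_is_listening is made up for this example, and the check says nothing about whether the listener is actually Jupyter.

```python
import socket
from urllib.parse import urlparse


def server_is_listening(url="http://localhost:8888/", timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds.

    The default URL is the one printed in the server log; adjust it if
    your server reports a different port.
    """
    parts = urlparse(url)
    host = parts.hostname
    port = parts.port or 80  # fall back to the default HTTP port
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Notebook server reachable:", server_is_listening())
```

Run this in a second command-line window while the server is up, and again after stopping the server, to see the result change.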
Once Jupyter is running, you will notice a running icon for Jupyter (two inverted crescents) at the bottom of your screen:

Note, the last line of the log is the instruction you must use to stop the server (press Ctrl + C in the command-line window where the server is running).
If you press Ctrl + C in that window, the Jupyter server will shut down gracefully:
[W 17:26:36.688 NotebookApp] 404 GET /favicon.ico (::1) 62.00ms referer=None
[W 17:26:36.750 NotebookApp] 404 GET /favicon.ico (::1) 0.00ms referer=None
[I 17:28:24.891 NotebookApp] Interrupted...
[I 17:28:24.891 NotebookApp] Shutting down kernels
You will notice that the Anaconda package has been installed on your application menu for further use:

On Mac, you can use the same Anaconda GUI (for Mac) as described in the previous section. You may also use the command-line tools available for Linux on your Mac.
You must first install Anaconda. Download the latest version and execute the embedded shell script to install.
Installing Jupyter on Mac is done through the command line using the conda install command:
bmac:~ dtoomey$ conda install jupyter
Fetching package metadata: ....
Solving package specifications: ....................................
Package plan for installation in environment /Users/dtoomey/anaconda:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    mistune-0.7.2              |           py27_1         178 KB
    setuptools-20.3            |           py27_0         453 KB
    conda-4.0.5                |           py27_0         185 KB
    pexpect-4.0.1              |           py27_0          63 KB
    traitlets-4.2.1            |           py27_0         108 KB
    ipython-4.1.2              |           py27_2         931 KB
    jupyter_core-4.1.0         |           py27_0          51 KB
    jupyter_client-4.2.2       |           py27_0          96 KB
    jupyter_console-4.1.1      |           py27_0          24 KB
    notebook-4.1.0             |           py27_2         4.4 MB
    qtconsole-4.2.1            |           py27_0         160 KB
    jupyter-1.0.0              |           py27_2           2 KB
    ------------------------------------------------------------
                                           Total:         6.6 MB

The following packages will be updated:

    conda:           3.19.3-py27_0 --> 4.0.5-py27_0
    ipython:         4.1.2-py27_0  --> 4.1.2-py27_2
    jupyter:         1.0.0-py27_1  --> 1.0.0-py27_2
    jupyter_client:  4.1.1-py27_0  --> 4.2.2-py27_0
    jupyter_console: 4.1.0-py27_0  --> 4.1.1-py27_0
    jupyter_core:    4.0.6-py27_0  --> 4.1.0-py27_0
    mistune:         0.7.1-py27_0  --> 0.7.2-py27_1
    notebook:        4.1.0-py27_0  --> 4.1.0-py27_2
    pexpect:         3.3-py27_0    --> 4.0.1-py27_0
    qtconsole:       4.1.1-py27_0  --> 4.2.1-py27_0
    setuptools:      20.1.1-py27_0 --> 20.3-py27_0
    traitlets:       4.1.0-py27_0  --> 4.2.1-py27_0

Proceed ([y]/n)? y

Fetching packages ...
mistune-0.7.2- 100% |#################| Time: 0:00:00   1.87 MB/s
setuptools-20. 100% |#################| Time: 0:00:00   3.53 MB/s
conda-4.0.5-py 100% |#################| Time: 0:00:00   2.47 MB/s
pexpect-4.0.1- 100% |#################| Time: 0:00:00   1.26 MB/s
traitlets-4.2. 100% |#################| Time: 0:00:00   1.71 MB/s
ipython-4.1.2- 100% |#################| Time: 0:00:00   1.77 MB/s
jupyter_core-4 100% |#################| Time: 0:00:00   2.34 MB/s
jupyter_client 100% |#################| Time: 0:00:00   1.58 MB/s
jupyter_consol 100% |#################| Time: 0:00:00   7.82 MB/s
notebook-4.1.0 100% |#################| Time: 0:00:00   4.75 MB/s
qtconsole-4.2. 100% |#################| Time: 0:00:00   1.37 MB/s
jupyter-1.0.0- 100% |#################| Time: 0:00:00   2.71 MB/s
Extracting packages ...
[      COMPLETE      ]|#############################################| 100%
Unlinking packages ...
[      COMPLETE      ]|#############################################| 100%
Linking packages ...
[      COMPLETE      ]|#############################################| 100%
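After either installation, you may want a quick check that Jupyter can be found. The following is a small sketch, assuming a Python 3 interpreter (the installation shown in this chapter is Python 2.7, where shutil.which and importlib.util.find_spec are not available). The function name jupyter_install_report is invented for this example.

```python
import importlib.util
import shutil


def jupyter_install_report():
    """Collect two simple signs that Jupyter is installed.

    jupyter_command:  full path of the `jupyter` executable found on the
                      execution path, or None if it is not found.
    notebook_package: True if a package named `notebook` can be imported
                      by the interpreter running this script.
    """
    return {
        "jupyter_command": shutil.which("jupyter"),
        "notebook_package": importlib.util.find_spec("notebook") is not None,
    }


report = jupyter_install_report()
print(report)
```

If jupyter_command is None even though the install succeeded, the most likely cause is that the Anaconda directory is not on your execution path.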
A Jupyter Notebook is fundamentally a JSON file with a number of annotations. The main parts of the Notebook are as follows:
Metadata: A data dictionary of definitions used to set up and display the notebook
Notebook format: Version numbers of the software used to create the notebook (the version number is used for backward compatibility)
List of cells: There are different types of cell for markdown (display) and code (to execute); the output of each code cell is stored along with that cell
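As a rough illustration of the structure listed above, the following sketch builds a minimal notebook as a Python dictionary and serializes it with the standard json module. The field names follow version 4 of the notebook format; the kernel name and the cell contents are made up for the example, and a real notebook typically carries more metadata than this.

```python
import json

# A minimal notebook: metadata, format version numbers, and a list of cells.
notebook = {
    "metadata": {
        # Illustrative kernel description; real notebooks record more detail.
        "kernelspec": {
            "display_name": "Python 2",
            "language": "python",
            "name": "python2",
        }
    },
    "nbformat": 4,        # major version of the notebook format
    "nbformat_minor": 0,  # minor version of the notebook format
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Sample analysis"],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not yet executed
            "metadata": {},
            "outputs": [],            # output is stored inside the code cell
            "source": ["print('hello')"],
        },
    ],
}

# Serialize to the JSON text that would be stored in an .ipynb file,
# then read it back to show that it is ordinary JSON.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Opening any .ipynb file in a text editor should show the same overall shape.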
The typical workflow is as follows:
Create a new notebook for a project or data analysis.
Add your analysis steps, coding, and output.
Surround your analysis with organizational and presentation markdown to communicate an entire story.
Interactive notebooks (that include widgets and display modules) would then be used by others by modifying parameters and data to note the effects of their changes. Your markdown would present the cases that a user may want to investigate and probable results.
In this section, we describe the different operations that you can perform on your Jupyter Notebook. Most of the operations are menu functions that will change your display accordingly.
Let's walk through the basic file operations.
From the Files tab, we see a list of files and folders in the current notebook/disk folder. If we select (check) one of the files, we see the top-left menu change:

We now have choices of Duplicate, Rename, and delete (the trashcan icon). Note the number of files selected, 1, is displayed in the box as well.
If we hit the Duplicate button, we get a confirmation prompt with the name of the file selected for duplication:

Cancel will close the dialog. Duplicate will create another copy of the file with an appended copy number, as in the following screenshot. The original filename has been used with the addition of -Copyn in the filename, where n is the copy number. Note the original file extension, .properties, has been maintained in the new file:
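The naming rule for duplicates can be sketched in a few lines of Python. This only mirrors the convention described in the text (insert -Copyn before the extension); it is not Jupyter's own code, and both the function name copy_name and the filename settings.properties are hypothetical.

```python
import os


def copy_name(filename, n):
    """Return the name of the n-th duplicate of filename.

    The copy number is inserted before the extension so that the
    extension (and therefore the file type) is preserved.
    """
    stem, extension = os.path.splitext(filename)
    return "{}-Copy{}{}".format(stem, n, extension)


print(copy_name("settings.properties", 1))  # settings-Copy1.properties
print(copy_name("settings.properties", 2))  # settings-Copy2.properties
```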

Similarly, if we hit the Rename button, another dialog box will appear to prompt for the new filename. The main part of the filename is highlighted, on the assumption that you want to keep the file extension because the file type has not changed:

We can also delete the file by clicking on the trashcan icon. This brings up a confirmation dialog box:

At the top right of the screen we have options for Upload and New (Text File, Folder, or Python 2).
The Upload button is more meaningful when the notebook is stored on a web server. When running it on your desktop, it allows you to move files easily from one part of your notebook to another. If you click the button, you are presented with a file selector dialog box. The following screenshot is specific to a Windows environment, but a similar display is presented on a Mac. Once you select a file, it will be added to your notebook space:

If we opt to create a New Text File, we are presented with a new browser panel in the Jupyter text editor (Note that I have shrunk down the size of the screen so the display fits the boundaries of this book):

There are several points of interest on this screen:
We are in a new browser panel (the notebook display is still present in the other tab).
The name of the new file is untitled1.txt. Using the same convention as duplication, the new filename starts with untitled.txt and is incremented as needed.
Curiously, it mentions when the file was created.
In the top-right corner, we see Plain Text. So, we might expect to see some other description here for other file types.
We have a new menu, File, Edit, View, and Language.
The File menu has the following options:
New: Start another new text window
Save: Save/update the current text file into the notebook area
Rename: Change the name of the file (unlikely you would want to keep the untitledn name provided)
Download: Again, an option that makes more sense if your notebook is running on the Web. As explained for Upload, Download on a desktop installation allows you to copy a file to another part of your machine.
The Edit menu has the following options:
Find: Search for a string.
Find & Replace: Search for and replace a string.
Separator: The options for adjusting the text editor in use are below this line.
Key Map: Set your own function mapping for your keyboard.
Default: Checked as it is the default choice. This means to use the default text editor.
Sublime text: If you would prefer to use the Sublime editor.
Vim: If you would prefer to use Vim.
Emacs: If you would prefer to use Emacs.
The View menu only has an option to Toggle Line Numbers. I imagine future revisions of the package will have additional features. Similarly, for other file types, the menu may change.
The Language menu allows you to specify whether this text file is a specific type of programming file. This allows syntax highlighting, which is a major feature of source editors. The list is extensive:

The New Python 2 option creates a new Python 2 session. You are presented with a new browser panel with a similar naming convention, as seen in the following screenshot.
This is a very different presentation, where Python code is expected to be entered in the cells on the page with results displayed below each cell.
There is an extensive menu with File, Edit, View, Insert, Cell, Kernel, and Help options. We have a fairly complete Integrated Development Environment (IDE) for writing Python code:

The File menu has the following options:
New Notebook: Start a new notebook (another browser panel like this one)
Open...: Select a file to open from the notebook Files view
Make a Copy...: Copy the current notebook completely into another browser panel
Rename...: Rename the current notebook
Save and Checkpoint: Save the current notebook and record a checkpoint
Note
A checkpoint is a point in time where all information about a notebook is preserved. You can have many checkpoints and return the state of your notebook to the previous checkpoint state at any time. This is an excellent way to give yourself the room to try out a new angle on your analysis without risking losing what you have done so far.
Revert to Checkpoint: Revert your notebook to a previous checkpoint
Print Preview: Present a preview of the printed form of your notebook
Download as: Download the notebook in a variety of formats:
IPython notebook (its current form)
Python script
HTML representation
Markdown: a specialized display format
reST (reStructuredText): an easy-to-read, plain-text markup
PDF
Presentation
Close and Halt: Close the current notebook and stop any running scripts
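To give a feel for what a Download as export to a plain Python script involves, here is a simplified sketch that reads the notebook's JSON and keeps only the source of the code cells. It is an illustration under the assumption that the notebook uses the version 4 format; the exporter behind the menu handles more cases than this, and the function name code_cells_to_script and the sample notebook are invented for the example.

```python
import json


def code_cells_to_script(notebook_json):
    """Return the source of all code cells as one Python script string."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):
            # The format allows source to be stored as a list of lines.
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    # Separate cells with a blank line and end with a newline.
    return "\n\n".join(chunks) + "\n"


sample = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 1\n", "print(x)"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["y = x + 1"]},
    ],
})

script = code_cells_to_script(sample)
print(script)
```

The markdown cell is dropped and the two code cells come out in order, separated by a blank line.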
The Edit menu has the following options:
Cut Cells: Cut the currently selected cells to the clipboard
Note
Each of the rectangular work areas in your notebook is a cell. The innermost text area is where you enter code. Below that (but within the surrounding rectangle), the results of each code step will be displayed.
Copy Cells: Copy the currently selected cells to the clipboard
Paste Cells Above: Paste cells from the clipboard above the current cell
Paste Cells Below: Paste cells from the clipboard below the current cell
Paste Cells & Replace: Paste the cells from the clipboard on top of the current cell
Delete Cells: Delete the current cells
Undo Delete Cells: Revert the last Delete Cells invocation
Split Cell: Split up a cell from the current cursor position
Merge Cell Above: Merge the current cell with the one above
Merge Cell Below: Merge the current cell with the one below
Edit Notebook Metadata: Every notebook has underlying metadata that describes the characteristics of the notebook. Advanced users can manipulate this data directly in order to adjust features more readily. For example, the current notebook metadata looks like the following screenshot:

Find and Replace: Allow us to find and replace among the selected cells. There is a standardized dialog box for this, as shown in the following screenshot:

As seen in the preceding screenshot, the parameters and their functions are as follows:
The Aa icon toggle determines whether a case-insensitive search is made
The * icon toggle determines whether a regex search is made
The stacked lines icon toggle determines whether a replace will be made
The Find text block presents the search criteria
The Replace text block is used for the replacement text
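The case and regex toggles in the Find and Replace dialog correspond to familiar options in Python's re module. The sketch below is not how the dialog is implemented (it runs in the browser); it just shows the same two switches applied to a string. The function name find_replace and its parameter names are made up for this example.

```python
import re


def find_replace(text, find, replace, match_case=False, use_regex=False):
    """Replace occurrences of find in text.

    match_case: when False, the search ignores case.
    use_regex:  when False, find and replace are treated as literal text.
    """
    flags = 0 if match_case else re.IGNORECASE
    if use_regex:
        pattern = find
        replacement = replace  # may contain group references such as \1
    else:
        pattern = re.escape(find)               # neutralize regex metacharacters
        replacement = lambda _match: replace    # insert the text literally
    return re.sub(pattern, replacement, text, flags=flags)


print(find_replace("Total = total + 1", "total", "count"))
print(find_replace("Total = total + 1", "total", "count", match_case=True))
print(find_replace("x1 = 1; x2 = 2", r"x(\d)", r"y\1", use_regex=True))
```

The first call changes both spellings of total, the second changes only the lowercase one, and the third uses a captured group to rename x1 and x2 to y1 and y2.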
The View menu has the following options:
Toggle Header: Toggles the display of the Jupyter logo and filename
Toggle Toolbar: Toggles the display of the toolbar
Cell Toolbar: Toggles the display of the cell action icons
The Insert menu has the following options:
Insert Cell Above: Add a new cell above the current one
Insert Cell Below: Add a new cell below the current one
The Cell menu has the following options:
Run Cells: Run the selected (or all) cells.
Run Cells and Select Below: Run the current cells and move the selection to the cell below.
Run Cells and Insert Below: Run the current cells and create a new one below.
Run All: Run all cells.
Run All Above: Run all cells prior to the current cell.
Run All Below: Run all cells below the current cell.
Cell Type: Change the type of cell selected to Code, Markdown, or Raw NBConvert. There is an automatic message that is displayed noting that all cells are by default Code type.
Current Outputs and All Output have options to toggle their display.
The Kernel menu has the following options:
Interrupt: Send a keyboard interrupt, Ctrl + C, to the kernel. This is useful if your code is in an endless loop.
Restart: Restart the kernel.
Restart & Clear Output: Restart the kernel and clear all output.
Restart & Run All: Restart the kernel and run all cells.
Reconnect: Connect back to a remote notebook.
Change Kernel: Not useful as only Python 2 is available at this point.
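In a Python kernel, the Interrupt option surfaces in your code as a KeyboardInterrupt exception, which a long-running cell can catch in order to stop cleanly and report how far it got. The following sketch simulates that: the helper simulated_step raises the exception itself so the example runs without anyone pressing a button. Both function names are hypothetical.

```python
def run_until_interrupted(step, max_steps):
    """Call step(i) for i in range(max_steps), stopping early on interrupt.

    Returns (completed_steps, was_interrupted).
    """
    completed = 0
    try:
        for i in range(max_steps):
            step(i)
            completed += 1
    except KeyboardInterrupt:
        # Reached when the kernel is interrupted (or on Ctrl + C in a console).
        return completed, True
    return completed, False


def simulated_step(i):
    """Stand-in for real work; pretends the user interrupts during step 3."""
    if i == 3:
        raise KeyboardInterrupt


result = run_until_interrupted(simulated_step, 10)
print(result)  # (3, True)
```

Catching the interrupt this way lets you keep partial results in the notebook instead of losing them to a traceback.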
The Help menu has the following options:
User Interface Tour: Walk the user through a UI tour
Keyboard Shortcuts: Presents a list of built-in keyboard shortcuts
Notebook Help: Help topics on the notebook
Markdown: Description of the markdown available within a notebook
Python, IPython, NumPy, SciPy, Matplotlib, SymPy, Pandas: Help topics on the various languages and packages that can be used in notebooks
About: A standard about box
There is an icon panel below the menu that has shortcut icons for the following functions:
Floppy disk icon: Save and Checkpoint
Plus sign: Insert Cell Below
Scissors: Cut Cell
Duplicate pages: Copy Cell
Up arrow: Move Cell Up
Down arrow: Move Cell Down
An icon that looks like a speaker: Run the current cell
Black square: Interrupt Kernel
Circular arrow: Restart the Kernel
There's a drop-down menu for display characteristics:
Code
Markdown
Raw NBConvert
Heading
Keyboard: Open the command palette
Change the current toolbar in use. Clicking on the Cell Toolbar button auto-displays the Cell Toolbar choice from the View menu:

Jupyter notebooks are created in order to be shared with other users, in many cases over the Internet. However, Jupyter notebooks can execute arbitrary code and generate arbitrary code. This can be a problem if malicious aspects have been placed in a notebook. The default security mechanisms for Jupyter notebooks include the following:
Raw HTML is always sanitized (checked for malicious coding). Further information can be found at https://developers.google.com/caja.
You cannot run external JavaScript.
Cell contents (especially HTML and JavaScript) are not trusted (requires user validation to continue).
The output from any cell is not trusted.
All other HTML or JavaScript is never trusted. Clearing the output will cause the notebook to become trusted when saved.
Notebooks can also use a security digest to ensure the correct user is modifying the contents. A digest takes into account the entire contents of the notebook and a secret (only known by the notebook creator). This combination ensures that malicious coding is not going to be added to a notebook.
The secret used to compute the digest is kept in a file that only you can read. By default, its location is as follows (the exact directory can differ between versions):
~/.jupyter/profile_default/security/notebook_secret
Here, notebook_secret is the file that holds your secret.
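The idea of a digest that combines the notebook contents with a secret can be illustrated with a keyed hash (HMAC) from the Python standard library. This is a sketch of the general technique only; it is not a copy of Jupyter's signing code, and the secret, the sample notebooks, and the function names are all invented for the example.

```python
import hashlib
import hmac
import json


def notebook_digest(notebook, secret):
    """Return a hex HMAC-SHA256 digest of the notebook contents.

    notebook: the notebook as a Python dictionary.
    secret:   bytes known only to the notebook's author.
    """
    # sort_keys gives a stable serialization, so equal contents
    # always produce the same digest.
    payload = json.dumps(notebook, sort_keys=True).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_trusted(notebook, secret, stored_digest):
    """Check a notebook against a previously stored digest."""
    return hmac.compare_digest(notebook_digest(notebook, secret), stored_digest)


secret = b"example-secret-known-only-to-the-author"  # hypothetical value
original = {"cells": [{"cell_type": "code", "source": ["print(1)"]}]}
tampered = {"cells": [{"cell_type": "code", "source": ["import os"]}]}

stored = notebook_digest(original, secret)
print(is_trusted(original, secret, stored))   # True
print(is_trusted(tampered, secret, stored))   # False
```

Changing either the contents or the secret changes the digest, which is why someone without the secret cannot produce a matching digest for a modified notebook.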
You can configure some of the display parameters used when presenting notebooks. These are configurable due to the use of a product (CodeMirror) to present and modify the notebook. CodeMirror is a JavaScript-based editor for use within web pages (notebooks).
The list of configurable options is still in development. Some of the options are as follows:
lineSeparator: The character used to separate text lines
theme: The overall theme of presentation used in the notebook
indentUnit: How many spaces to indent blocks of code
To change the configuration of one of the options, you open the JavaScript window of your browser, enter the code to modify an option, and then load your notebook. The modifications you made are then applied to the notebook presentation. There is further documentation available at https://codemirror.net/doc/manual.html#option_indentUnit.
For example, to change the indentation (indent-unit) for your notebook, you would use the following JavaScript:
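// Get the currently selected cell and its configuration object
var mycell = Jupyter.notebook.get_selected_cell();
var cell_config = mycell.config;
// Patch that sets the CodeMirror indentUnit option to 2 for code cells
var code_patch = {
    CodeCell: {
        cm_config: {indentUnit: 2}
    }
};
// Apply the patch
cell_config.update(code_patch);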
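The same patch can also be written from Python, on the assumption that the Notebook reads its front-end settings from a JSON file named notebook.json in an nbconfig folder under the Jupyter configuration directory (commonly ~/.jupyter). The sketch below uses only the standard library and Python 3, and writes into a temporary directory so that it cannot disturb a real configuration; the function names are hypothetical.

```python
import json
import os
import tempfile


def merge(base, patch):
    """Recursively merge patch into base and return base."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)
        else:
            base[key] = value
    return base


def update_notebook_config(config_dir, patch):
    """Merge patch into <config_dir>/nbconfig/notebook.json; return its path."""
    nbconfig_dir = os.path.join(config_dir, "nbconfig")
    os.makedirs(nbconfig_dir, exist_ok=True)
    path = os.path.join(nbconfig_dir, "notebook.json")
    current = {}
    if os.path.exists(path):
        with open(path) as handle:
            current = json.load(handle)
    with open(path, "w") as handle:
        json.dump(merge(current, patch), handle, indent=2)
    return path


# Demonstration in a throwaway directory rather than the real ~/.jupyter.
demo_dir = tempfile.mkdtemp()
config_path = update_notebook_config(
    demo_dir, {"CodeCell": {"cm_config": {"indentUnit": 2}}}
)
with open(config_path) as handle:
    print(handle.read())
```

Merging rather than overwriting means that options set earlier stay in place when a new one is added.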
You have now seen all of the standard operations available to you in a Jupyter Notebook.
In this chapter, we investigated the various user interface elements available in a notebook. We learned how to install the software on a Mac or a PC. We were exposed to the notebook structure. We saw the typical workflow used when developing a notebook. We walked through the user interface operations available in a notebook. And lastly, we saw some of the configuration options available to advanced users for their notebook.
In the next chapter, we will learn all about Python scripting in a Jupyter Notebook.