This chapter gives you a brief overview of MongoDB, answering such questions as what is MongoDB?, why use it?, and what are its benefits? It then covers installing the MongoDB Community Edition (free version) on a Windows server and on Linux. You will learn the installation differences between RPM-based Linux distributions (Red Hat, Fedora, CentOS) and Deb-based ones (Debian, Ubuntu). There is also a brief summary of how to install directly from source.
The topics that we will learn in this chapter are:
- Overview of MongoDB
- Installing MongoDB
- Installing MongoDB on Linux
MongoDB represents a radical and much needed departure from relational database technology. Dr. Edgar F. Codd (https://en.wikipedia.org/wiki/Edgar_F._Codd), an English computer scientist working for IBM, published his seminal paper, A Relational Model of Data for Large Shared Data Banks, in 1970. It formed the basis for what we now know as RDBMS (Relational Database Management Systems), using SQL (Structured Query Language), adopted by IBM, Relational Software (later Oracle), and Ingres (https://en.wikipedia.org/wiki/Ingres_(database)), a research project at the University of California, Berkeley. Ingres, in turn, spawned Postgres, Sybase, Microsoft SQL Server, and others.
The first version of MongoDB was introduced in 2009 by 10gen (https://en.wikipedia.org/wiki/MongoDB_Inc.#History) (later MongoDB Inc.) to address two pressing needs not met by the existing stable of RDBMS systems, which were, for the most part, based on almost 50-year-old technology: handling big data and modeling objects. Initially proprietary, MongoDB was later released as open source.
DB-Engines (https://db-engines.com/en/ranking) provides up-to-date rankings of competing database systems. It is of interest to note that MongoDB is now in the Top 10, currently ranked fifth. However, you should also note that the score assigned to MongoDB is 343.79 compared with the number one ranked system, Oracle, with a score of 1,311.25.
One massive problem faced by legacy RDBMS systems is the difficulty of managing Big Data (https://en.wikipedia.org/wiki/Big_data). Examples would include data produced by the NASA Center for Climate Change, the Human Genome Project, which analyzes strands of DNA, or the Sloan Digital Sky Survey, which collects astronomical data. RDBMS systems are designed to economize on storage, which was an expensive resource 50 years ago. In the 21st century, storage costs have dropped dramatically, making this a secondary consideration. Another aspect of RDBMS systems is their ability to provide flexibility by way of creating relations between tables, which by its very nature introduces overhead that is compounded when handling big data.
MongoDB addresses the needs of big data by incorporating modern algorithms such as map reduce (https://en.wikipedia.org/wiki/MapReduce), which allows for parallel distributed processing on a cluster of servers. In addition, MongoDB has a feature referred to as sharding, which allows fragments of a database to be stored and processed on multiple servers.
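The two ideas in this paragraph can be sketched in a few lines of plain Python. This is only a conceptual illustration, not MongoDB's implementation: the `shard_for`, `distribute`, `map_phase`, and `reduce_phase` names are hypothetical, the `crc32` hash is a stand-in for whatever partitioning scheme a real cluster uses, and the "shards" are just in-memory lists rather than servers.

```python
import zlib
from collections import Counter
from functools import reduce


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard by hashing the key (crc32 is stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def distribute(docs, shard_count):
    """Split the documents into fragments, one list per (pretend) server."""
    shards = [[] for _ in range(shard_count)]
    for doc in docs:
        shards[shard_for(doc["city"], shard_count)].append(doc)
    return shards


def map_phase(shard):
    """Runs independently on each shard: count documents per city."""
    return Counter(doc["city"] for doc in shard)


def reduce_phase(partials):
    """Merge the per-shard partial counts into one overall result."""
    return reduce(lambda left, right: left + right, partials, Counter())


docs = [
    {"name": "Ana", "city": "Lisbon"},
    {"name": "Ben", "city": "Oslo"},
    {"name": "Cal", "city": "Lisbon"},
    {"name": "Dee", "city": "Quito"},
]

shards = distribute(docs, shard_count=3)
totals = reduce_phase(map_phase(shard) for shard in shards)
print(dict(totals))
```

Because each call to `map_phase` touches only its own fragment, the calls could run in parallel on separate machines; only the small partial results need to travel to the step that combines them.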
It should be noted that although MongoDB is designed to handle big data, it is actually more of a general-purpose platform. If your only need is to handle big data, it might be worth your while to investigate Apache Cassandra (https://cassandra.apache.org/) with Hadoop (http://hadoop.apache.org/), which are expressly designed to handle massive amounts of data.
A classic problem in object-oriented programming (OOP) code that requires database access is caused by the two-dimensional architecture of the traditional RDBMS. The two dimensions, rows and columns, are in turn grouped into tables, much like a legacy spreadsheet. In order to achieve the third dimension, one needs to perform resource-intensive joins and form relationships between tables. Mapping programming object classes to the database then requires incredible programmatic gymnastics.
With MongoDB, there is no rigid database schema you must adhere to. Instead of rows you insert documents. A set of documents is referred to as a collection. Each document can directly model an object class, which in turn greatly facilitates the work of storing and retrieving from the database.
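As a rough illustration of how a document can mirror an object class, the following sketch converts a small Python object into a nested dictionary, which is the shape a document takes. The `Customer` and `Address` classes and all field names are hypothetical, and the "collection" here is only a Python list standing in for a real MongoDB collection.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Customer:
    name: str
    email: str
    addresses: List[Address] = field(default_factory=list)


customer = Customer(
    name="Ana Silva",
    email="ana@example.com",
    addresses=[Address("1 Main St", "Lisbon"), Address("9 Side Rd", "Porto")],
)

# asdict() walks the object recursively, producing nested dicts and lists.
# The nested addresses stay inside the customer; no join table is needed.
document = asdict(customer)

# A collection is a set of documents. With no rigid schema, the second
# document is free to carry different fields from the first.
collection = [document, {"name": "Ben Olsen", "phone": "555-0100"}]

print(document["addresses"][1]["city"])
```

In a relational design the addresses would typically live in a second table joined by a key; here they travel with the object they belong to.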
MongoDB has its own rich query language, which can perform tasks similar to what the developer might expect from a legacy RDBMS using SQL. Because MongoDB does not use SQL, it is often referred to as a NoSQL database.
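To give a feel for what such a query looks like, here is a toy evaluator for MongoDB-style filter documents, written in plain Python. It is a sketch under stated assumptions: the `matches` and `find` functions are hypothetical helpers, not part of any MongoDB driver, and only a handful of comparison operators (`$eq`, `$gt`, `$gte`, `$lt`, `$lte`, `$in`) are handled.

```python
import operator

# A small subset of comparison operators, keyed by their MongoDB names.
OPERATORS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}


def matches(document: dict, query: dict) -> bool:
    """Return True when the document satisfies every condition in the query."""
    for field_name, condition in query.items():
        if field_name not in document:
            return False
        value = document[field_name]
        if isinstance(condition, dict):
            # Operator form, for example {"age": {"$gt": 30}}
            if not all(OPERATORS[op](value, operand) for op, operand in condition.items()):
                return False
        elif value != condition:
            # Plain form, for example {"city": "Oslo"}, means equality
            return False
    return True


def find(collection, query):
    """Toy stand-in for a collection query: filter a list of dicts."""
    return [doc for doc in collection if matches(doc, query)]


people = [
    {"name": "Ana", "age": 28, "city": "Lisbon"},
    {"name": "Ben", "age": 41, "city": "Oslo"},
    {"name": "Cal", "age": 35, "city": "Oslo"},
]

# Roughly: SELECT * FROM people WHERE age > 30 AND city = 'Oslo'
result = find(people, {"age": {"$gt": 30}, "city": "Oslo"})
print([doc["name"] for doc in result])
```

The point of the sketch is that the query is itself a document: conditions are expressed as nested key/value pairs rather than as a SQL string.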
For an excellent introduction to NoSQL, its underlying philosophy and its ramifications, a highly recommended resource can be found in the NoSQL Guide (https://martinfowler.com/nosql.html) on Martin Fowler's website.
For the purposes of this book, we focus on the MongoDB Community Edition for the simple reason that it's free of charge. This version is also an excellent way to get your feet wet, so to speak, allowing you to learn about and experiment with MongoDB risk free. Before beginning installation, be sure to check the minimum requirements for your operating system in the MongoDB installation manual: https://docs.mongodb.com/manual/installation/.
On a live server, in a commercial enterprise, it is recommended you use the MongoDB Enterprise Advanced version. You might also consider two cloud-based offerings, MongoDB Atlas or MongoDB Stitch. The former provides a cloud-hosted MongoDB database service. The latter builds upon the former, opening the MongoDB API so that your apps can make calls and receive responses.
The MongoDB reference manual warns that Windows Server 2008 R2 and Windows 7 require a hotfix to be installed to prevent errors from occurring under certain conditions. For more information, see: https://support.microsoft.com/en-gb/help/2731284/33-dos-error-code-when-memory-memory-mapped-files-are-cleaned-by-using.
- Go to the MongoDB download center at: https://www.mongodb.com/download-center#community.
- Select the appropriate operating system where it says Version.
- Click on
- When prompted, choose
- Click on the saved MSI file to start the installer.
- Click OK when the security prompt appears asking to Open Executable File.
- Click Run when the security warning appears.
- Click Next to start the MongoDB Setup Wizard.
- Read the license agreement, select the checkbox, and click Next. Note that if you do not accept the license agreement, the installation will terminate.
- When asked to Choose Setup Type, for the purposes of this illustration, select Complete. MongoDB Compass, a handy utility that greatly facilitates database management, is automatically installed.
- Now that all choices have been made, click on
User Account Control security warning pops up.
- As of MongoDB v4.0, the installation wizard lets you configure startup options. If you want to have MongoDB start automatically and run in the background, choose Run service as Network Service user. You can also configure the directory where MongoDB stores its data files (Data Directory), and where log files are stored (Log Directory).
- Click Next to continue and Finish when the installation completes.
Assuming you elected to install the complete package, MongoDB Compass will auto-launch once the installation completes. You will need to scroll down through its license agreement (separate from the license agreement for MongoDB itself), and click Agree. You can follow and then close the initial help tutorial, and also set Privacy Settings that control whether or not you will be sending crash reports and usage statistics to, and requesting automatic updates from, MongoDB Inc.
This utility is described in more detail in Chapter 2, Understanding MongoDB Data Structures. We also use this utility to create our first database and collection (see following sections). Here is the Compass screen as seen on Windows:
You have the option, during the installation process, of specifying the location where the database and log files are stored. Once finished, here is a look at the new directory structure:
If you elected to install MongoDB as a service, it starts automatically, and can be administered just as any Windows service.
The configuration file, which contains the locations of the database and log files, defaults to:
It is important to understand the MongoDB installation process on Linux. Even if you are a developer or IT professional who does not use Linux personally, it's extremely likely that the internet-facing server you or your customer use is running Linux. W3Techs (https://w3techs.com/), a company that does web technology surveys, estimates that in 2018, the share of web servers running on Linux was 68.1%, compared with 32% for Windows.
There are three primary considerations when installing MongoDB on Linux, each of which we will address in turn:
With the bewildering array of Linux distributions currently available, it is difficult to decide which version to feature for the purposes of demonstrating MongoDB on Linux. A significant number of Linux distributions are based on either Debian or Red Hat Linux. Accordingly, this section covers installing MongoDB on both. A website which gives good insight on all reported Linux distributions is DistroWatch (https://distrowatch.com/). Linux Mint (https://linuxmint.com/), although now extremely popular, wasn't included here, as it's Debian-based and not as commercially available as Ubuntu.
Debian Linux (https://www.debian.org/), self-described as the universal operating system, is a free open-source project that uses a fork of the Linux kernel, and draws heavily upon GNU (http://www.gnu.org/software/software.html, that is, GNU's Not Unix) software. Ubuntu Linux (https://www.ubuntu.com/) is produced by Canonical Ltd, based in the United Kingdom, and is based upon Debian. For the purposes of this book, we will focus on Ubuntu version 18.04, code-named Bionic Beaver, released in April 2018, a designated LTS (Long Term Support) version.
The preferred way to install any given software on Ubuntu is to use a Debian package. Such packages have the extension *.deb and include a script that tells the package management program where to place the pre-compiled binary files as they are extracted. Popular package management programs include synaptic (http://www.nongnu.org/synaptic/, graphical interface, resolves dependencies, and does a lot of "housekeeping"), aptitude (https://help.ubuntu.com/lts/serverguide/aptitude.html, like synaptic but has a textual, command-line menu), and apt-* (that is, apt-key and so on: very fast, command-line only). For the purposes of this section we will use apt-get.
Ubuntu provides its own MongoDB package, which is what gets installed if you simply run sudo apt-get install mongodb. To get the latest "official" version directly from MongoDB, you should follow the procedure outlined below. If you have already installed the Ubuntu mongodb package, you will need to uninstall it before proceeding.
The MongoDB packages available for Ubuntu/Debian include the following:
Primary MongoDB system daemon
MongoDB shard routing service
In addition, a composite package, mongodb-org, which contains all four of these packages, is provided.
To install MongoDB on an Ubuntu/Debian server, you will need root access. On Ubuntu, direct login as root is disabled by default for security reasons. Accordingly, you can promote yourself to root using su, or you can precede the various commands with sudo, which instructs the OS to process the command as root.
Please proceed as follows:
- Import the public key from the MongoDB key server. This is needed so that the package manager can authenticate the MongoDB package:
- Add the MongoDB repository to the Linux server's sources list:
The commands listed should be on one line. We use a backslash (\) to indicate a line of text that is too long to fit the printed page. When typing the command, omit the backslash (\) and do not hit Enter until the command has been fully entered.
- Refresh the package database from the sources list by running:
sudo apt-get update
Ubuntu version 18.04 is code-named bionic. You will note this name is used in step #2 here, where the MongoDB repository is added to the sources list. If this source is not found, you will receive an error message:
The repository ... bionic/mongodb-org/4.0 ... does not have a Release file
In this situation, substitute the code name xenial (Ubuntu 16.04) in place of bionic (Ubuntu 18.04).
- Install the latest (stable) version of MongoDB. Here, we install only the composite package, which alleviates the need to separately install the four primary packages listed previously:
sudo apt-get install -y mongodb-org
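The code-name substitution described in the note above can be sketched as a small helper that builds the candidate repository entries in the order you would try them. Treat this as an assumption-laden illustration: the `repo_line` and `candidates` functions and the `FALLBACKS` table are hypothetical, and the exact `deb` line format (including the `repo.mongodb.org` URL, the `arch=amd64` option, and the `multiverse` component) is assumed here rather than taken from this chapter, so check it against the MongoDB installation manual before use.

```python
# Fallback taken from the text: if bionic is not found, use xenial.
FALLBACKS = {"bionic": "xenial"}


def repo_line(codename: str, version: str = "4.0") -> str:
    """Build a sources-list entry (the line format is an assumption)."""
    return (
        "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu "
        f"{codename}/mongodb-org/{version} multiverse"
    )


def candidates(codename: str) -> list:
    """Code names to try, preferred one first, then its fallback if any."""
    names = [codename]
    if codename in FALLBACKS:
        names.append(FALLBACKS[codename])
    return names


for name in candidates("bionic"):
    print(repo_line(name))
```

Only the code name changes between the two entries, which is why the "does not have a Release file" error can be worked around by editing a single word in the sources list.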
If you followed the procedure outlined in the previous section, a configuration file /etc/mongod.conf will have been auto-generated by the installation script. By default, data files will be placed in /var/lib/mongodb and log files in
You are now able to perform these operations:
Start | stop | restart the server
Get the server status
Access MongoDB via the shell (covered later)
Red Hat, Fedora, and CentOS have a relationship similar to that of Debian and Ubuntu. Red Hat (https://www.redhat.com/en) is the original company behind this distribution, producing its first release in 1995. In addition to making improvements in the graphical interface and overall management of Linux, Red Hat is known for its RPM (Red Hat Package Manager) technology. In this corner of the Linux world, packages are bundled into files with the extension *.rpm, and contain installation instructions, which makes the installation, updating, and management of Linux software much easier.
Fedora (https://getfedora.org/) is a free open source version of what is now RHEL (Red Hat Enterprise Linux). Fedora and the Fedora Project are sponsored by Red Hat, and serve as a test bed for innovation, which, when stable, is ported to RHEL. Fedora Linux releases tend to have rapid development cycles and short lifespans. CentOS (https://www.centos.org/) is also affiliated with Red Hat, and is allowed direct use of RHEL source code. The main difference is that CentOS is free, but support is only available via the community (which is to say, you are on your own!). For the purposes of this book we will use CentOS version 7.
The MongoDB packages available for RHEL/Fedora/CentOS are exactly the same as those described in preceding sections for Debian/Ubuntu. Also, as described earlier, a composite package called mongodb-org that contains all four packages is available. Because RHEL/Fedora/CentOS packages use RPM for packaging, the tool of choice for installation, updating, and management of packages is yum (Yellowdog Updater, Modified).
To install MongoDB on RHEL/Fedora/CentOS Linux distributions, proceed as follows:
- Create a repository file for yum in the /etc/yum.repos.d directory. The filename should be like this,
X is the major version number for MongoDB, and Y is the minor release. As an example, for MongoDB version 4.0, the current version as of this writing, the filename would be:
- Install the composite package using:
sudo yum install -y mongodb-org
If you followed the procedure outlined previously, a configuration file /etc/mongod.conf will have been auto-generated by the installation script. By default, database files will be placed in /var/lib/mongo and log files in /var/log/mongodb/mongod.log. Here is an example of the auto-generated file for MongoDB v4.0 on CentOS 7:
You are now able to perform these operations:
Start | stop | restart the server
Access MongoDB via the shell (covered later)
After starting the service, use the command /bin/systemctl status mongod.service to confirm the status of MongoDB:
The beauty of installing MongoDB directly from its source code is that it ensures that you can run MongoDB on any server, and that you have the absolute latest version. Minimum requirements for source installation include:
In addition, there are OS-specific requirements, which are detailed in this table:
Compiler: GCC 4.8.2 or later
Red Hat, and suchlike.
Ubuntu, and suchlike.
Compiler: Clang 3.4 or Xcode 5
Libraries: XCode (especially command line tools)
Compiler: Visual Studio 2013 Update 4 or later
It is highly recommended that you carefully read through the source installation process documentation, which can be found on github.com at this URL: https://github.com/mongodb/mongo/wiki/Build-Mongodb-From-Source.
The source build process does not follow the traditional sequence of make install. Installation is performed using SCons (Software Construction Tool, https://www.scons.org/), which, in turn, uses the programming language Python. Accordingly, after you clone or download the MongoDB source, you will notice many Python scripts and configuration files.
- Download the source code from github.com. There are two ways to download the MongoDB source code from github.com:
- Download directly from this URL: https://github.com/mongodb/mongo/archive/master.zip. You would then need to unzip it into a folder such as
- If you have installed git, you can clone the repository from a command line terminal as follows:
- Change to the newly created (or cloned)
- Build the source code using SCons:
- Create a directory for the database (for example
- Create a directory for the log (for example
- Create a config file, which indicates the locations of the database and log (for example
- Start MongoDB
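The last few steps (creating a database directory, a log directory, and a config file that points at both) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `prepare_layout` helper and the directory names are hypothetical, a temporary directory is used so that nothing on the real system is touched, and the config content uses the `storage.dbPath` and `systemLog` options of the YAML configuration format.

```python
import tempfile
from pathlib import Path


def prepare_layout(base: Path) -> Path:
    """Create data and log directories and write a minimal config file."""
    data_dir = base / "data" / "db"   # hypothetical database location
    log_dir = base / "log"            # hypothetical log location
    data_dir.mkdir(parents=True, exist_ok=True)
    log_dir.mkdir(parents=True, exist_ok=True)

    config_path = base / "mongod.conf"
    config_path.write_text(
        "storage:\n"
        f"  dbPath: {data_dir}\n"
        "systemLog:\n"
        "  destination: file\n"
        f"  path: {log_dir / 'mongod.log'}\n"
        "  logAppend: true\n",
        encoding="utf-8",
    )
    return config_path


# Use a throwaway directory so the sketch is safe to run anywhere.
base = Path(tempfile.mkdtemp(prefix="mongodb-sketch-"))
config_path = prepare_layout(base)
print(config_path.read_text(encoding="utf-8"))
```

With a file like this in place, the server would typically be started by passing the file's location to mongod with its --config option.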
In this chapter you gained a better understanding of what MongoDB is, why we use it, and what its benefits are. You then learned how to install MongoDB on both Windows and Linux. You learned how to install the pre-compiled binary packages, which use the extension *.deb and are designed for the Debian and Ubuntu Linux package managers. In a similar manner, you learned how to install binary packages with the *.rpm extension on Red Hat, Fedora, or CentOS Linux distributions. Finally, you learned how to install MongoDB by directly compiling and installing the source code using SCons technology.
In the next chapter you will learn about MongoDB data structures, data modeling, and how to create a database, collection, and documents.