This chapter gives you a brief overview of MongoDB, answering such questions as: what is MongoDB, why use it, and what are its benefits? It then covers installing the MongoDB Community Edition (the free version) on a Windows server and on Linux. You will learn the installation differences between RPM-based Linux distributions (Red Hat, Fedora, CentOS) and Deb-based distributions (Debian, Ubuntu). There is also a brief summary of how to install directly from source.
The topics that we will learn in this chapter are:
- Overview of MongoDB
- Installing MongoDB
- Installing MongoDB on Linux
MongoDB represents a radical and much needed departure from relational database technology. Dr. Edgar F. Codd (https://en.wikipedia.org/wiki/Edgar_F._Codd), an English computer scientist working for IBM, published his seminal paper, A Relational Model of Data for Large Shared Data Banks, in 1970. It formed the basis for what we now know as RDBMS (Relational Database Management Systems), using SQL (Structured Query Language), adopted by IBM, Relational Software (later Oracle), and Ingres (https://en.wikipedia.org/wiki/Ingres_(database)), a research project at the University of California, Berkeley. Ingres, in turn, spawned Postgres, Sybase, Microsoft SQL Server, and others.
The first version of MongoDB was introduced in 2009 by 10gen (https://en.wikipedia.org/wiki/MongoDB_Inc.#History) (later MongoDB Inc.) to meet two pressing needs that the RDBMS products of the day, based for the most part on almost 50-year-old technology, did not address: handling big data and modeling objects. Initially proprietary, MongoDB was later released as open source.
One massive problem faced by legacy RDBMS is the difficulty of managing Big Data (https://en.wikipedia.org/wiki/Big_data). Examples include data produced by the NASA Center for Climate Simulation; the Human Genome Project, which analyzes strands of DNA; and the Sloan Digital Sky Survey, which collects astronomical data. RDBMS were designed to minimize the use of storage, which was an expensive resource 50 years ago. In the 21st century, storage costs have dropped dramatically, making this a secondary consideration. RDBMS also provide flexibility by way of relations between tables, which by its very nature introduces overhead that is compounded when handling big data.
MongoDB addresses the needs of big data by incorporating modern algorithms such as MapReduce (https://en.wikipedia.org/wiki/MapReduce), which allows for parallel distributed processing on a cluster of servers. In addition, MongoDB has a feature referred to as sharding, which allows fragments of a database to be stored and processed on multiple servers.
A classic problem in object-oriented programming (OOP) code that requires database access is caused by the two-dimensional architecture of the traditional RDBMS. The two dimensions, rows and columns, are grouped into tables, much like a legacy spreadsheet. To achieve a third dimension, you need to perform resource-intensive joins and form relationships between tables. Mapping programming object classes to the database therefore requires incredible programmatic gymnastics.
With MongoDB, there is no rigid database schema you must adhere to. Instead of rows you insert documents. A set of documents is referred to as a collection. Each document can directly model an object class, which in turn greatly facilitates the work of storing and retrieving from the database.
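As an illustration of the document model, the sketch below writes one customer object as a single JSON document from the shell. The field names, values, and file name are invented for this example, and the commented mongoimport line assumes a running server and a database named test.

```shell
# Hypothetical example: one customer object stored as a single document.
workdir="$(mktemp -d)"
cat > "${workdir}/customer.json" <<'EOF'
{
  "name": "Ada Lovelace",
  "email": "ada@example.com",
  "orders": [
    { "sku": "A-100", "qty": 2 },
    { "sku": "B-200", "qty": 1 }
  ]
}
EOF

# Show the document that would become one entry in a collection.
cat "${workdir}/customer.json"

# With a running server, the document could be loaded into a collection:
#   mongoimport --db test --collection customers --file "${workdir}/customer.json"
```

Note that the orders are nested inside the customer document itself, where a relational design would typically place them in a second table and join the two.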
MongoDB has its own rich query language, which can perform tasks similar to what the developer might expect from a legacy RDBMS using SQL. Because MongoDB does not use SQL, it is often referred to as a NoSQL database.
For the purposes of this book, we focus on the MongoDB Community Edition for the simple reason that it's free of charge. This version is also an excellent way to get your feet wet, so to speak, allowing you to learn about and experiment with MongoDB risk free. Before beginning installation, be sure to check the minimum requirements for your operating system in the MongoDB installation manual: https://docs.mongodb.com/manual/installation/.
The version featured in this book is MongoDB 4.0. The minimum requirement for a Windows installation is Windows Server 2008 R2, Windows 7, or later.
- Go to the MongoDB download center at: https://www.mongodb.com/download-center#community.
- Select the appropriate operating system where it says Version.
- Click on DOWNLOAD (msi):
- When prompted, choose Save File.
- Click on the saved MSI file to start the installer.
- Click OK when the security prompt appears asking to Open Executable File?
- Click Run when the security warning appears.
- Click Next to start the MongoDB Setup Wizard.
- Read the license agreement and click on the checkbox and Next. Note that if you do not accept the license agreement the installation will terminate.
- When asked Choose Setup Type, for the purposes of this illustration, select Complete. MongoDB Compass, a handy utility that greatly facilitates database management, is automatically installed.
- Now that all choices have been made, click on Install and click Yes when the User Account Control security warning pops up.
- As of MongoDB v4.0, the installation wizard lets you configure startup options. If you want to have MongoDB start automatically and run in the background, choose Run service as Network Service user. You can also configure the directory where MongoDB stores its data files (Data Directory), and where log files are stored (Log Directory):
- Click Next to continue and Finish when the installation completes.
Assuming you elected to install the complete package, MongoDB Compass will auto-launch once the installation completes. You will need to scroll down through its license agreement (separate from the license agreement for MongoDB itself), and click Agree. You can follow and then close the initial help tutorial, and also set Privacy Settings that control whether or not you will be sending crash reports, usage statistics, and requesting automatic updates to/from MongoDB Inc.
This utility is described in more detail in Chapter 2, Understanding MongoDB Data Structures. We also use this utility to create our first database and collection (see following sections). Here is the Compass screen as seen on Windows:
If using the Windows MSI installer (recommended), the MongoDB program files will be stored here:
You have the option, during the installation process, of specifying the location where the database and log files are stored. Once finished, here is a look at the new directory structure:
The configuration file, which contains the locations of the database and log files, defaults to:
This file is automatically generated by the installer. By default, here are its contents:
It is important to understand the MongoDB installation process on Linux. Even if you are a developer or IT professional who does not use Linux personally, it's extremely likely that the internet-facing server you or your customer use is running Linux. W3Techs (https://w3techs.com/), a company that does web technology surveys, estimates that in 2018, the share of websites running on Linux was 68.1%, compared with 32% for Windows.
There are three primary considerations when installing MongoDB on Linux, each of which we will address in turn:
- Linux based upon Debian and Ubuntu
- Linux based upon Red Hat, Fedora, and CentOS
- Installing directly from source code
Debian Linux (https://www.debian.org/), self-described as the universal operating system, is a free open-source project that uses the Linux kernel, and draws heavily upon GNU (http://www.gnu.org/software/software.html, that is, GNU's Not Unix) software. Ubuntu Linux (https://www.ubuntu.com/) is produced by the UK-based Canonical Group Ltd, and is based upon Debian. For the purposes of this book, we will focus on Ubuntu version 18.04, code-named Bionic Beaver, released in April 2018, a designated LTS (Long Term Support) version.
The preferred way to install any given software on Ubuntu is to use a Debian package. Such packages have the extension *.deb and include a script that tells the package management program where to place the pre-compiled binary files as they are extracted. Popular package management programs include synaptic (http://www.nongnu.org/synaptic/, graphical interface, resolves dependencies, and does a lot of "housekeeping"), aptitude (https://help.ubuntu.com/lts/serverguide/aptitude.html, like synaptic but with a textual, command-line menu), and the apt-* family (that is, apt-get (https://linux.die.net/man/8/apt-get), apt-key, and so on: very fast, command-line only). For the purposes of this section we will use apt-get.
The MongoDB packages available for Ubuntu/Debian include the following:
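|mongodb-org-server||Primary MongoDB system daemon (mongod)|
|mongodb-org-mongos||MongoDB shard routing service (mongos)|
|mongodb-org-shell||The mongo command-line shell|
|mongodb-org-tools||Provides various mongo* tools for import, export, restore, and so on|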
In addition, a composite package, mongodb-org, which contains all four of these packages, is provided.
To install MongoDB on an Ubuntu/Debian server, you will need root access. On Ubuntu, direct login as root is disabled by default for security reasons. Accordingly, you can either promote yourself to root using su, or precede the various commands with sudo, which instructs the OS to process that command as root.
Please proceed as follows:
- Import the public key from the MongoDB key server. This is needed so that the package manager can authenticate the MongoDB package:
- Add the MongoDB repository to the Linux server's sources list:
- Refresh the package database from the sources list by running:
sudo apt-get update
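If the update fails with an error such as "The repository ... bionic/mongodb-org/4.0 ... does not have a Release file", the MongoDB repository is not available for your Ubuntu release.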
In this situation, substitute the code name xenial (Ubuntu 16.04) in place of bionic (Ubuntu 18.04).
- Install the latest (stable) version of MongoDB. Here, we install only the composite package, which removes the need to separately install the four primary packages listed previously:
sudo apt-get install -y mongodb-org
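The repository entry added in the second step can be sketched as follows. This is a minimal sketch, assuming the repository URL layout that MongoDB publishes for the 4.0 series on Ubuntu; the variable names are introduced here for illustration, and the line that actually writes the file is left commented out because it needs root access. The key import from the first step is not shown.

```shell
# Assumption: line format of MongoDB's apt repository for the 4.0 series.
codename="bionic"        # switch to "xenial" if bionic has no Release file
mongo_series="4.0"       # hypothetical variable name for the release series
repo_line="deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu ${codename}/mongodb-org/${mongo_series} multiverse"
list_file="/etc/apt/sources.list.d/mongodb-org-${mongo_series}.list"

# Show what would be written, without touching the system.
echo "${repo_line}"
echo "target file: ${list_file}"

# To apply it for real (requires root):
#   echo "${repo_line}" | sudo tee "${list_file}"
```

Keeping the code name in a variable makes the xenial substitution described above a one-word change.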
If you followed the procedure outlined in the previous section, a configuration file /etc/mongod.conf will have been auto-generated by the installation script. By default, data files will be placed in /var/lib/mongodb and log files in /var/log/mongodb/mongod.log:
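For orientation, a configuration of this kind might look like the following. This is a sketch rather than a copy of the file the installer generates: it uses option names from the MongoDB configuration file format together with the two default paths given above, and it writes to a scratch directory instead of /etc so that it can be run without root access.

```shell
# Write a sample configuration to a scratch directory (not to /etc).
conf_dir="$(mktemp -d)"
cat > "${conf_dir}/mongod.conf" <<'EOF'
# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true

# Where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Network interfaces.
net:
  port: 27017
  bindIp: 127.0.0.1
EOF

# Print the two lines that hold the data and log locations.
grep -E 'dbPath|path:' "${conf_dir}/mongod.conf"
```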
You are now able to perform these operations:
|Start, stop, or restart the server||sudo service mongod start (or stop, restart)|
|Get the server status||sudo service mongod status|
|Access MongoDB via the shell (covered later)||mongo --host 127.0.0.1:27017|
Red Hat, Fedora, and CentOS have a relationship similar to that of Debian and Ubuntu. Red Hat (https://www.redhat.com/en) is the original company behind this distribution, producing its first release in 1995. In addition to making improvements in the graphical interface and overall management of Linux, Red Hat is known for its RPM (originally Red Hat Package Manager) technology. In this corner of the Linux world, packages are bundled into files with the extension *.rpm, and contain installation instructions, which makes the installation, updating, and management of Linux software much easier.
Fedora (https://getfedora.org/) is a free open source version of what is now RHEL (Red Hat Enterprise Linux). Fedora and the Fedora Project are sponsored by Red Hat, and serve as a test bed for innovation, which, when stable, is ported to RHEL. Fedora Linux releases tend to have rapid development cycles and short lifespans. CentOS (https://www.centos.org/) is also affiliated with Red Hat, and is allowed direct use of RHEL source code. The main difference is that CentOS is free, but support is only available via the community (which is to say, you are on your own!). For the purposes of this book we will use CentOS version 7.
The MongoDB packages available for RHEL/Fedora/CentOS are exactly the same as those described in preceding sections for Debian/Ubuntu. Also, as described earlier, a composite package called mongodb-org that contains all four packages is available. Because RHEL/Fedora/CentOS packages use RPM for packaging, the tool of choice for installation, updating and management of packages is yum (Yellowdog Updater, Modified).
To install MongoDB on RHEL/Fedora/CentOS Linux distributions, proceed as follows:
- Create a repository file for yum in the /etc/yum.repos.d directory. The filename should follow the pattern mongodb-org-X.Y.repo, where X is the major version number for MongoDB, and Y is the minor release. As an example, for MongoDB version 4.0, the current version as of this writing, the filename would be mongodb-org-4.0.repo.
- Install the composite package using: sudo yum install -y mongodb-org:
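The repository file from the first step might be created as sketched below. The contents assume the repository definition that MongoDB publishes for the 4.0 series on RHEL-compatible systems; the file is written to a scratch directory standing in for /etc/yum.repos.d, and the copy into place is left commented out because it needs root access.

```shell
# Assumption: repository definition in the form published for the 4.0 series.
repo_dir="$(mktemp -d)"             # stand-in for /etc/yum.repos.d
repo_file="${repo_dir}/mongodb-org-4.0.repo"

# Quoted heredoc: $releasever must reach yum unexpanded.
cat > "${repo_file}" <<'EOF'
[mongodb-org-4.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.0.asc
EOF

cat "${repo_file}"

# To apply it for real (requires root):
#   sudo cp "${repo_file}" /etc/yum.repos.d/
```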
If you followed the procedure outlined previously, a configuration file /etc/mongod.conf will have been auto-generated by the installation script. By default, the RPM packages place database files in /var/lib/mongo and log files in /var/log/mongodb/mongod.log. Here is an example of the auto-generated file for MongoDB v4.0 on CentOS 7:
You are now able to perform these operations:
|Start, stop, or restart the server||/bin/systemctl start mongod.service (or stop, restart)|
|Access MongoDB via the shell (covered later)||mongo --host 127.0.0.1:27017|
After starting the service, use the command /bin/systemctl status mongod.service to confirm the status of MongoDB:
The beauty of installing MongoDB directly from its source code is that it lets you run MongoDB on servers for which no pre-built package is available, and gives you access to the very latest version. Minimum requirements for source installation include:
- A modern C++11 compiler
- Python (https://www.python.org/) 2.7.x
- pip (https://pypi.org/project/pip/, tool for installing python packages)
- git (https://git-scm.com/, recommended)
In addition, there are OS-specific requirements, which are detailed in this table:
|Linux||Compiler: GCC 4.8.2 or later|
Red Hat, and suchlike.
Ubuntu, and suchlike.
|macOS||Compiler: Clang 3.4 or Xcode 5|
|Libraries: Xcode (especially command line tools)|
|Windows||Compiler: Visual Studio 2013 Update 4 or later|
The source build process does not follow the traditional sequence of configure, make, and make install. Installation is performed using SCons (Software Construction Tool, https://www.scons.org/), which, in turn, uses the programming language Python. Accordingly, after you clone or download the MongoDB source, you will notice many Python scripts and configuration files.
For the purposes of this illustration, we use CentOS 7. To install MongoDB from source, assuming all prerequisites listed previously are met, proceed as follows:
- Download the source code from github.com. There are two ways to download the MongoDB source code from github.com:
- Download directly from this URL:
You would then need to unzip it into a folder such as /home/user/mongo.
- If you have installed git, you can clone the repository from a command line terminal as follows:
- Change to the newly created (or cloned) mongo directory.
- Install pip requirements:
pip install -r buildscripts/requirements.txt
- Build the source code using SCons:
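The steps above can be sketched as a dry run. The run helper, the DRY_RUN variable, and the choice of the v4.0 branch are assumptions made for this illustration, as is invoking SCons through buildscripts/scons.py with the mongod target. By default the sketch only prints the commands; setting DRY_RUN=0 would execute them, which requires network access and a long build.

```shell
# Dry-run helper: records and prints each command, and executes it only
# when DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
plan=""
run() {
  plan="${plan}$*
"
  echo "+ $*"
  if [ "${DRY_RUN}" = "0" ]; then
    "$@"
  fi
}

# Step 1: clone the source (assumption: the v4.0 release branch).
run git clone --branch v4.0 https://github.com/mongodb/mongo.git
# Step 2: change to the cloned directory.
run cd mongo
# Step 3: install the Python modules the build scripts need.
run pip install -r buildscripts/requirements.txt
# Step 4: build the server with SCons (assumption: invoked through
# buildscripts/scons.py with the mongod target).
run python2 buildscripts/scons.py mongod
```

Printing the plan first is a cheap way to review the commands before committing to a build that can take a long time.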
At this point, you can then follow the same steps listed previously to run MongoDB:
- Create a directory for the database (for example /var/lib/mongo)
- Create a directory for the log (for example /var/log/mongo)
- Create a config file, which indicates the locations of the database and log (for example /etc/mongod.conf)
- Start MongoDB
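These four steps can be sketched in the shell as follows. To keep the example runnable without root access, it creates everything under a scratch prefix (an assumption of this sketch); on a real server you would use the locations named above directly. The start command is only printed, because mongod would otherwise keep running in the foreground.

```shell
# Assumption: a scratch prefix stands in for the filesystem root.
prefix="$(mktemp -d)"
db_dir="${prefix}/var/lib/mongo"
log_dir="${prefix}/var/log/mongo"
conf_file="${prefix}/etc/mongod.conf"

# Steps 1 and 2: directories for the database and the log.
mkdir -p "${db_dir}" "${log_dir}" "${prefix}/etc"

# Step 3: config file pointing at both locations
# (unquoted heredoc, so the shell variables are expanded).
cat > "${conf_file}" <<EOF
storage:
  dbPath: ${db_dir}
systemLog:
  destination: file
  path: ${log_dir}/mongod.log
EOF

# Step 4: the command that would start MongoDB with this configuration.
echo "mongod --config ${conf_file}"
```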
In this chapter you gained a better understanding of what MongoDB is, why we use it, and what its benefits are. You then learned how to install MongoDB on both Windows and Linux. You learned how to install the pre-compiled binary packages, which use the extension *.deb and are designed for the Debian and Ubuntu Linux package managers. In a similar manner, you learned how to install binary packages with the *.rpm extension on Red Hat, Fedora, or CentOS Linux distributions. Finally, you learned how to install MongoDB by directly compiling and installing the source code using SCons.
In the next chapter you will learn about MongoDB data structures, data modeling, and how to create a database, collection, and documents.