MongoDB and Ruby were both created in response to technology getting complicated. Both try to keep things simple while taking care of the complicated tasks for you. MongoDB manages "humongous" data and Ruby is fun. Working together, they form a great bond that gives us what most programmers desire: a fun way to build large applications!
Now that your interest has increased, we should first set up our system. In this chapter, we will see how to do the following:
Install Ruby using RVM
Install MongoDB
Configure MongoDB
Set up the initial playground using MongoDB tools
But first, what are the basic system requirements for installing Ruby and MongoDB? Do we need a heavy-duty server? Nah! On the contrary, any standard workstation or laptop will be enough. Ensure that you have at least 1 GB memory and more than 32 GB disk space.
Did you say operating system? Ruby and MongoDB are both cross-platform compliant. This means they can work on any flavor of Linux (such as Ubuntu, Red Hat, Fedora, Gentoo, and SuSE), Mac OS (such as Leopard, Snow Leopard, and Lion) or Windows (such as XP, 2000, and 7).
If you are planning on using Ruby and MongoDB professionally, my personal recommendations for development are Mac OS or Linux. As we want to see detailed instructions, I am going to use examples for Ubuntu or Mac OS (and point out additional instructions for Windows whenever I can). While hosting MongoDB databases, I would personally recommend using Linux.
Note
Although Ruby is cross-platform, most Rubyists tend to shy away from Windows, as Ruby does not always work flawlessly there. There are efforts underway to rectify this.
Let the games begin!
I recommend using RVM (Ruby Version Manager) for installing Ruby. The detailed instructions are available at http://beginrescueend.com/rvm/install/.
Note
Incidentally, RVM was called Ruby Version Manager but its name was changed to reflect how much more it does today!
On Linux or Mac OS you can run this initial command to install RVM as follows:
$ curl -L get.rvm.io | bash -s stable
$ source ~/.rvm/scripts/rvm
After this has been successfully run, you can verify it yourself.
$ rvm list known
If you have successfully installed RVM, this should show you the entire list of Rubies available. You will notice that there are quite a few implementations of Ruby (MRI Ruby, JRuby, Rubinius, REE, and so on). We are going to install MRI Ruby.
Note
MRI Ruby is the "standard" or original Ruby implementation. MRI stands for Matz's Ruby Interpreter.
The following is what you will see if you have successfully executed the previous command:
$ rvm list known
# MRI Rubies
[ruby-]1.8.6[-p420]
[ruby-]1.8.6-head
[ruby-]1.8.7[-p352]
[ruby-]1.8.7-head
[ruby-]1.9.1-p378
[ruby-]1.9.1[-p431]
[ruby-]1.9.1-head
[ruby-]1.9.2-p180
[ruby-]1.9.2[-p290]
[ruby-]1.9.2-head
[ruby-]1.9.3-preview1
[ruby-]1.9.3-rc1
[ruby-]1.9.3[-p0]
[ruby-]1.9.3-head
ruby-head
# GoRuby
goruby
# JRuby
jruby-1.2.0
jruby-1.3.1
jruby-1.4.0
jruby-1.6.1
jruby-1.6.2
jruby-1.6.3
jruby-1.6.4
jruby[-1.6.5]
jruby-head
# Rubinius
rbx-1.0.1
rbx-1.1.1
rbx-1.2.3
rbx-1.2.4
rbx[-head]
rbx-2.0.0pre
# Ruby Enterprise Edition
ree-1.8.6
ree[-1.8.7][-2011.03]
ree-1.8.6-head
ree-1.8.7-head
# Kiji
kiji
# MagLev
maglev[-26852]
maglev-head
# Mac OS X Snow Leopard Only
macruby[-0.10]
macruby-nightly
macruby-head
# IronRuby -- Not implemented yet.
ironruby-0.9.3
ironruby-1.0-rc2
ironruby-head
Isn't that beautiful? So many Rubies and counting!
Note
Fun fact
Ruby is probably the only language that has a plural notation! When we work with multiple versions of Ruby, we collectively refer to them as "Rubies"!
Before we actually install any Rubies, we should configure the RVM packages that are necessary for all the Rubies. These are the standard packages that Ruby can integrate with, and we install them as follows:
$ rvm package install readline
$ rvm package install iconv
$ rvm package install zlib
$ rvm package install openssl
The preceding commands install some useful libraries for all the Rubies that we will install. These libraries make it easier to work with the command line, internationalization, compression, and SSL. You can install these packages even after Ruby installation, but it's just easier to install them first.
$ rvm install 1.9.3
The preceding command will install Ruby 1.9.3 for us. However, while installing Ruby, we also want to pre-configure it with the packages that we have installed. So, here is how we do it, using the following commands:
$ export rvm_path=~/.rvm
$ rvm install 1.9.3 --with-readline-dir=$rvm_path/usr --with-iconv-dir=$rvm_path/usr --with-zlib-dir=$rvm_path/usr --with-openssl-dir=$rvm_path/usr
The preceding commands will miraculously install Ruby 1.9.3 configured with the packages we have installed. We should see something similar to the following on our screen:
$ rvm install 1.9.3
Installing Ruby from source to: /Users/user/.rvm/rubies/ruby-1.9.3-p0, this may take a while depending on your cpu(s)...
ruby-1.9.3-p0 - #fetching
ruby-1.9.3-p0 - #downloading
ruby-1.9.3-p0, this may take a while depending on your connection...
...
ruby-1.9.3-p0 - #extracting
ruby-1.9.3-p0 to /Users/user/.rvm/src/ruby-1.9.3-p0
ruby-1.9.3-p0 - #extracted to /Users/user/.rvm/src/ruby-1.9.3-p0
ruby-1.9.3-p0 - #configuring
ruby-1.9.3-p0 - #compiling
ruby-1.9.3-p0 - #installing
...
Install of ruby-1.9.3-p0 - #complete
Of course, whenever we start our machine, we do want to load RVM, so do add this line in your startup profile script:
$ echo '[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm" # Load RVM function' >> ~/.bash_profile
This will ensure that RVM, and with it Ruby, is loaded when you log in.
Note
$ rvm requirements is a command that tells you which additional packages need to be installed. It gives instructions based on the operating system you are on!
Configuring RVM for a project can be done as follows:
$ rvm --create --rvmrc use 1.9.3@myproject
The previous command allows us to configure a gemset for our project. So, when we move to this project, it has a .rvmrc file that gets loaded and voila: our very own custom workspace!
A gemset, as the name suggests, is a group of gems that are loaded for a particular version of Ruby or a project. As we can have multiple versions of the same gem on a machine, we can configure a gemset for a particular version of Ruby and for a particular version of the gem as well!
$ cd /path/to/myproject
Using ruby 1.9.2 p180 with gemset myproject
RVM does not work on Windows; instead, you can use pik. Detailed instructions to install Ruby are available at http://rubyinstaller.org/. It is a pretty simple one-click installer.
Just like all good things, RVM became quite complex as the community started contributing heavily to it. Some people wanted just a Ruby version manager, so rbenv was born. Both are quite popular but there are quite a few differences between rbenv and RVM.
For starters, rbenv does not need to be loaded into the shell and does not override any shell commands. It's very lightweight and unobtrusive. Install it by cloning the repository into your home directory as .rbenv. It is done as follows:
$ cd
$ git clone git://github.com/sstephenson/rbenv.git .rbenv
Add the $HOME/.rbenv/bin directory to the system path, that is, the $PATH variable, and you're all set.
rbenv works on a very simple concept of shims. Shims are scripts that understand what version of Ruby we are interested in. All the versions of Ruby should be kept in the $HOME/.rbenv/versions directory. Depending on which Ruby version is being used, the shim inserts that particular path at the start of the $PATH variable. This way, that Ruby version is picked up!
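To make the shim idea concrete, here is a minimal Ruby sketch of the path juggling described above. It is not rbenv's actual code: the path_for_version helper and the sample paths are made up for illustration, and the only thing we assume is the $HOME/.rbenv/versions layout mentioned earlier.

```ruby
# Hypothetical helper (not part of rbenv): builds the PATH that would make
# the requested Ruby version win over any other Ruby on the system.
def path_for_version(version, home, current_path)
  version_bin = File.join(home, '.rbenv', 'versions', version, 'bin')
  entries = current_path.split(File::PATH_SEPARATOR)

  # Drop an older copy of the same entry so it is not listed twice.
  entries.delete(version_bin)

  # The chosen version's bin directory goes to the very front.
  ([version_bin] + entries).join(File::PATH_SEPARATOR)
end

sample_path = ['/usr/local/bin', '/usr/bin'].join(File::PATH_SEPARATOR)
puts path_for_version('1.9.3-p0', '/home/user', sample_path)
```

Because the shell searches $PATH from left to right, whichever Ruby sits in the first matching directory is the one that runs.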
This enables us to compile the Ruby source code too (unlike RVM where we have to specify ruby-head).
Note
For more information on rbenv, see https://github.com/sstephenson/rbenv.
MongoDB installers are a bunch of binaries and libraries packaged in an archive. All you need to do is download and extract the archive. Could this be any simpler?
On Mac OS, you have two popular package managers: Homebrew and MacPorts. If you are using Homebrew, just issue the following command:
$ brew install mongodb
If you don't have brew installed, it is strongly recommended to install it. But don't fret. Here is the manual way to install MongoDB on any Linux, Mac OS, or Windows machine:
1. Download MongoDB from http://www.mongodb.org/downloads.
2. Extract the .tgz file to a folder (preferably one that is in your system path).
It's done!
On any Linux shell, you can issue the following commands to download and install. Be sure to append /path/to/MongoDB/bin to your $PATH variable:
$ cd /usr/local/
$ curl http://fastdl.mongodb.org/linux/mongodb-linux-i686-2.0.2.tgz > mongo.tgz
$ tar xf mongo.tgz
$ ln -s mongodb-linux-i686-2.0.2 mongodb
For Windows, you can simply download the ZIP file and extract it in a folder. Ensure that you add /path/to/MongoDB/bin to your system path.
Before we start the MongoDB server, it's necessary to configure the path where we want to store our data, the interface to listen on, and so on. All these configurations are stored in mongod.conf. The default mongod.conf looks like the following code and is stored at the same location where MongoDB is installed (in our case, /usr/local/mongodb):
# Store data in /usr/local/var/mongodb instead of the default /data/db
dbpath = /usr/local/var/mongodb
# Only accept local connections
bind_ip = 127.0.0.1
dbpath is the location where the data will be stored. Traditionally, this used to be /data/db but this has changed to /usr/local/var/mongodb. MongoDB will create this dbpath if you have not created it already.
bind_ip is the interface on which the server will run. Don't mess with this entry unless you know what you are doing!
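If you ever need to read these settings from a script, the key = value format is easy to handle. The following is only a sketch in plain Ruby: parse_mongod_conf is a hypothetical helper, not something MongoDB ships, and it assumes the simple one-setting-per-line layout shown above.

```ruby
# The sample configuration from above, embedded so the sketch is self-contained.
CONFIG_TEXT = <<~CONF
  # Store data in /usr/local/var/mongodb instead of the default /data/db
  dbpath = /usr/local/var/mongodb
  # Only accept local connections
  bind_ip = 127.0.0.1
CONF

# Hypothetical helper: turns "key = value" lines into a Hash,
# skipping blank lines and comments that start with '#'.
def parse_mongod_conf(text)
  text.each_line.with_object({}) do |line, settings|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    key, value = line.split('=', 2).map(&:strip)
    settings[key] = value
  end
end

settings = parse_mongod_conf(CONFIG_TEXT)
puts settings['dbpath']   # /usr/local/var/mongodb
puts settings['bind_ip']  # 127.0.0.1
```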
Note
Write-ahead logging is a technique to ensure durability and atomicity in database systems. Before actually writing to the database, the information (such as redo and undo) is written to a log (called the journal). This ensures that recovering from a crash is credible and fast. We shall learn more about this in the book.
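The write-ahead idea from the note can be shown with a toy model. This is a sketch only: ToyStore is an invented class that keeps everything in memory, it handles redo but not undo, and it has nothing to do with MongoDB's real journal format.

```ruby
# Toy in-memory model of write-ahead logging (illustration only).
class ToyStore
  attr_reader :data, :journal

  def initialize(journal = [])
    @journal = journal
    @data = {}
  end

  # Record the intended change in the journal first, then apply it.
  def write(key, value)
    @journal << [key, value]
    @data[key] = value
  end

  # Rebuild the data by replaying every journal entry in order.
  def recover
    @data = {}
    @journal.each { |key, value| @data[key] = value }
    @data
  end
end

store = ToyStore.new
store.write('name', 'Gautam Rege')
store.write('passion', %w[Ruby MongoDB])

# Simulate a crash: the data is gone, but the journal survived.
survivor = ToyStore.new(store.journal)
survivor.recover
puts survivor.data == store.data  # true
```

The ordering is the whole point: because the journal entry exists before the data changes, a replay after a crash can always bring the data back to the last recorded state.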
We can start the MongoDB server using the following command:
$ sudo mongod --config /usr/local/mongodb/mongod.conf
Remember that if we don't give the --config parameter, the default dbpath will be taken as /data/db.
When you start the server, if all is well, you should see something like the following:
$ sudo mongod --config /usr/local/mongodb/mongod.conf
Sat Sep 10 15:46:31 [initandlisten] MongoDB starting : pid=14914 port=27017 dbpath=/usr/local/var/mongodb 64-bit
Sat Sep 10 15:46:31 [initandlisten] db version v2.0.2, pdfile version 4.5
Sat Sep 10 15:46:31 [initandlisten] git version: c206d77e94bc3b65c76681df5a6b605f68a2de05
Sat Sep 10 15:46:31 [initandlisten] build sys info: Darwin erh2.10gen.cc 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_40
Sat Sep 10 15:46:31 [initandlisten] journal dir=/usr/local/var/mongodb/journal
Sat Sep 10 15:46:31 [initandlisten] recover : no journal files present, no recovery needed
Sat Sep 10 15:46:31 [initandlisten] waiting for connections on port 27017
Sat Sep 10 15:46:31 [websvr] web admin interface listening on port 28017
The preceding process does not terminate as it is running in the foreground! Some explanations are due here:
The server started with pid 14914 on port 27017 (the default port)
The MongoDB version is 2.0.2
The journal path is /usr/local/var/mongodb/journal (it also mentions that there is no current journal file, as this is the first time we are starting this up!)
The web admin interface is on port 28017
Note
The MongoDB server has some pretty interesting command-line options: -v is verbose, -vv is more verbose, and -vvv is even more verbose. Include the v multiple times for more verbosity!
There are plenty of command-line options that allow us to use MongoDB in various ways. For example:
1. --jsonp allows JSONP access.
2. --rest turns on the REST API.
3. Master/slave options, replication options, and even sharding options (we shall see more in Chapter 10, Scaling MongoDB).
To stop the server, press Ctrl+C if the process is running in the foreground. If it's running as a daemon, it has its standard startup script. On Linux flavors such as Ubuntu, you have upstart scripts that start and stop the mongod daemon. On Mac, you have the launchd and launchctl commands that can start and stop the daemon. On other flavors of Linux, you would find more of the resource scripts in the /etc/init.d directory. On Windows, the Services in the Control Panel can control the daemon process.
Along with the MongoDB server binary, there are plenty of other utilities too that help us in administration, monitoring, and management of MongoDB.
Even before we see how to use MongoDB utilities, it's important to know how information is stored. We shall study a lot more of the object model in Chapter 2, Diving Deep into MongoDB.
What is a JavaScript object? Surely you've heard of JavaScript Object Notation (JSON). MongoDB stores information in a similar format called Binary JSON (BSON), which we shall read more about in Chapter 3, The MongoDB Internals. BSON builds on the JSON format and is ideally suited for "Document" storage. Don't worry, more information on this later!
So, if you want to save information, you simply use JSON notation:
{
  name : 'Gautam Rege',
  passion : [ 'Ruby', 'MongoDB' ],
  company : {
    name : "Josh Software Private Limited",
    country : 'India'
  }
}
The previous example shows us how to store information:
String: "" or ''
Integer: 10
Float: 10.1
Array: ['1', 2]
Hash: {a: 1, b: 2}
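Since we will be talking to MongoDB from Ruby, it helps to see how naturally this structure maps to a Ruby Hash. The sketch below uses only Ruby's standard json library to show the shape of the document; the MongoDB Ruby driver does its own BSON conversion, which we are not reproducing here.

```ruby
require 'json'

# The same document as above, written as a plain Ruby Hash.
DOCUMENT = {
  'name'    => 'Gautam Rege',
  'passion' => ['Ruby', 'MongoDB'],           # Array
  'company' => {                              # nested Hash
    'name'    => 'Josh Software Private Limited',
    'country' => 'India'
  }
}

# Serialize to JSON text and read it back again.
json_text = JSON.generate(DOCUMENT)
puts json_text
puts JSON.parse(json_text)['company']['country']  # India
```

Strings, numbers, arrays, and hashes all survive the round trip, which is why documents feel so natural to work with from Ruby.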
The Mongo client utility is used to connect to a MongoDB database. Considering that this is a Ruby and MongoDB book, it is a utility that we shall use rarely (because we shall be accessing the database using Ruby). The Mongo CLI client, however, is indeed useful for testing out basics.
We can connect to MongoDB databases in various ways:
$ mongo book
$ mongo 192.168.1.100/book
$ mongo db.myserver.com/book
$ mongo 192.168.1.100:9999/book
In the preceding cases, we connect to a database called book on localhost, on a remote server, or on a remote server on a different port. When you connect to a database, you should see the following:
$ mongo book
MongoDB shell version: 2.0.2
connecting to: book
>
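The four address forms shown earlier all follow the same host:port/database pattern, with the host and port being optional. Here is a small Ruby sketch that pulls such an address apart. Note that parse_target is a hypothetical helper written for this illustration (it is not how the mongo shell parses its arguments), and the defaults of localhost and port 27017 are assumptions taken from the text and the server log above.

```ruby
DEFAULT_HOST = 'localhost'  # assumed default, as described in the text
DEFAULT_PORT = 27017        # default port reported by the server at startup

# Hypothetical helper: splits "host:port/database" into its parts.
# A bare name such as "book" is treated as a database on the default host.
def parse_target(target)
  return { host: DEFAULT_HOST, port: DEFAULT_PORT, database: target } unless target.include?('/')

  address, database = target.split('/', 2)
  host, port = address.split(':', 2)
  { host: host, port: (port || DEFAULT_PORT).to_i, database: database }
end

['book', '192.168.1.100/book', 'db.myserver.com/book', '192.168.1.100:9999/book'].each do |target|
  puts parse_target(target).inspect
end
```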
To save data, use the JavaScript object and execute the following command:
> db.shelf.save( { name: 'Gautam Rege',
passion : [ 'Ruby', 'MongoDB']
})
>
The previous command saves the data (usually called a "Document") into the collection shelf. We shall talk more about collections and other terminologies in Chapter 3, MongoDB Internals. A collection can vaguely be compared to a table.
We have various ways to retrieve the previously stored information:
Fetch objects from the shelf collection of the book database (the shell displays the first 20 results by default), as follows:
> db.shelf.find()
{ "_id" : ObjectId("4e6bb98a26e77d64db8a3e89"), "name" : "Gautam Rege", "passion" : [ "Ruby", "MongoDB" ] }
>
Find a specific record by the name attribute. This is achieved by executing the following command:
> db.shelf.find( { name : 'Gautam Rege' })
{ "_id" : ObjectId("4e6bb98a26e77d64db8a3e89"), "name" : "Gautam Rege", "passion" : [ "Ruby", "MongoDB" ] }
>
So far so good! But you may be wondering what the big deal is. This is similar to a select query I would have fired anyway. Well, here is where things start getting interesting.
Find records by using regular expressions! This is achieved by executing the following command:
> db.shelf.find( { name : /Rege/ })
{ "_id" : ObjectId("4e6bb98a26e77d64db8a3e89"), "name" : "Gautam Rege", "passion" : [ "Ruby", "MongoDB" ] }
>
> db.shelf.find( { name : /rege/i })
{ "_id" : ObjectId("4e6bb98a26e77d64db8a3e89"), "name" : "Gautam Rege", "passion" : [ "Ruby", "MongoDB" ] }
>
As we can see, it's easy when we have programming constructs mixed with database constructs with a dash of regular expressions.
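Ruby has the same regular expression literals, so the two queries above behave just as a Rubyist would expect. The sketch below mimics them against an in-memory array instead of a real database; the SHELF constant, its second entry, and the find_by_name helper are all invented for this illustration.

```ruby
# An in-memory stand-in for the shelf collection (no database involved).
SHELF = [
  { 'name' => 'Gautam Rege', 'passion' => ['Ruby', 'MongoDB'] },
  { 'name' => 'Some Reader', 'passion' => ['JavaScript'] }  # made-up entry
]

# Hypothetical helper: keeps the documents whose name matches the pattern.
def find_by_name(docs, pattern)
  docs.select { |doc| doc['name'] =~ pattern }
end

puts find_by_name(SHELF, /Rege/).length   # 1
puts find_by_name(SHELF, /rege/).length   # 0, the match is case-sensitive
puts find_by_name(SHELF, /rege/i).length  # 1, the i flag ignores case
```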
Ever wondered how to extract information from MongoDB? It's mongoexport!
What is pretty cool is that the Mongo data transfer protocol is all in JSON/BSON formats. So what, you ask? As JSON is now a universally accepted and common format of data transfer, you can actually export the database, or the collection, directly in JSON format, so even your web browser can process data from MongoDB. No more three-tier applications! The opportunities are infinite!
OK, back to basics. Here is how you can export data from MongoDB:
$ mongoexport -d book -c shelf
connected to: 127.0.0.1
{ "_id" : { "$oid" : "4e6c45b81cb76a67a0363451" }, "name" : "Gautam Rege", "passion" : [ "Ruby", "MongoDB" ]}
exported 1 records
This couldn't be simpler, could it? But wait, there's more. You can export this data into a CSV file too!
$ mongoexport -d book -c shelf -f name,passion --csv -o test.csv
The preceding command saves data in a CSV file. Similarly, you can export data as a JSON array too!
$ mongoexport -d book -c shelf --jsonArray
connected to: 127.0.0.1
[{ "_id" : { "$oid" : "4e6c61a05ff70cac810c6996" }, "name" : "Gautam Rege", "passion" : [ "Ruby", "MongoDB" ] }]
exported 1 records
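The difference between the two JSON outputs is simply one document per line versus a single array holding all the documents. Here is a rough Ruby sketch of both shapes using the standard json library. The DOCS constant and the two helper methods are made up for illustration and do not call mongoexport.

```ruby
require 'json'

# Made-up documents standing in for the contents of a collection.
DOCS = [
  { 'name' => 'Gautam Rege', 'passion' => ['Ruby', 'MongoDB'] },
  { 'name' => 'Some Reader', 'passion' => ['JavaScript'] }
]

# One JSON document per line, like the default export shown above.
def export_lines(docs)
  docs.map { |doc| JSON.generate(doc) }.join("\n")
end

# A single JSON array, like the --jsonArray output.
def export_array(docs)
  JSON.generate(docs)
end

puts export_lines(DOCS)
puts export_array(DOCS)
```

The array form can be handed to any JSON parser in one go, while the line-per-document form is handy for processing one record at a time.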
Wasn't this expected? If there is a mongoexport, there must be a mongoimport!
When you want to import information, you can do so from a JSON array, CSV, TSV, or plain JSON format. Simple and sweet!
Backups are important for any database and MongoDB is no exception. mongodump dumps the entire database or databases in binary JSON format. We can store this and use it later to restore from the backup. This is the closest resemblance to mysqldump!
It is done as follows:
$ mongodump -dconfig
connected to: 127.0.0.1
DATABASE: config to dump/config
config.version to dump/config/version.bson
1 objects
config.system.indexes to dump/config/system.indexes.bson
14 objects
...
config.collections to dump/config/collections.bson
1 objects
config.changelog to dump/config/changelog.bson
10 objects
$
$ ls dump/config/
changelog.bson databases.bson mongos.bson system.indexes.bson
chunks.bson lockpings.bson settings.bson version.bson
collections.bson locks.bson shards.bson
Now that we have backed up the database, in case we need to restore it, it is just a matter of supplying the information to mongorestore, which is done as follows:
$ mongorestore -dbkp1 dump/config/
connected to: 127.0.0.1
dump/config/changelog.bson
going into namespace [bkp1.changelog]
10 objects found
dump/config/chunks.bson
going into namespace [bkp1.chunks]
7 objects found
dump/config/collections.bson
going into namespace [bkp1.collections]
1 objects found
dump/config/databases.bson
going into namespace [bkp1.databases]
15 objects found
dump/config/lockpings.bson
going into namespace [bkp1.lockpings]
5 objects found
...
1 objects found
dump/config/system.indexes.bson
going into namespace [bkp1.system.indexes]
{ key: { _id: 1 }, ns: "bkp1.version", name: "_id_" }
{ key: { _id: 1 }, ns: "bkp1.settings", name: "_id_" }
{ key: { _id: 1 }, ns: "bkp1.chunks", name: "_id_" }
{ key: { ns: 1, min: 1 }, unique: true, ns: "bkp1.chunks", name: "ns_1_min_1" }
...
{ key: { _id: 1 }, ns: "bkp1.databases", name: "_id_" }
{ key: { _id: 1 }, ns: "bkp1.collections", name: "_id_" }
14 objects found
The database should be able to store a large amount of data. However, the maximum size of a single stored object is 4 MB (and from v1.7 onwards, 16 MB). So, can we store videos and other large documents in MongoDB? That is where the mongofiles utility helps.
MongoDB uses GridFS specification for storing large files. Language bindings are available to store large files. GridFS splits larger files into chunks and maintains all the metadata in the collection. It's interesting to note that GridFS is just a specification, not a mandate and all MongoDB drivers adhere to this voluntarily.
To manage large files directly in a database, we use the mongofiles utility.
$ mongofiles -d book -c shelf put /home/gautam/Relax.mov
connected to: 127.0.0.1
added file: { _id: ObjectId('4e6c6f9cc7bd0bf42f31aa3b'), filename: "/Users/gautam/Relax.mov", chunkSize: 262144, uploadDate: new Date(1315729317190), md5: "43883ace6022c8c6682881b55e26e745", length: 49120795 }
done!
Notice that about 47 MB of data was saved in the database. I wouldn't want to leave you in the dark, so here goes a little bit of explanation. GridFS uses an fs namespace that holds two collections called chunks and files. You can retrieve this information from MongoDB from the command line or using the Mongo CLI.
$ mongofiles -d book list
connected to: 127.0.0.1
/Users/gautam/Downloads/Relax.mov 49120795
Let's use Mongo CLI to fetch this information now. This can be done as follows:
$ mongo
MongoDB shell version: 1.8.3
connecting to: test
> use book
switched to db book
> db.fs.chunks.count()
188
> db.fs.files.count()
1
> db.fs.files.findOne()
{
"_id" : ObjectId("4e6c6f9cc7bd0bf42f31aa3b"),
"filename" : "/Users/gautam/Downloads/Relax.mov",
"chunkSize" : 262144,
"uploadDate" : ISODate("2011-09-11T08:21:57.190Z"),
"md5" : "43883ace6022c8c6682881b55e26e745",
"length" : 49120795
}
>
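The chunk count of 188 is not magic: it follows directly from the length and chunkSize values reported above. Here is the arithmetic as a short Ruby sketch (chunk_count is just a name chosen for this illustration):

```ruby
# Number of chunks needed to hold a file: a ceiling division done with
# integer arithmetic, so a partly filled last chunk still counts as one.
def chunk_count(length, chunk_size)
  (length + chunk_size - 1) / chunk_size
end

FILE_LENGTH = 49_120_795  # "length" from the fs.files document
CHUNK_SIZE  = 262_144     # "chunkSize" from the fs.files document (256 KB)

puts chunk_count(FILE_LENGTH, CHUNK_SIZE)            # 188
puts FILE_LENGTH % CHUNK_SIZE                        # 99867 bytes in the last chunk
puts((FILE_LENGTH / (1024.0 * 1024)).round(1))       # 46.8 (MB)
```

So the file is stored as 187 full chunks plus one smaller final chunk, which matches the count returned by db.fs.chunks.count().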
bsondump is a utility that helps analyze BSON dumps. For example, if you want to filter all the objects from a BSON dump of the book database, you could run the following command:
$ bsondump --filter "{name:/Rege/}" dump/book/shelf.bson
This command would analyze the entire dump and get all the objects where name matches the specified pattern! The other very nice feature of bsondump is that if we have a corrupted dump during any restore, we can use the --objcheck flag to ignore all the corrupt objects.
Considering that we aim to do web development with Ruby and MongoDB, Rails or Sinatra cannot be far behind.
Note
Rails 3 packs a punch. Sinatra was born because Rails 2.x was a really heavy framework. However, Rails 3 has Metal that can be configured with only what we need in our application framework. So Rails 3 can be as lightweight as Sinatra and also get the best of the support libraries. So Rails 3 it is, if I have to choose between Ruby web frameworks!
Installing Rails 3 or Sinatra is as simple as one command, as follows:
$ gem install rails
$ gem install sinatra
What we have learned so far is about getting comfortable with Ruby and MongoDB. We installed Ruby using RVM, learned a little about rbenv and then installed MongoDB. We saw how to configure MongoDB, start it, stop it, and finally we played around with the various MongoDB utilities to dump information, restore it, save large files and even export to CSV or JSON.
In the next chapter, we shall dive deep into MongoDB. We shall learn how to work with documents, save them, fetch them, and search for them, all using the mongo utility. We shall also see a comparison with SQL databases.