2. Documents and Data Types
This chapter introduces you to MongoDB documents, their structure, and data types. For those who are new to the JSON model, this chapter will also serve as a short introduction to JSON. You will identify the basic concepts and data types of JSON documents and compare the document-based storage of MongoDB with the tabular storage of relational databases. You will learn how to represent complex data structures in MongoDB using embedded objects and arrays. By the end of this chapter, you will understand the need for precautionary limits and restrictions on MongoDB documents.
In the previous chapter, we learned how MongoDB, as a NoSQL database, differs from traditional relational databases. We covered the basic features of MongoDB, including its architecture, its different versions, and MongoDB Atlas.
MongoDB is designed for modern-world applications. We live in a world where requirements change rapidly. We want to build lightweight and flexible applications that can quickly adapt to these new requirements and ship them to production as quickly as possible. We want our databases to become agile so that they can adapt to the ever-changing needs of our applications, reduce downtime, scale out easily, and perform efficiently. MongoDB is designed to meet these needs.
Introduction to JSON
The JSON specification became a standard in 2013. If you have been developing applications for a while, you might have seen the transition of applications from XML to JSON. JSON offers a human-readable, plain-text way of representing data. In comparison to XML, where information is wrapped inside tags, and lots of tags make it look bulky, JSON offers a compact and natural format where you can easily focus on the information.
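To make the comparison concrete, here is a minimal sketch that represents the same record in JSON and in XML and parses both with Python's standard library. The record itself (a person named Alice) and the XML tag names are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, written once as JSON and once as XML.
json_text = '{"name": "Alice", "age": 30, "languages": ["English", "Hindi"]}'
xml_text = (
    "<person><name>Alice</name><age>30</age>"
    "<languages><language>English</language><language>Hindi</language></languages>"
    "</person>"
)

# JSON maps directly onto native types: object -> dict, array -> list, number -> int.
person = json.loads(json_text)

# XML yields a tree of text nodes; structure and types must be rebuilt by hand.
root = ET.fromstring(xml_text)
xml_person = {
    "name": root.findtext("name"),
    "age": int(root.findtext("age")),
    "languages": [el.text for el in root.iter("language")],
}

print(person == xml_person)            # both forms carry the same information
print(len(json_text), len(xml_text))   # character counts of the two encodings
```

In this small example the JSON text is shorter because field names appear once rather than as opening and closing tags, and the parsed result needs no manual type conversion.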
To read or write information in JSON or XML format, programming languages use their respective parsers. As XML documents are bound by schema definitions and tag library definitions...
When you work with MongoDB using database clients such as the mongo shell, MongoDB Compass, or the Collections Browser in MongoDB Atlas, you always see the documents in human-readable JSON format. However, internally, MongoDB documents are stored in a binary format called BSON. BSON documents are not human-readable, and you will never have to deal with them directly. Before we explore MongoDB documents in detail, let's have a quick overview of the BSON features that benefit the MongoDB document structure.
BSON was introduced in 2009 by MongoDB. Although it was invented by MongoDB, many other systems also use it as a format for data storage or transportation. The BSON specification is primarily based on JSON and inherits all the good features of JSON, such as the syntax and flexibility. It also provides a few additional features, which are specifically designed to improve storage efficiency and ease of traversal, along with a few data type enhancements to avoid the type...
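To give a feel for what the binary format looks like, the following is a minimal sketch of a BSON encoder written only with Python's standard library. It is an illustration, not the encoder MongoDB or its drivers use: the function names `encode_element` and `encode_document` are hypothetical, and only three value types (string, boolean, and 32-bit integer) are handled.

```python
import struct

def encode_element(name, value):
    """Encode one field as: type byte, field name (C string), then the value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # 0x02 = string; the int32 length prefix counts the trailing null byte.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, bool):  # checked before int, since bool is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        # 0x10 = 32-bit integer; struct raises an error if the value does not fit.
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError(f"type not handled in this sketch: {type(value).__name__}")

def encode_document(doc):
    """Encode a flat dict as: int32 total size, elements, terminating null byte."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    total = 4 + len(body) + 1  # size field + elements + terminator
    return struct.pack("<i", total) + body + b"\x00"

print(encode_document({"hello": "world"}))
```

The length prefixes are what make traversal cheap: a reader can skip over a string or a whole document without scanning its contents. The explicit type byte per field is what lets BSON distinguish, for example, a 32-bit integer from other numeric types, something plain JSON cannot express.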
A MongoDB database is composed of collections and documents. A database can have one or more collections, and each collection can store one or more related BSON documents. In comparison to RDBMS, collections are analogous to tables and documents are analogous to rows within a table. However, documents are much more flexible compared with the rows in a table.
RDBMSes consist of a tabular data model that comprises rows and columns. However, your applications may need to support more complex data structures, such as a nested object or a collection of objects. Tabular databases restrict the storage of such complex data structures. In such cases, you will have to split your data into multiple tables and change the application's object structures accordingly. On the other hand, the document-based data model of MongoDB allows your application to store and retrieve more complex object structures due to the flexible JSON-like format of the documents.
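The difference can be sketched with plain Python data structures. The order below is a made-up example: as a document it stays in one piece, while a tabular model needs it split into an `orders` row and several `order_items` rows linked by a key. The helper `to_rows` and all field names are hypothetical.

```python
# One self-contained document: a nested object and an array of objects.
order = {
    "_id": 1,
    "customer": {"name": "Alice", "city": "Pune"},
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

def to_rows(doc):
    """Split the nested document into flat rows, as a tabular schema would require."""
    order_row = {
        "order_id": doc["_id"],
        "customer_name": doc["customer"]["name"],
        "customer_city": doc["customer"]["city"],
    }
    # Each array element becomes its own row, tied back to the order by order_id.
    item_rows = [
        {"order_id": doc["_id"], "sku": item["sku"], "qty": item["qty"]}
        for item in doc["items"]
    ]
    return order_row, item_rows

order_row, item_rows = to_rows(order)
print(order_row)
print(item_rows)
```

Reading the order back from the tabular form means joining the two sets of rows on `order_id`, whereas the document form can be stored and retrieved as a single unit.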
MongoDB Data Types
You have learned how MongoDB stores JSON-like documents. You have also seen various documents, read the information stored within them, and seen how flexibly they accommodate different data structures, irrespective of the complexity of your data.
In this section, you will learn about the various data types supported by MongoDB's BSON documents. Using the right data types in your documents is very important as correct data types help you use the database features more effectively, avoid data corruption, and improve data usability. MongoDB supports all the data types from JSON and BSON. Let's look at each in detail, with examples.
A string is a basic data type used to represent text-based fields in a document. It is a plain sequence of characters. In MongoDB, the string fields are UTF-8 encoded, and thus they support most international characters. The MongoDB drivers for various programming languages convert the string...
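As a small illustration of what UTF-8 encoding means for a string field, the sketch below encodes a made-up string that contains accented characters. It uses only Python's built-in string handling and does not involve a MongoDB driver.

```python
text = "naïve café"              # hypothetical field value with two non-ASCII characters
encoded = text.encode("utf-8")   # the byte sequence a UTF-8 string field would hold

print(len(text))      # number of characters
print(len(encoded))   # number of bytes; each accented character takes two bytes here

# Decoding the bytes restores the original text without loss.
assert encoded.decode("utf-8") == text
```

The character count and the byte count differ, which is worth remembering whenever a size is measured in bytes rather than characters.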
Limits and Restrictions on Documents
So far, we have discussed the importance and benefits of using documents. Documents play a major role in building efficient applications, and they improve overall data usability. We know how documents offer a flexible way to represent data in its most natural form. They are often self-contained and can hold a complete unit of information. The self-containment comes from nested objects and arrays.
To use any database effectively, it is important to have the correct data structure. The incorrect data structures you build today may result in lots of pain in the future. In the long term, as your application's usage grows, the amount of data also grows, and the problems that seemed very small initially become more evident. Then comes the obvious question: how do you know whether your data structure is correct?
Your application will tell you the answer. If, to access a certain piece of information, your application must execute multiple queries...
Field Name Rules
MongoDB has a few rules about document field names, which are listed as follows:
- The field name cannot contain a null character.
- Only the fields in an array or an embedded document can have a name starting with the dollar sign ($). For the top-level fields, the name cannot start with a dollar ($) sign.
- Documents with duplicate field names are not supported. According to the MongoDB documentation, when a document with duplicate field names is inserted, no error is thrown, but the document is not inserted; even the drivers drop such documents silently. In the mongo shell, however, such a document is inserted, but the resulting document has only the second field. That means the second occurrence of the field overwrites the value of the first.
MongoDB (as of version 4.2.8) does not recommend field names starting with a dollar ($) sign or a dot (.). The MongoDB query language may not work correctly...
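The field name rules above can be checked on the application side before a document is sent to the database. The following is a minimal sketch using Python's standard `json` module; the helper names `reject_duplicates` and `field_name_problems` are hypothetical, and the checks mirror only the rules listed in this section, not MongoDB's own validation logic.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook for json.loads: raise if an object repeats a field name."""
    seen = set()
    for name, _ in pairs:
        if name in seen:
            raise ValueError(f"duplicate field name: {name!r}")
        seen.add(name)
    return dict(pairs)

def field_name_problems(doc, top_level=True):
    """Return a list of rule violations found in a parsed document."""
    problems = []
    for name, value in doc.items():
        if "\x00" in name:
            problems.append(f"{name!r}: contains a null character")
        if top_level and name.startswith("$"):
            problems.append(f"{name!r}: top-level name starts with '$'")
        # Recurse into embedded documents and arrays of embedded documents.
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict):
                problems += field_name_problems(child, top_level=False)
    return problems

# By default, Python's parser keeps the last occurrence of a repeated name.
print(json.loads('{"qty": 1, "qty": 5}'))  # {'qty': 5}

# With the hook installed, the same text is rejected instead.
try:
    json.loads('{"qty": 1, "qty": 5}', object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)

print(field_name_problems({"$price": 10, "item": {"$code": "x"}}))
```

Python's default behavior of keeping the last value resembles the overwrite behavior described above for the mongo shell, which is why duplicates are easy to miss unless they are checked for explicitly.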
In this chapter, we covered the structure of MongoDB documents and the document-based model in detail, which is important groundwork before we dive into more advanced concepts in the upcoming chapters. We began with the transportation and storage of information in the form of JSON-like documents, which provide a flexible and language-independent format. We then looked at JSON documents, their structure, and their basic data types, followed by the BSON specification and the differences between BSON and JSON.
We then covered MongoDB documents, considering their flexibility, self-containment, relatability, and agility, as well as various data types provided by BSON. Finally, we made a note of MongoDB's limitations and restrictions for documents and learned why the limitations are imposed and why they are important.
In the next chapter, we will use the mongo shell and MongoDB Compass to connect to an actual MongoDB server and manage...