
You're reading from Data Ingestion with Python Cookbook

Product type: Book
Published in: May 2023
Publisher: Packt
ISBN-13: 9781837632602
Edition: 1st Edition
Author (1)
Gláucia Esppenchutz

Gláucia Esppenchutz is a data engineer with expertise in managing data pipelines and vast amounts of data using cloud and on-premises technologies. She has worked at companies such as Globo, BMW Group, and Cloudera. Currently, she works at AiFi, specializing in the field of data operations for autonomous systems. She comes from the biomedical field and shifted her career ten years ago to chase the dream of working closely with technology and data. She is in constant contact with the open source community, mentoring people and helping to manage projects, and has collaborated with the Apache, PyLadies, FreeCodeCamp, Udacity, and MentorColor communities.

Configuring Docker for MongoDB

MongoDB is a Not Only SQL (NoSQL) document-oriented database, widely used to store Internet of Things (IoT) data, application logs, and so on. A NoSQL database is a non-relational database that stores unstructured data differently from relational databases such as MySQL or PostgreSQL. Don’t worry too much about this now; we will cover it in more detail in Chapter 5.

In a production environment, MongoDB typically runs as a cluster, which can handle huge amounts of data and provide resilient data storage.

Getting ready

Following the good practice of code organization, let’s start by creating a folder inside our project directory to store the MongoDB Docker image and data, as follows:

my-project$ mkdir mongo-local
my-project$ cd mongo-local

How to do it…

Here are the steps to try out this recipe:

  1. First, we pull the Docker image from Docker Hub as follows:
    my-project/mongo-local$ docker pull mongo

You should see the following message in your command line:

Using default tag: latest
latest: Pulling from library/mongo
(...)
bc8341d9c8d5: Pull complete
(...)
Status: Downloaded newer image for mongo:latest
docker.io/library/mongo:latest

Note

If you are a WSL user, an error might occur if you use the WSL 1 version instead of version 2. You can easily fix this by following the steps here: https://learn.microsoft.com/en-us/windows/wsl/install.

  2. Then, we run the MongoDB server as follows:
    my-project/mongo-local$ docker run \
    --name mongodb-local \
    -p 27017:27017 \
    -e MONGO_INITDB_ROOT_USERNAME="your_username" \
    -e MONGO_INITDB_ROOT_PASSWORD="your_password" \
    -d mongo:latest

We then check our server. To do this, we can use the command line to see which Docker containers are running:

my-project/mongo-local$ docker ps

We then see this on the screen:

Figure 1.5 – MongoDB and Docker running container

We can also check in the Docker Desktop application to see whether our container is running:

Figure 1.6 – The Docker Desktop view of the running MongoDB container
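
Besides docker ps and Docker Desktop, the container can also be checked from Python. The following is a minimal sketch, using only the standard library, that tests whether something accepts TCP connections on the mapped port. It does not verify that the listener is MongoDB or that the credentials work, and the function name is_port_open is our own choice for illustration:

```python
import socket


def is_port_open(host: str = "localhost", port: int = 27017,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (for example, connection refused
        # or a timeout) when nothing is reachable on the given port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the container from step 2 running, calling is_port_open() should return True. After the container is stopped, it should return False, unless another process on the machine is using port 27017.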

  3. Finally, we need to stop our container. To do this, we use the container ID that we saw earlier in the docker ps output (your ID will differ from the one shown here). We will rerun the container in Chapter 5:
    my-project/mongo-local$ docker stop 427cc2e5d40e

How it works…

MongoDB’s architecture uses the concept of distributed processing, where the main node handles clients’ requests, such as queries and document manipulation. It distributes the requests automatically among its shards, each of which holds a subset of a larger data collection.

Figure 1.7 – MongoDB architecture
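
To make the idea of shards more concrete, here is a toy sketch of how documents could be spread across shards by hashing one of their fields. This is only an illustration of the concept, not MongoDB's actual routing logic. The function names, the device_id field, and the choice of three shards are assumptions made up for this example:

```python
import hashlib


def pick_shard(key_value: str, num_shards: int) -> int:
    """Map a key value to a shard index in the range [0, num_shards)."""
    # A stable hash keeps the same key on the same shard between runs.
    digest = hashlib.sha256(key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(documents, key_field: str, num_shards: int):
    """Group documents by the shard that their key field hashes to."""
    shards = {index: [] for index in range(num_shards)}
    for doc in documents:
        shard_index = pick_shard(str(doc[key_field]), num_shards)
        shards[shard_index].append(doc)
    return shards


# Hypothetical IoT readings, used only for illustration.
readings = [{"device_id": f"sensor-{i}", "value": i * 1.5} for i in range(10)]
for index, docs in distribute(readings, "device_id", 3).items():
    print(index, [doc["device_id"] for doc in docs])
```

Each shard ends up holding only part of the collection, and a given device_id always lands on the same shard, so a request for that device only needs to reach one shard.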

Since we may also have other projects or software applications running on our machine, isolating any database or application server used in development is a good practice. In this way, we ensure nothing interferes with our local servers, and debugging becomes more manageable.

This Docker setup creates a MongoDB server locally and even allows us to make additional changes if we want to simulate any other scenario for testing or development.

The options we passed to docker run are as follows:

  • The --name option defines the name we give to our container.
  • The -p option maps a port on our machine to a port inside the container (host:container), so that we can access the server via localhost:27017.
  • The -e option defines an environment variable. In this case, we set the root username and password for our MongoDB container.
  • The -d option enables detached mode: the container runs in the background, and its output is not attached to our terminal. However, we can still use docker ps to check the container status.
  • mongo:latest is the image to run. The latest tag tells Docker to use the most recent version of the image published under that tag.
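
The port mapping and the two environment variables are also what a client needs to connect to the server. As a minimal sketch, assuming the placeholder credentials from step 2, the following helper builds a standard MongoDB connection URI with the Python standard library. The function name build_mongo_uri is our own, and special characters in the credentials are percent-encoded because the URI format requires it:

```python
from urllib.parse import quote_plus


def build_mongo_uri(username: str, password: str,
                    host: str = "localhost", port: int = 27017) -> str:
    """Build a mongodb:// URI, percent-encoding the credentials."""
    # Characters such as '@', ':' or '/' in a username or password would
    # otherwise be confused with the URI's own separators.
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"mongodb://{user}:{secret}@{host}:{port}/"


# Placeholder values from the docker run command; use your own.
print(build_mongo_uri("your_username", "your_password"))
```

The resulting string can then be passed to a MongoDB client library, for example pymongo's MongoClient, assuming that library is installed.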

There’s more…

For frequent users, manually configuring other parameters for the MongoDB container, such as the version, image port, database name, and database credentials, is also possible.
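
As a hedged sketch of what such manual configuration could look like, the helper below assembles the docker run command from step 2 as an argument list, with the image tag, host port, credentials, and an optional initial database name as parameters. The function name and the example values (tag 6.0, host port 27018) are assumptions for illustration. The MONGO_INITDB_DATABASE variable is taken from the image documentation on Docker Hub, so check there for its exact behavior:

```python
import shlex


def build_docker_run(name="mongodb-local", tag="latest", host_port=27017,
                     username="your_username", password="your_password",
                     database=None):
    """Return the docker run command as a list of arguments."""
    args = [
        "docker", "run",
        "--name", name,
        # Host port is configurable; 27017 is the port inside the container.
        "-p", f"{host_port}:27017",
        "-e", f"MONGO_INITDB_ROOT_USERNAME={username}",
        "-e", f"MONGO_INITDB_ROOT_PASSWORD={password}",
    ]
    if database is not None:
        # Optional variable described in the image documentation.
        args += ["-e", f"MONGO_INITDB_DATABASE={database}"]
    args += ["-d", f"mongo:{tag}"]
    return args


# Example values are hypothetical; shlex.join only formats the command.
print(shlex.join(build_docker_run(tag="6.0", host_port=27018)))
```

The list can be printed for review, as shown here, or passed to subprocess.run to start the container from a script.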

A version of this image with example values is also available as a docker-compose file in the official documentation here: https://hub.docker.com/_/mongo.

The docker-compose file for MongoDB looks similar to this:

# Use your own values for username and password
version: '3.1'
services:
  mongo:
    image: mongo
    restart: always
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: example
  mongo-express:
    image: mongo-express
    restart: always
    ports:
      - 8081:8081
    environment:
      ME_CONFIG_MONGODB_ADMINUSERNAME: root
      ME_CONFIG_MONGODB_ADMINPASSWORD: example
      ME_CONFIG_MONGODB_URL: mongodb://root:example@mongo:27017/

See also

You can find the complete documentation for the MongoDB image on Docker Hub here: https://hub.docker.com/_/mongo.
