You're reading from  Seven NoSQL Databases in a Week

Product type: Book
Published in: Mar 2018
Publisher: Packt
ISBN-13: 9781787288867
Edition: 1st Edition
Authors (2):
Sudarshan Kadambi

Sudarshan has a background in Distributed systems and Database design. He has been a user and contributor to various NoSQL databases and is passionate about solving large-scale data management challenges.

Xun (Brian) Wu

Xun (Brian) Wu is a senior blockchain architect and consultant. With over 20 years of hands-on experience across various technologies, including Blockchain, big data, cloud, AI, systems, and infrastructure, Brian has worked on more than 50 projects in his career. He has authored nine books, which have been published by O'Reilly, Packt, and Apress, focusing on popular fields within the Blockchain industry. The titles of his books include: Learn Ethereum (First Edition), Learn Ethereum (Second Edition), Blockchain for Teens, Hands-On Smart Contract Development with Hyperledger Fabric V2, Hyperledger Cookbook, Blockchain Quick Start Guide, Security Tokens and Stablecoins Quick Start Guide, Blockchain by Example, and Seven NoSQL Databases in a Week.

Chapter 8. InfluxDB

The term big data is everywhere these days; it has entered the mainstream and is merging with traditional analytics. More electronic devices than ever before are connected to the internet: phones, watches, sensors, cars, TVs, and so on. These devices generate enormous amounts of new, unstructured, real-time data every minute. Analyzing time-structured data has become an important problem across many industries. Many companies are looking for new ways to solve their time-series data problems and to make use of this influx of data. As a result, the popularity of time-series databases has increased rapidly over the past few years. InfluxDB is one of the most popular time-series databases.

In this chapter, we will cover the following topics:

  • What is InfluxDB?
  • Installation and configuration
  • Query language and API
  • InfluxDB ecosystem
  • InfluxDB operations

Introduction to InfluxDB


InfluxDB is developed by InfluxData. It is an open source NoSQL database for big data that is designed for massive scalability, high availability, and fast writes and reads. As a NoSQL database, InfluxDB stores time-series data, that is, series of data points recorded over time. These data points can be regular or irregular, depending on the data source. Regular measurements are taken at a fixed time interval, for example, system heartbeat monitoring data. Irregular measurements are driven by discrete events, for example, trading transaction data, sensor data, and so on.
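The difference between regular and irregular series can be illustrated with a minimal Python sketch. It does not use InfluxDB; the `is_regular` helper and the sample `heartbeat` and `trades` series are hypothetical names and values chosen only for illustration.

```python
from typing import List, Tuple

# A data point is a (timestamp in seconds, value) pair; this layout is an
# assumption made for the sketch, not InfluxDB's storage format.
Point = Tuple[int, float]


def is_regular(points: List[Point]) -> bool:
    """Return True when consecutive timestamps are evenly spaced."""
    times = sorted(t for t, _ in points)
    gaps = {later - earlier for earlier, later in zip(times, times[1:])}
    return len(gaps) <= 1


# Heartbeat data arrives at a fixed 10-second interval (regular).
heartbeat = [(0, 1.0), (10, 1.0), (20, 1.0), (30, 1.0)]
# Trades arrive whenever an event happens (irregular).
trades = [(3, 101.5), (4, 101.7), (19, 101.2), (47, 102.0)]

print(is_regular(heartbeat))
print(is_regular(trades))
```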

InfluxDB is written in Go, which makes it easy to compile and deploy without external dependencies. It offers an SQL-like query language. Its plugin architecture makes it flexible to integrate with third-party products.

Like other NoSQL databases, it provides client libraries for languages such as Go, Java, Python, and Node.js to interact with the database. The convenient native HTTP API can...

Installation and configuration


In this section, we will discuss how to install InfluxDB and set up its configuration.

Installing InfluxDB

The official installation guide for InfluxDB can be found here: https://docs.influxdata.com/influxdb/v1.5/introduction/installation/.

Ubuntu is built from the Debian distribution. In this chapter, we use Ubuntu as the lab environment to run InfluxDB. Here is a link to install Ubuntu in a VirtualBox: https://askubuntu.com/questions/142549/how-to-install-ubuntu-on-virtualbox.

Once Ubuntu is installed in your VM, we can install InfluxDB on it with the apt-get package manager. Enter the following five commands:

curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt-get update && sudo apt-get install influxdb
sudo systemctl start influxdb...

Query language and API


In this section, we will discuss the InfluxDB query language and how to use the InfluxDB API.

Query language

InfluxDB provides an SQL-like query language for querying time-series data. It also exposes HTTP APIs for writing data and performing administrative tasks.
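As a rough illustration of the HTTP API, the following Python sketch builds requests for the InfluxDB 1.x `/query` and `/write` endpoints without sending them. The database name `mydb`, the `cpu` measurement, and the helper function names are assumptions made for the example. The line-protocol helper handles only float fields and does not escape special characters.

```python
from typing import Dict
from urllib.parse import urlencode

# Default address reported by the influx CLI; adjust for your setup.
BASE_URL = "http://localhost:8086"


def build_query_url(base_url: str, database: str, query: str) -> str:
    """Build the URL for a query sent to the /query endpoint."""
    return f"{base_url}/query?{urlencode({'db': database, 'q': query})}"


def build_write_url(base_url: str, database: str) -> str:
    """Build the URL that line-protocol data is POSTed to."""
    return f"{base_url}/write?{urlencode({'db': database})}"


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, float], timestamp_ns: int) -> str:
    """Format one point as 'measurement,tags fields timestamp'.

    Simplification: float fields only, no escaping of spaces or commas.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={value}" for key, value in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# 'mydb', 'cpu', 'host', and 'usage' are hypothetical names.
query_url = build_query_url(BASE_URL, "mydb", "SHOW DATABASES")
write_url = build_write_url(BASE_URL, "mydb")
line = to_line_protocol("cpu", {"host": "server01"}, {"usage": 0.64},
                        1511815800000000000)
print(query_url)
print(write_url)
print(line)
```

The resulting `line` string would be sent as the body of a POST request to `write_url`, for example with `urllib.request` or an InfluxDB client library.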

Let's use the InfluxDB CLI tool to connect to an InfluxDB instance and run some queries.

Start and connect to the InfluxDB instance by typing the following commands:

sudo service influxdb start
influx -precision rfc3339

By default, InfluxDB shows the time as a nanosecond UTC epoch value, which is a very long number, such as 1511815800000000000. The -precision rfc3339 argument displays the time field in a human-readable format, YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ:

Connected to http://localhost:8086 version 1.5
InfluxDB shell version: 1.5
>
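To make the relationship between the nanosecond value and the RFC 3339 display format concrete, here is a minimal Python sketch of the conversion. The `ns_to_rfc3339` helper is a hypothetical name, and dropping the fractional part for whole seconds is an assumption of this sketch rather than a statement about the CLI's exact output.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def ns_to_rfc3339(timestamp_ns: int) -> str:
    """Convert a nanosecond UTC epoch timestamp to an RFC 3339 string."""
    seconds, nanos = divmod(timestamp_ns, NANOS_PER_SECOND)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%S")
    # Keep all nine fractional digits only when there is a sub-second part.
    return f"{base}.{nanos:09d}Z" if nanos else f"{base}Z"


print(ns_to_rfc3339(1511815800000000000))  # 2017-11-27T20:50:00Z
```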

We can check the available databases by using the show databases statement:

> show databases;
name: databases
name
----
_internal

To switch to an existing database, you can type...

InfluxDB ecosystem


InfluxDB is a NoSQL database. In many real-world projects, you typically need to develop data collection applications that collect and send data to a processing engine, which then processes the collected metrics and saves them in the database. Fortunately, InfluxDB provides an ecosystem that makes this development much easier. Among the typical InfluxDB ecosystem components, Telegraf is the agent that collects and sends data, Kapacitor is a real-time streaming data processing engine, and Chronograf is a dashboard tool for visualizing time-series data. In this section, we will discuss Telegraf and Kapacitor.

Telegraf

Telegraf is a plugin-driven agent for collecting, processing, aggregating, reporting, and writing metrics. It has more than 100 plugins. It is written in Go and compiled as a standalone binary with no external dependencies. Plugin development is easy, and you can write your own plugins. This plugin-driven architecture can easily fit into your application...
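The idea of a plugin-driven agent can be sketched in a few lines of Python. This is only a conceptual illustration of chaining input, processor, and output plugins. It is not Telegraf's plugin API (Telegraf plugins are written in Go), and every name in it is hypothetical.

```python
from typing import Callable, Dict, List

Metric = Dict[str, object]
InputPlugin = Callable[[], List[Metric]]
ProcessorPlugin = Callable[[Metric], Metric]
OutputPlugin = Callable[[Metric], None]


def cpu_input() -> List[Metric]:
    """Hypothetical input plugin that returns one fixed sample reading."""
    return [{"name": "cpu", "host": "server01", "usage": 0.64}]


def add_region(metric: Metric) -> Metric:
    """Hypothetical processor plugin that tags each metric with a region."""
    return {**metric, "region": "us-west"}


def run_once(inputs: List[InputPlugin], processors: List[ProcessorPlugin],
             outputs: List[OutputPlugin]) -> None:
    """Collect from every input, apply each processor, then write to every output."""
    for collect in inputs:
        for metric in collect():
            for process in processors:
                metric = process(metric)
            for write in outputs:
                write(metric)


# An in-memory list stands in for an output plugin such as a database writer.
collected: List[Metric] = []
run_once([cpu_input], [add_region], [collected.append])
print(collected)
```

Because each stage is just a callable, adding a new source or destination means appending one more function to the relevant list, which is the appeal of a plugin-driven design.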

InfluxDB operations


In this section, we will discuss some InfluxDB operations, such as how to back up and restore data, what a retention policy (RP) is, how to monitor InfluxDB, clustering, and high availability (HA).

Backup and restore

It is critical to back up your data so that you can recover it when a problem occurs, such as a system crash or a hardware failure. InfluxDB provides a variety of backup and restore strategies.

Backups

Backups are a must for every production database. There are two types of backups in InfluxDB: metastore and database.

The metastore contains system information. You can back up a metastore instance by running the following command:

influxd backup <path-to-backup>

When backing up databases, each database needs to be backed up separately by running the following command:

influxd backup -database <mydatabase> <path-to-backup>

You can also specify the retention, shard, and since arguments as follows:

-retention <retention policy name> -shard <shard ID> -since <date>
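For scripted backups, the command-line arguments described above can be assembled programmatically. The following Python sketch only builds the argument list for `influxd backup` from the flags shown in this section and does not run anything. The helper name, the `/tmp/backup` path, and the `mydb` database are hypothetical.

```python
from typing import List, Optional


def build_backup_command(path: str,
                         database: Optional[str] = None,
                         retention: Optional[str] = None,
                         shard: Optional[int] = None,
                         since: Optional[str] = None) -> List[str]:
    """Assemble an 'influxd backup' argument list.

    With no database given, the command backs up only the metastore.
    """
    command = ["influxd", "backup"]
    if database is not None:
        command += ["-database", database]
    if retention is not None:
        command += ["-retention", retention]
    if shard is not None:
        command += ["-shard", str(shard)]
    if since is not None:
        command += ["-since", since]
    command.append(path)
    return command


# Hypothetical path and database name, for illustration only.
metastore_cmd = build_backup_command("/tmp/backup")
database_cmd = build_backup_command("/tmp/backup", database="mydb",
                                    since="2018-01-01T00:00:00Z")
print(" ".join(metastore_cmd))
print(" ".join(database_cmd))
```

The returned list could be passed to `subprocess.run(command, check=True)` on a host where `influxd` is installed.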

If we change the <path...

Summary


In this chapter, we introduced the concept of InfluxDB and showed how to install it and set up its configuration.

We also learned about the InfluxDB query language, HTTP API, and client API. You saw how to use Kapacitor and Telegraf to monitor system logs with InfluxDB. Finally, we discussed InfluxDB operations.

InfluxDB is an excellent choice for time-series data. It provides efficient data collection with query flexibility. Support for multiple languages makes it easy to integrate with many enterprise applications. With the TSM data storage engine, InfluxDB provides high-throughput batch read and write performance. Plugins are continually being added to the ecosystem components, which makes InfluxDB easy to use in many real-world projects and contributes to its popularity.
