You're reading from Building Enterprise JavaScript Applications

Product type: Book
Published in: Sep 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788477321
Edition: 1st Edition
Author: Daniel Li

Daniel Li is a full-stack JavaScript developer at Nexmo. Previously, he was also the Managing Director of Brew, a digital agency in Hong Kong that specializes in MeteorJS. A proponent of knowledge-sharing and open source, Daniel has written over 100 blog posts and in-depth tutorials, helping hundreds of thousands of readers navigate the world of JavaScript and the web.

Chapter 6. Storing Data in Elasticsearch

In the previous chapter, we developed the bulk of our Create User feature by following a TDD process and writing all our E2E test cases first. The last piece of the puzzle is to actually persist the user data into a database.

In this chapter, we will install and run Elasticsearch on our local development machine, and use it as our database. Then, we will implement our last remaining step definition, using it to drive the development of our application code. Specifically, we will cover the following:

  • Installing Java and Elasticsearch
  • Understanding Elasticsearch concepts, such as indices, types, and documents
  • Using the Elasticsearch JavaScript client to complete our create user endpoint
  • Writing a Bash script to run our E2E tests with a single command

Introduction to Elasticsearch


So, what is Elasticsearch? First and foremost, Elasticsearch should not be viewed as a single, one-dimensional tool. Rather, it's a suite of tools that consists of a distributed database, a full-text search engine, and also an analytics engine. We will focus on the "database" part in this chapter, dealing with the "distributed" and "full-text search" parts later.

At its core, Elasticsearch is a high-level abstraction layer for Apache Lucene, a full-text search engine. Lucene is arguably the most powerful full-text search engine around; it is used by Apache Solr, another search platform similar to Elasticsearch. However, Lucene is very complex and the barrier to entry is high; thus Elasticsearch abstracts that complexity away into a RESTful API.

Instead of using Java to interact with Lucene directly, we can send HTTP requests to the API. Furthermore, Elasticsearch provides many language-specific clients that abstract the API further into nicely-packaged...

Installing Java and Elasticsearch


First, let's install Elasticsearch and its dependencies. Apache Lucene and Elasticsearch are both written in Java, and so we must first install Java.

Installing Java

Installing Java usually means one of two things: installing the Java Runtime Environment (JRE) or the Java Development Kit (JDK). The JRE provides the runtime that allows you to run Java programs, whereas the JDK contains the JRE as well as other tools that allow you to develop in Java.

We are going to install the JDK here, but to complicate things further, there are different implementations of the JDK—OpenJDK, Oracle Java, IBM Java—and the one we will be using is the default-jdk APT package, which comes with our Ubuntu installation:

$ sudo apt update
$ sudo apt install default-jdk

Next, we need to set a system-wide environment variable so that other programs using Java (for example, Elasticsearch) know where to find it. Run the following command to get a list of Java installations...

Understanding key concepts in Elasticsearch


We will be sending queries to Elasticsearch very shortly, but it helps if we understand a few basic concepts.

Elasticsearch is a JSON document store

As you might have noticed from the response body of our API call, Elasticsearch stores data in JavaScript Object Notation (JSON) format. This allows developers to store objects with more complex (often nested) structures when compared to relational databases that impose a flat structure with rows and tables.
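
To make the contrast concrete, here is a small sketch. The field names (email, profile, and so on) and the flatten helper are invented for illustration and do not come from the chapter. It shows a nested user document, and how many separate dotted "columns" a flat representation of the same data would need:

```javascript
// A hypothetical user document with nested structure (field names are illustrative only).
const userDocument = {
  email: 'e@ma.il',
  profile: {
    name: { first: 'Jane', last: 'Doe' },
    interests: ['cycling', 'chess'],
  },
};

// Hypothetical helper: flattens nested objects into dotted key paths,
// roughly the column names a flat row would need.
// Arrays are kept as values because they have no fixed column count.
function flatten(value, prefix = '') {
  return Object.entries(value).reduce((row, [key, child]) => {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      child !== null && typeof child === 'object' && !Array.isArray(child);
    return isPlainObject
      ? { ...row, ...flatten(child, path) }
      : { ...row, [path]: child };
  }, {});
}

const flatRow = flatten(userDocument);
// flatRow has the keys: email, profile.name.first, profile.name.last, profile.interests
```

A document store can accept userDocument as it is, whereas the flat form has to decide up front what to do with values such as the interests array.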

That's not to say document databases are better than relational databases, or vice versa; they are different and their suitability depends on their use.

Document vs. relationship data storage

For example, your application may be a school directory, storing information about schools, users (including teachers, staff, parents, and students), exams, classrooms, classes, and their relations with each other. Given that the data structure can be kept relatively flat (that is, mostly simple key-value entries...

Querying Elasticsearch from E2E tests


We now know enough about Elasticsearch to implement our last undefined step definition, which reads from the database to see if our user document has been indexed correctly. We will be using the JavaScript client, which is merely a wrapper around the REST API, with a one-to-one mapping to its endpoints. So first, let's install it:

$ yarn add elasticsearch

Next, import the package into our spec/cucumber/steps/index.js file and create an instance of elasticsearch.Client:

import elasticsearch from 'elasticsearch';

const client = new elasticsearch.Client({
  host: `${process.env.ELASTICSEARCH_PROTOCOL}://${process.env.ELASTICSEARCH_HOSTNAME}:${process.env.ELASTICSEARCH_PORT}`,
});

By default, Elasticsearch runs on port 9200. However, to avoid hard-coded values, we have explicitly passed in an options object, specifying the host option, which takes its value from the environment variables. To make this work, add these environment variables to our .env and .env.example files:

ELASTICSEARCH_PROTOCOL...
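
One side effect of building the host from environment variables is that a missing variable silently produces a string such as undefined://undefined:undefined. As a minimal sketch of one way to guard against that, the hypothetical buildElasticsearchHost helper below (not part of the chapter's code) fails fast instead. The variable names are the ones used above; everything else is an assumption:

```javascript
// The three variables the host template string depends on.
const REQUIRED_VARIABLES = [
  'ELASTICSEARCH_PROTOCOL',
  'ELASTICSEARCH_HOSTNAME',
  'ELASTICSEARCH_PORT',
];

// Hypothetical helper: builds the `host` option, or throws if a variable is unset.
function buildElasticsearchHost(env) {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return `${env.ELASTICSEARCH_PROTOCOL}://${env.ELASTICSEARCH_HOSTNAME}:${env.ELASTICSEARCH_PORT}`;
}

// Possible use when creating the client:
// const client = new elasticsearch.Client({ host: buildElasticsearchHost(process.env) });
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the helper easy to exercise from a test.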

Indexing documents to Elasticsearch


In src/index.js, import the Elasticsearch library and instantiate a client as we did before; then, in the request handler for POST /users, use the Elasticsearch JavaScript client's index method to add the payload object to the Elasticsearch index:

import elasticsearch from 'elasticsearch';
const client = new elasticsearch.Client({
  host: `${process.env.ELASTICSEARCH_PROTOCOL}://${process.env.ELASTICSEARCH_HOSTNAME}:${process.env.ELASTICSEARCH_PORT}`,
});
...

app.post('/users', (req, res, next) => {
  ...
  client.index({
    index: 'hobnob',
    type: 'user',
    body: req.body
  })
});

The index method returns a promise, which should resolve to something similar to this:

{ _index: 'hobnob',
  _type: 'user',
  _id: 'AV7HyAlRmIBlG9P7rgWY',
  _version: 1,
  result: 'created',
  _shards: { total: 2, successful: 1, failed: 0 },
  created: true }

The only useful and relevant piece of information we can return to the client is the newly auto-generated _id field...
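
As a rough sketch of how that could look, the hypothetical respondWithId helper below sends only the _id back, as a plain string. It assumes an Express-style res object; the 201 status code, the Content-Type header, and the error forwarding in the usage comment are assumptions rather than the chapter's actual implementation:

```javascript
// Hypothetical helper: replies with the auto-generated _id as a plain-text payload.
// The 201 status code and the text/plain content type are assumptions.
function respondWithId(res, indexResult) {
  res.status(201);
  res.set('Content-Type', 'text/plain');
  res.send(indexResult._id);
}

// Possible use inside the POST /users handler:
// client.index({ index: 'hobnob', type: 'user', body: req.body })
//   .then((result) => respondWithId(res, result))
//   .catch(next); // forwarding errors to the framework is also an assumption
```

Keeping the response logic in its own function means it can be checked with a stub res object, without a running Elasticsearch instance.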

Cleaning up after our tests


When we run our tests, they index user documents into our local development database. Over many runs, the database fills up with test user documents. Ideally, we want all our tests to be self-contained: after each test run, the database should be back in the state it was in before the run. To achieve this, we must make two further changes to our test code:

  • Delete the test user after we have made the necessary assertions
  • Run the tests on a test database; in the case of Elasticsearch, we can simply use a different index for our tests
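
The second change can be sketched as follows. The ELASTICSEARCH_INDEX variable name and the resolveIndexName helper are hypothetical and used only for illustration; the idea is simply that the index name is read from the environment, so a test run can point at a different index than development does:

```javascript
// Hypothetical helper: picks the index name from the environment.
// ELASTICSEARCH_INDEX is an assumed variable name, not one taken from the text;
// 'hobnob' is the index used in the examples so far.
function resolveIndexName(env) {
  return env.ELASTICSEARCH_INDEX || 'hobnob';
}

// Possible use when indexing:
// client.index({ index: resolveIndexName(process.env), type: 'user', body: req.body });
```

With this in place, a test run might set ELASTICSEARCH_INDEX to a value such as test, leaving the development index untouched.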

Deleting our test user

First, add a new entry to the list of features in the Cucumber specification:

...
And the payload of the response should be a string
And the payload object should be added to the database, grouped under the "user" type
And the newly-created user should be deleted

Next, define the corresponding step definition for this step. But first, we are going to modify the step...
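
For orientation, the core of such a deletion step might look like the sketch below. It is not the chapter's implementation: deleteTestUser is a hypothetical helper, it assumes the client's delete method accepts index, type, and id, and it assumes the response reports result: 'deleted' on success:

```javascript
// Hypothetical helper: deletes one document and checks that Elasticsearch
// reports it as deleted. `client` is expected to expose a delete() method
// like the Elasticsearch JavaScript client's.
function deleteTestUser(client, index, type, id) {
  return client.delete({ index, type, id }).then((response) => {
    // The 'deleted' result value is an assumption about the response shape.
    if (response.result !== 'deleted') {
      throw new Error(`Expected result "deleted", got "${response.result}"`);
    }
    return response;
  });
}

// Possible use in a step definition (this.type and this.responsePayload are
// hypothetical properties holding the document type and the returned ID):
// Then('the newly-created user should be deleted', function () {
//   return deleteTestUser(client, 'hobnob', this.type, this.responsePayload);
// });
```

Returning the promise lets the test runner wait for the deletion to finish before moving on to the next step.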

Summary


In this chapter, we continued our work on the Create User endpoint. Specifically, we implemented the success scenario by persisting data into Elasticsearch. Then, we refactored our testing workflow by creating a Bash script that automatically loads up all dependencies before running our tests.

In the next chapter, we will refactor our code further, by breaking it down into smaller units, and covering them with unit and integration tests, written using Mocha, Chai, and Sinon. We will also continue to implement the rest of the endpoints, making sure we follow good API design principles.
