You're reading from Advanced Elasticsearch 7.0

Product type: Book
Published in: Aug 2019
Reading Level: Beginner
Publisher: Packt
ISBN-13: 9781789957754
Edition: 1st Edition
Author: Wai Tak Wong

Wai Tak Wong is a faculty member in the Department of Computer Science at Kean University, NJ, USA. He has more than 15 years of professional experience in cloud software design and development. He obtained his PhD in computer science at NJIT, NJ, USA. Wai Tak has served as an associate professor in the Information Management Department of Chung Hua University, Taiwan. A co-founder of Shanghai Shellshellfish Information Technology, Wai Tak acted as the Chief Scientist of the R&D team, and he has published more than a dozen algorithms in prestigious journals and conferences. Wai Tak began his search and analytics technology career with Elasticsearch in the real estate market and later applied this to data management and FinTech data services.

Using Elasticsearch for Exploratory Data Analysis

In the previous chapter, we learned how to preprocess documents by using ingest pipeline processors before indexing operations. We looked at all the Ingest APIs and learned how to use the processors. We also discussed conditional execution and error handling in depth.

In this chapter, we'll use a powerful tool, the Aggregation Framework, to perform data analysis. According to the definition from the Information Technology Laboratory (ITL) at the National Institute of Standards and Technology (NIST) (https://www.itl.nist.gov/div898/handbook/eda/section1/eda11.htm), Exploratory Data Analysis (EDA) is an approach to data analysis that allows the data to reveal its underlying structure and model. We'll use a few examples to illustrate EDA.

By the end of this chapter, we will...

Business analytics

The general concept of business analytics is to measure past business performance by using a combination of skills, methods, and techniques to gain insight into decisions when planning for the future of the business. Elasticsearch can provide data-driven insights to help solve problems and improve efficiency. As an example, let's investigate closing-price changes grouped by the Morningstar category of commission-free ETFs. Recall that the documents in the cf_etf_hist_price index, introduced in Chapter 8, Aggregations Framework, have only a symbol field. There is no way to group the documents into such a category using only the cf_etf_hist_price index unless we manually attach the Morningstar_category field during the indexing operation. Of course, this can be solved programmatically. However, we will solve it by using only Elasticsearch with the...
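For comparison, the programmatic route mentioned above could look roughly like the following Python sketch. It joins price documents to a category lookup by symbol and averages the day-over-day closing-price change per category. The symbols, category names, prices, and the date and close field names are all made-up assumptions for illustration; only the symbol field and the idea of a Morningstar category come from the text.

```python
from collections import defaultdict

# Hypothetical lookup from ETF symbol to its Morningstar category.
CATEGORY_BY_SYMBOL = {
    "AAA": "Category A",
    "BBB": "Category A",
    "CCC": "Category B",
}


def average_change_by_category(price_docs, category_by_symbol):
    """Average day-over-day closing-price change per category.

    price_docs: iterable of dicts with 'symbol', 'date' (ISO string)
    and 'close' keys. These field names are assumptions.
    """
    # Collect (date, close) pairs per symbol.
    rows_by_symbol = defaultdict(list)
    for doc in price_docs:
        rows_by_symbol[doc["symbol"]].append((doc["date"], doc["close"]))

    # Compute consecutive-day changes and file them under the category.
    changes = defaultdict(list)
    for symbol, rows in rows_by_symbol.items():
        category = category_by_symbol.get(symbol)
        if category is None:
            continue  # symbol without a known category is skipped
        rows.sort()  # ISO dates sort chronologically as strings
        for (_, previous), (_, current) in zip(rows, rows[1:]):
            changes[category].append(current - previous)

    return {cat: sum(vals) / len(vals) for cat, vals in changes.items() if vals}


# Made-up sample documents, not real market data.
sample_docs = [
    {"symbol": "AAA", "date": "2019-01-02", "close": 10.0},
    {"symbol": "AAA", "date": "2019-01-03", "close": 12.0},
    {"symbol": "BBB", "date": "2019-01-02", "close": 20.0},
    {"symbol": "BBB", "date": "2019-01-03", "close": 21.0},
    {"symbol": "CCC", "date": "2019-01-02", "close": 5.0},
    {"symbol": "CCC", "date": "2019-01-03", "close": 4.0},
]
print(average_change_by_category(sample_docs, CATEGORY_BY_SYMBOL))
```

The join happens in client code here, which is exactly the step the chapter avoids by keeping the work inside Elasticsearch.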

Operational data analytics

Many professionals in the industry use the term operational data analytics to refer to the real-time observation of business processes. In the analytics world, operational data analytics is used to examine the latest information that businesses encounter every day, make the appropriate adjustments, and propose an instant solution for change. Among the technical indicators of stock price and volatility, Bollinger Bands are a popular analysis tool to inform daily trading decisions. The band is composed of three lines: a simple moving average (SMA), and an upper and a lower line placed two standard deviations above and below the SMA. Volatility is based on standard deviation. As the volatility increases, the band widens, and vice versa. The formula for Bollinger Bands is described in the following code block. Interested readers can take a look at the reference...
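As a rough illustration of the band arithmetic described above, the following Python sketch computes the SMA and the upper and lower lines over a sliding window. It is not the chapter's Elasticsearch implementation. The 20-day window and the multiplier of 2 are the conventional defaults and are assumptions here, as are the use of the population standard deviation and the sample prices.

```python
from statistics import fmean, pstdev


def bollinger_bands(closes, window=20, k=2.0):
    """Return one (sma, upper, lower) tuple per full window of prices.

    closes: closing prices in chronological order.
    window: number of periods in the moving average (assumed default 20).
    k:      number of standard deviations for the bands (assumed default 2).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    bands = []
    for end in range(window, len(closes) + 1):
        recent = closes[end - window:end]
        sma = fmean(recent)       # middle line: simple moving average
        spread = k * pstdev(recent)  # k population standard deviations
        bands.append((sma, sma + spread, sma - spread))
    return bands


# Made-up closing prices, not real market data.
closes = [10.0, 11.0, 12.0, 11.0, 10.0, 9.0]
for sma, upper, lower in bollinger_bands(closes, window=3):
    print(f"sma={sma:.2f} upper={upper:.2f} lower={lower:.2f}")
```

Because the spread is a multiple of the standard deviation, a flat price series collapses all three lines onto the SMA, and a more volatile window pushes the upper and lower lines apart.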

Sentiment analysis

Sentiment analysis is a research topic that analyzes opinions, attitudes, and emotions expressed in a given text. The methodology is to identify and extract subjective information by using context-mining techniques. The general purpose is to judge whether the emotions expressed in the source material are positive, negative, or neutral. Many techniques, such as natural language processing (NLP), text analysis, computational linguistics, statistics, machine learning, and even biometrics, can be applied to sentiment analysis. So far, most users have used Elasticsearch as the data store for sentiment analysis and for the subsequent search or metric analysis. The sentiment analysis workload itself is handled by third-party libraries. The following table introduces the two most commonly used libraries:

Name | Programming language | Description
TextBlob | Python | ...
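The division of labor described in this section, where a library scores the text and Elasticsearch stores and aggregates the result, can be sketched in Python as follows. The tiny word-list scorer below is a toy stand-in written for this illustration, not TextBlob or any real library, and the word lists and the text, polarity, and sentiment field names are assumptions.

```python
import re

# Toy word lists standing in for a real sentiment library's model.
POSITIVE = {"good", "great", "excellent", "gain"}
NEGATIVE = {"bad", "poor", "terrible", "loss"}


def polarity(text):
    """Score text in [-1.0, 1.0] from counts of known positive/negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def label(score):
    """Map a polarity score to positive, negative, or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def enrich(doc, text_field="text"):
    """Return a copy of doc with sentiment fields added, ready for indexing."""
    score = polarity(doc.get(text_field, ""))
    return {**doc, "polarity": score, "sentiment": label(score)}


print(enrich({"text": "Great quarter with an excellent gain"}))
```

Once documents carry a numeric score and a label, the search and metric analysis mentioned above can run on those fields like on any other.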

Summary

Wonderful! We have completed a comprehensive discussion of EDA. We demonstrated how to find symbolic momentum with simple financial analysis to inform business strategies. We also provided step-by-step instructions to compute Bollinger Bands using daily operational data. Finally, we conducted a brief survey of sentiment analysis with Elasticsearch.

In the next chapter, we will cover the Java High Level REST Client and the Java Low Level REST Client. The REST clients take care of all serialization and deserialization of the request and response objects, which simplifies development work. We'll also explore the basics of Spring Data with Elasticsearch, and we'll show how to use the relevant APIs for indexing, searching, and querying.
