Advanced Elasticsearch 7.0

Tokenizers

The tokenizer in an analyzer receives the character stream output by the character filters and splits it into a token stream, which becomes the input to the token filters. Elasticsearch supports three types of tokenizer, described as follows (a sketch of the full pipeline follows the list):

  • Word-oriented tokenizer: This splits the character stream into individual words, emitting one token per word.
  • Partial word tokenizer: This splits words or text into small fragments, sequences of characters up to a given length.
  • Structured text tokenizer: This splits the character stream into known structured tokens such as keywords, email addresses, and zip codes.
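
To make the pipeline concrete, here is a minimal sketch using the _analyze API; the character filter, token filter, and sample text below are our own choices for illustration, not taken from the book:

POST _analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": ["lowercase"],
  "text": "<p>The QUICK Brown-Foxes jumped!</p>"
}

The html_strip character filter removes the <p> tags, the standard tokenizer splits the remaining text into the tokens The, QUICK, Brown, Foxes, and jumped, and the lowercase token filter then lowercases each of them.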

We'll give an example for each built-in tokenizer and compile the results into the following tables. Let's first take a look at the word-oriented tokenizer:

Word-oriented tokenizer

Tokenizer     standard
Input text    "POST https://api.iextrading.com/1.0/stock/acwf...
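
Each built-in tokenizer can be exercised in the same way with the _analyze API. As a hedged sketch, with our own sample inputs rather than the book's, here are three requests that contrast a word-oriented tokenizer (standard), a partial word tokenizer (ngram), and a tokenizer suited to the structured text cases mentioned above (uax_url_email):

POST _analyze
{
  "tokenizer": "standard",
  "text": "POST https://example.com/1.0/stock"
}

POST _analyze
{
  "tokenizer": "ngram",
  "text": "stock"
}

POST _analyze
{
  "tokenizer": "uax_url_email",
  "text": "Contact admin@example.com or visit https://example.com"
}

The standard tokenizer breaks the URL apart at punctuation boundaries (into pieces such as https and example.com), the ngram tokenizer with its default settings emits overlapping one- and two-character fragments of stock (s, st, t, to, and so on), and the uax_url_email tokenizer keeps admin@example.com and https://example.com as single tokens.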