You're reading from Advanced Elasticsearch 7.0

Product type: Book
Published in: Aug 2019
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781789957754
Edition: 1st Edition
Author: Wai Tak Wong

Wai Tak Wong is a faculty member in the Department of Computer Science at Kean University, NJ, USA. He has more than 15 years of professional experience in cloud software design and development. He obtained his PhD in computer science at NJIT, NJ, USA. Wai Tak has served as an associate professor in the Information Management Department of Chung Hua University, Taiwan. A co-founder of Shanghai Shellshellfish Information Technology, Wai Tak acted as the Chief Scientist of the R&D team, and he has published more than a dozen algorithms in prestigious journals and conferences. Wai Tak began his search and analytics technology career with Elasticsearch in the real estate market and later applied this to data management and FinTech data services.

Working with the Smart Chinese Analysis plugin

The Smart Chinese Analysis plugin integrates Lucene's Smart Chinese analysis module into Elasticsearch for analyzing Chinese or mixed Chinese-English text. Its analyzer uses probabilistic knowledge, obtained from a hidden Markov model trained on a large corpus, to find the optimal word segmentation for Simplified Chinese text. It first breaks the input text into sentences and then segments each sentence into words. The plugin provides an analyzer called smartcn and a tokenizer called smartcn_tokenizer. Neither accepts any configuration parameters.
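As a minimal sketch of how these two components might be used, the Python below builds the JSON body for an `_analyze` request that uses the `smartcn` analyzer, and index settings that wrap `smartcn_tokenizer` in a custom analyzer. The analyzer name `my_smartcn`, the node address `http://localhost:9200`, and the helper names are assumptions made for illustration and are not taken from the book.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed local node; adjust as needed


def analyze_body(text):
    """Body for the _analyze API using the plugin's smartcn analyzer."""
    return {"analyzer": "smartcn", "text": text}


def custom_analyzer_settings(name="my_smartcn"):
    """Index settings for a custom analyzer built on smartcn_tokenizer.

    The tokenizer takes no parameters, so it is referenced by name only.
    'name' is a hypothetical analyzer name chosen for this sketch.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    name: {"type": "custom", "tokenizer": "smartcn_tokenizer"}
                }
            }
        }
    }


def send_json(path, body, http_method="POST"):
    """Send a JSON body to the assumed node and return the decoded reply."""
    request = urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=http_method,
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running and the plugin installed, `send_json("/_analyze", analyze_body("我爱北京天安门"))` should return the segmented tokens, and `send_json("/my_index", custom_analyzer_settings(), http_method="PUT")` would create an index (the name `my_index` is hypothetical) with the custom analyzer.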

To install the Smart Chinese Analysis plugin in the Elasticsearch Docker container, run the plugin installation command inside the container, and then restart the container so that the plugin takes effect.
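The install-and-restart steps can be sketched as follows. `elasticsearch-plugin install analysis-smartcn` is the standard plugin installer invocation. The container name `elasticsearch`, the relative `bin/` path (which assumes the container's working directory is the Elasticsearch home), and the use of `docker exec` and `docker restart` are assumptions, so the exact commands in the book may differ.

```python
import subprocess

PLUGIN = "analysis-smartcn"  # plugin ID of the Smart Chinese Analysis plugin


def install_commands(container="elasticsearch"):
    """Return the docker commands that install the plugin and restart.

    'container' is a hypothetical container name; substitute your own.
    """
    return [
        # Run the plugin installer inside the running container.
        ["docker", "exec", container,
         "bin/elasticsearch-plugin", "install", PLUGIN],
        # Restart the container so the node loads the new plugin.
        ["docker", "restart", container],
    ]


def run_install(container="elasticsearch"):
    """Execute each command in order, stopping on the first failure."""
    for command in install_commands(container):
        subprocess.run(command, check=True)
```

Calling `run_install("my_es_container")` would run both steps against a container of that (hypothetical) name. Keeping the commands as data makes them easy to print and check before anything is executed.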
