
You're reading from ElasticSearch Cookbook

Product type: Book

Published in: Dec 2013

Reading Level: Beginner

Publisher: Packt

ISBN-13: 9781782166627

Edition: 1st Edition

Languages

Java

Tools

Elasticsearch

Concepts

Enterprise Search

Author (1)

Alberto Paro

Chapter 11. Python Integration

In this chapter, we will cover the following topics:

Creating a client
Managing indices
Managing mappings
Managing documents
Executing a standard search
Executing a facet search

Introduction

In the previous chapter, we saw how to use a native client to access the ElasticSearch server from Java. This chapter is dedicated to the Python language and how to manage common tasks via its clients.

In addition to Java, the ElasticSearch team supports official clients for Perl, PHP, Python, and Ruby (refer to the announcement post on the ElasticSearch blog at http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/). They are fairly new, as their initial public release was in September 2013. These clients have the following advantages over other implementations:

They are strongly tied to the ElasticSearch API. The ElasticSearch team says, "These clients are direct translations of the native ElasticSearch REST interface."
They handle dynamic node detection and failover. They are built with a strong networking base for communicating with the cluster.
They have full coverage of the REST API.
They share the same application approach for every language in which they...

Creating a client

The official ElasticSearch clients are designed to use several transport layers. They allow you to use the HTTP, Thrift, or memcached protocol without changing your application code.

The Thrift and memcached protocols are binary and, because of their structure, they are generally a bit faster than HTTP. They wrap the REST API and share the same behavior, so switching between protocols is very easy.

In this recipe, we'll see how to instantiate a client with different protocols.
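As a rough illustration of the HTTP case, the following sketch turns `host:port` strings into the list-of-dictionaries form that the client constructor can take and then builds a client from it. The helper names `parse_hosts` and `make_client` are assumptions made for this example and are not part of the book's code. The exact dictionary shape depends on the client release (recent releases also expect a scheme entry), and the Thrift and memcached variants are not shown because they depend on the installed plugins and client version.

```python
def parse_hosts(addresses, default_port=9200):
    """Turn "host" or "host:port" strings into dicts with host and port keys."""
    hosts = []
    for address in addresses:
        host, _, port = address.partition(":")
        hosts.append({"host": host, "port": int(port) if port else default_port})
    return hosts


def make_client(addresses):
    """Build an HTTP client for the given nodes (hypothetical helper)."""
    # Imported lazily so that parse_hosts works without the package installed.
    import elasticsearch

    return elasticsearch.Elasticsearch(parse_hosts(addresses))
```

With a cluster running locally, `make_client(["127.0.0.1:9200"])` would return a client that can be used like the `es` object in the following recipes.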

Getting ready

You need a working ElasticSearch cluster and plugins for extra protocols. The full code of this recipe is in the chapter_11/client_creation.py file.

How to do it...

To create a client, we need to perform the following steps:

Before using the Python client, you must install it (preferably in a Python virtual environment). The client is officially hosted on PyPI (http://pypi.python.org/) and is easy to install with the following pip command:
```
pip install elasticsearch
```
This standard...

Managing indices

In the previous recipe we saw how to initialize a client to send calls to an ElasticSearch cluster. In this recipe, we will see how to manage indices via client calls.

Getting ready

You need a working ElasticSearch cluster and the packages required by the Creating a client recipe of this chapter.

The full code of this recipe is in the chapter_11/indices_management.py file.

How to do it...

In Python, managing the lifecycle of your indices is very easy. We need to perform the following steps:

We initialize a client as follows:

```
import elasticsearch
es = elasticsearch.Elasticsearch()
index_name = "my_index"
```

All the indices methods are available in the client.indices namespace. We can create and wait for the creation of an index as follows:
```
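# Pass the index name by keyword; this form works across client releases
es.indices.create(index=index_name)
# Block until the primary shards of the new index are allocated
es.cluster.health(wait_for_status="yellow")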
```
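Creating an index that already exists raises an error, so scripts that may run more than once often check for the index first. The following is a minimal sketch of that idea. `ensure_index` is a hypothetical helper, not part of the client, and it assumes that `es` is an already initialized client.

```python
def ensure_index(es, index_name):
    """Create index_name unless it already exists; return True if it was created."""
    if es.indices.exists(index=index_name):
        return False
    es.indices.create(index=index_name)
    # Wait until the primary shards of the new index are allocated.
    es.cluster.health(wait_for_status="yellow")
    return True
```

Calling `ensure_index(es, "my_index")` twice in a row creates the index only on the first call.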

We can close/open an index as follows:

```
es.indices.close(index=index_name)

es.indices.open(index=index_name)
es.cluster.health(wait_for_status="yellow")
```
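Some operations (for example, certain settings changes) need the index to be closed while they run. One way to make sure that the index is always reopened afterwards, even if an error occurs, is a context manager. The sketch below assumes an initialized client `es`; the name `closed_index` is made up for this example.

```python
from contextlib import contextmanager


@contextmanager
def closed_index(es, index_name):
    """Close index_name for the duration of the with block, then reopen it."""
    es.indices.close(index=index_name)
    try:
        yield
    finally:
        # Reopen even if the body of the with block raised an exception.
        es.indices.open(index=index_name)
        es.cluster.health(wait_for_status="yellow")
```

It would be used as `with closed_index(es, index_name): ...`, with the work that requires a closed index placed in the body of the block.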

We can optimize an index as follows:
```
es.indices...
```

Managing mappings

After creating an index, the next step is to add some mappings to it. We have already seen how to put a mapping via the REST API in Chapter 4, Standard Operations. In this recipe, we will see how to manage mappings via the official Python client and PyES.

Getting ready

You need a working ElasticSearch cluster and the packages required by the Creating a client recipe of this chapter.

The code of this recipe is in chapter_11/mapping_management.py and chapter_11/mapping_management_pyes.py.

How to do it...

After having initialized a client and created an index, the steps required for managing the mappings are as follows:

Create a mapping
Retrieve a mapping
Delete a mapping

These steps are easily managed with code as follows:

We initialize the client as follows:

```
import elasticsearch

es = elasticsearch.Elasticsearch()
```

We create an index as follows:

```
index_name = "my_index"
type_name = "my_type"
es.indices.create(index=index_name)
es.cluster.health(wait_for_status="yellow")
```

We put the mapping as follows:
```
es.indices...
```
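The three mapping steps (create, retrieve, and delete) can be sketched as follows. This is only an illustration under stated assumptions: the field names `title` and `tag` and the helper names are invented, and the calls use the type-based API (`doc_type`) of the ElasticSearch releases this book targets. Later releases removed mapping types, so `delete_mapping` in particular is only available in older client versions.

```python
def build_mapping(type_name):
    """Return a mapping body for type_name with two illustrative fields."""
    return {
        type_name: {
            "properties": {
                # Field names and settings are made up for this example.
                "title": {"type": "string"},
                "tag": {"type": "string", "index": "not_analyzed"},
            }
        }
    }


def manage_mapping(es, index_name, type_name):
    """Put a mapping, read it back, and delete it using the type-based API."""
    es.indices.put_mapping(
        index=index_name, doc_type=type_name, body=build_mapping(type_name)
    )
    mapping = es.indices.get_mapping(index=index_name, doc_type=type_name)
    # delete_mapping only exists in client releases that still support types.
    es.indices.delete_mapping(index=index_name, doc_type=type_name)
    return mapping
```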

Managing documents

The APIs for managing documents (index, update, and delete) are the most important ones after the search APIs. In this recipe, we will see how to use them both in the standard way and in bulk actions to improve performance.

Getting ready

You need a working ElasticSearch cluster and the packages required by the Creating a client recipe of this chapter.

The full code of this recipe is in the chapter_11/document_management.py and chapter_11/document_management_pyes.py files.

How to do it...

The main operations to manage documents are as follows:

index: This stores a document in ElasticSearch. It is mapped on the Index API call.
update: This allows updating some values in a document. Internally (because of how Lucene works), this operation deletes the previous document and reindexes the document with the new values. It is mapped on the Update API call.
delete: This deletes a document from the index. It is mapped on the Delete API call.
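The three operations above can be sketched with the official client as follows. This is a minimal illustration, not the book's code: the document ID, the field names, and the helper name `document_roundtrip` are assumptions, and the `doc_type` argument reflects the type-based API of the releases this book targets.

```python
def document_roundtrip(es, index_name, type_name):
    """Index a document, update one field, read it back, then delete it."""
    # The document ID and field names are invented for this example.
    es.index(index=index_name, doc_type=type_name, id=1,
             body={"name": "Joe Tester", "position": 1})
    # A partial update: only the fields listed under "doc" are changed.
    es.update(index=index_name, doc_type=type_name, id=1,
              body={"doc": {"position": 2}})
    document = es.get(index=index_name, doc_type=type_name, id=1)
    es.delete(index=index_name, doc_type=type_name, id=1)
    return document
```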

With the ElasticSearch Python client...
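The bulk approach mentioned in the introduction to this recipe can be sketched as follows. Instead of one call per document, a generator yields one action per document and a helper sends them in batches. The function names `bulk_actions` and `bulk_index` are hypothetical, and the sketch assumes a client release that ships `elasticsearch.helpers.bulk`; the book's own code may take a different route.

```python
def bulk_actions(index_name, type_name, documents):
    """Yield one bulk index action per (doc_id, source) pair."""
    for doc_id, source in documents:
        yield {
            "_index": index_name,
            "_type": type_name,
            "_id": doc_id,
            "_source": source,
        }


def bulk_index(es, index_name, type_name, documents):
    """Send all documents in bulk requests instead of one call per document."""
    # Imported lazily so that bulk_actions works without the package installed.
    from elasticsearch import helpers

    return helpers.bulk(es, bulk_actions(index_name, type_name, documents))
```

Because `bulk_actions` is a generator, the documents can come from any iterable (a file, a database cursor, and so on) without being loaded into memory all at once.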

Executing a standard search

After documents have been inserted, the most commonly executed action in ElasticSearch is the search. The official ElasticSearch client APIs for searching are similar to the REST one.

Getting ready

You need a working ElasticSearch cluster and the packages required by the Creating a client recipe of this chapter.

The code of this recipe is in the chapter_11/searching.py and chapter_11/searching_pyes.py files.

How to do it...

To execute a standard query, call the client's search method, passing the query parameters as we saw in Chapter 5, Search, Queries, and Filters. The required parameters are at least the index name, the type name, and the query DSL. The following example shows how to call a match all query, a term query, and a filter query. We need to perform the following steps:

We will initialize the client and populate the index as follows:

```
import elasticsearch
from pprint import pprint

es = elasticsearch.Elasticsearch()
index_name = "my_index"
type_name ...
```
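The three kinds of query named in this recipe (match all, term, and filter) can be sketched as plain dictionaries that are passed as the body of the search call. The helper names and any field names passed to them are assumptions made for illustration. The `filtered` wrapper is the query DSL form of the ElasticSearch releases this book targets, and later releases express filters differently.

```python
def match_all_query():
    """Body for a query that matches every document."""
    return {"query": {"match_all": {}}}


def term_query(field, value):
    """Body for a query that matches documents whose field contains value."""
    return {"query": {"term": {field: value}}}


def filtered_query(field, value):
    """Body for a match all query restricted by a term filter."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {field: value}},
            }
        }
    }


def run_search(es, index_name, type_name, body):
    """Execute the search and return only the stored documents."""
    results = es.search(index=index_name, doc_type=type_name, body=body)
    return [hit["_source"] for hit in results["hits"]["hits"]]
```

For example, `run_search(es, index_name, type_name, term_query("tag", "foo"))` would return the sources of the documents whose hypothetical `tag` field contains `foo`.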

Executing a facet search

Searching for results is obviously the main activity of a search engine; faceting is also very important because it often helps to complete the results.

Faceting is executed alongside the search and computes analytics on the search results.
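Conceptually, a terms facet counts how often each value of a field occurs among the documents that match the query. The following sketch reproduces that idea in plain Python on a made-up list of hits; it does not contact ElasticSearch, and the `tag` field is hypothetical.

```python
from collections import Counter


def terms_facet(hits, field):
    """Count how often each value of field occurs in the matching documents."""
    return Counter(hit[field] for hit in hits if field in hit)


hits = [{"tag": "foo"}, {"tag": "bar"}, {"tag": "foo"}]
print(terms_facet(hits, "tag").most_common())
# prints [('foo', 2), ('bar', 1)]
```

In a real search, the server computes these counts across the whole result set, not only over the page of hits returned to the client.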

Getting ready

You need a working ElasticSearch cluster and the packages required by the Creating a client recipe of this chapter.

The code of this recipe is in the chapter_11/faceting.py and chapter_11/faceting_pyes.py files.

How to do it...

To extend a query with the facet part, you need to define a facet section as we have already seen in Chapter 6, Facets. In the case of the official ElasticSearch client, you can add the facet DSL to the search dictionary to provide facets. We need to perform the following steps:

We need to initialize the client and populate the index as follows:

```
import elasticsearch
from pprint import pprint

es = elasticsearch.Elasticsearch()
index_name = "my_index"
type_name = "my_type"

from utils import create_and_add_mapping, populate...
```
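Adding the facet DSL to the search dictionary can be sketched as follows, together with a small helper that reads the counts out of the response. The helper names, the facet name, and the field name are assumptions for illustration. Facets belong to the ElasticSearch releases this book targets; later releases replaced them with aggregations, so treat this as a sketch for the facet-era API only.

```python
def terms_facet_body(facet_name, field, size=10):
    """Search body with a match all query and one terms facet."""
    return {
        "query": {"match_all": {}},
        "facets": {facet_name: {"terms": {"field": field, "size": size}}},
    }


def facet_counts(response, facet_name):
    """Map each term of a terms facet in the response to its count."""
    entries = response["facets"][facet_name]["terms"]
    return {entry["term"]: entry["count"] for entry in entries}
```

The body would be passed to the search call in the same way as a normal query, for example `es.search(index=index_name, doc_type=type_name, body=terms_facet_body("tag", "tag"))`, and the returned dictionary handed to `facet_counts`.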


Author (1)

Alberto Paro

Alberto Paro is an engineer, manager, and software developer. He currently works as technology architecture delivery associate director of the Accenture Cloud First data and AI team in Italy. He loves to study emerging solutions and applications, mainly related to cloud and big data processing, NoSQL, Natural language processing (NLP), software development, and machine learning. In 2000, he graduated in computer science engineering from Politecnico di Milano. Then, he worked with many companies, mainly using Scala/Java and Python on knowledge management solutions and advanced data mining products, using state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL data stores, and related technologies.