Neural Search - From Prototype to Production with Jina

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781801816823
Pages: 188
Edition: 1st Edition
Authors (6): Jina AI, Bo Wang, Cristian Mitroi, Feng Wang, Shubham Saboo, Susana Guzmán

Table of Contents (13 chapters)

Preface
Part 1: Introduction to Neural Search Fundamentals
Chapter 1: Neural Networks for Neural Search
Chapter 2: Introducing Foundations of Vector Representation
Chapter 3: System Design and Engineering Challenges
Part 2: Introduction to Jina Fundamentals
Chapter 4: Learning Jina’s Basics
Chapter 5: Multiple Search Modalities
Part 3: How to Use Jina for Neural Search
Chapter 6: Building Practical Examples with Jina
Chapter 7: Exploring Advanced Use Cases of Jina
Index
Other Books You May Enjoy

Exploring Advanced Use Cases of Jina

In this chapter, we discuss more advanced applications of the Jina neural search framework. Building on the concepts from the previous chapters, we will look at what else we can achieve with Jina. We will examine multi-level granularity matches, querying while indexing, and a cross-modal example. These are challenging concepts in neural search, and they are needed to build complex real-life applications. In particular, we will cover the following topics:

  • Introducing multi-level granularity
  • Cross-modal search with images with text
  • Concurrent querying and indexing data

These topics cover a wide variety of real-life requirements of neural search applications. Using these examples, together with the basic examples in Chapter 6, Building Practical Examples with Jina, you can expand and improve your Jina applications to cover more advanced usage patterns.

Technical requirements

In this chapter, we will build and execute the advanced examples provided in the GitHub repository. The code is available at https://github.com/PacktPublishing/Neural-Search-From-Prototype-to-Production-with-Jina/tree/main/src/Chapter07. Download the repository and navigate to each example’s folder when following the instructions to reproduce the use cases.

To run this code, you will need the following:

  • macOS, Linux, or Windows with WSL2 installed. Jina does not run on native Windows.
  • Python 3.7 or 3.8
  • Optionally, a clean new virtual environment for each of the examples
  • Docker

Introducing multi-level granularity

In this section, we will discuss how Jina can capture and leverage the hierarchical structure of real-life data. To follow along with the existing code, look in the chapter’s code for a folder named multires-lyrics-search. This is the example we will refer to in this section.

This example relies on the Document type’s capacity to hold chunks (child documents), each of which refers to a specific parent. Using this structure, you can compose hierarchies of arbitrary depth, with documents nested within documents. This mirrors the structure of many kinds of real-life data: patches of an image, sentences of a paragraph, video clips of a longer movie, and so on.

See the following code for how to perform this with Jina’s Document API:

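from jina import Document

# The parent (root) document that will hold the chunks
document = Document()
# Two child documents, each carrying its own text
chunk1 = Document(text='this is the first chunk')
chunk2 = Document(text='this is the second chunk')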
document.chunks...
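
To make the parent and chunk relationship more concrete, here is a minimal sketch of a nested document hierarchy. It deliberately does not use Jina's Document type: the class `Node` and its methods `add_chunk` and `walk` are hypothetical names invented for this illustration, and the `granularity` and `parent_id` fields are assumptions about what such a structure might track.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Iterator, List, Optional

_ids = count(1)  # simple id generator used only by this example


@dataclass
class Node:
    """Hypothetical stand-in for a document that can hold child chunks."""

    text: str = ""
    id: int = field(default_factory=lambda: next(_ids))
    parent_id: Optional[int] = None
    granularity: int = 0  # 0 = root, 1 = chunk, 2 = chunk of a chunk, ...
    chunks: List["Node"] = field(default_factory=list)

    def add_chunk(self, text: str) -> "Node":
        """Create a child one level below this node and link it to its parent."""
        child = Node(text=text, parent_id=self.id, granularity=self.granularity + 1)
        self.chunks.append(child)
        return child

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for chunk in self.chunks:
            yield from chunk.walk()


# A paragraph split into sentences, with one sentence split further into a word.
paragraph = Node(text="First sentence. Second sentence.")
first = paragraph.add_chunk("First sentence.")
second = paragraph.add_chunk("Second sentence.")
word = first.add_chunk("First")

# Flatten the tree into (text, depth) pairs.
levels = [(node.text, node.granularity) for node in paragraph.walk()]
```

Because every chunk records its parent, a match found at a fine level (a word or a sentence) can be traced back to the document it came from, which is the idea behind searching at multiple levels of granularity.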

Cross-modal search with images with text

In this section, we will cover an advanced example showcasing cross-modal search. Cross-modal search is a subtype of neural search in which the data we index and the data we search with belong to different modalities. This is unique to neural search, as none of the traditional search technologies could easily achieve it. It is possible because of the central idea of neural search: deep learning models transform any data type into the same kind of numeric representation, a vector (the embedding extracted from a specific layer of the network).

These modalities can be represented by different data types: audio, text, video, and images. At the same time, they can also be of the same type, but of different distributions. An example of this could be searching with a paper summary and wanting to get the paper title. They are both texts, but the underlying data distribution is different. The distribution is thus...
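
The idea of a shared vector space can be sketched with a few lines of plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors are invented toy values standing in for the output of an image encoder and a text encoder, and the helper names `cosine_similarity` and `search` are hypothetical and not part of Jina.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(
    query_vec: Sequence[float],
    index: Dict[str, List[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank indexed items by similarity to the query vector, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy embeddings, invented for illustration. In a real system an image
# encoder would produce the indexed vectors and a text encoder the query.
image_index = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "car.jpg": [0.0, 0.2, 0.9],
    "cat.jpg": [0.7, 0.6, 0.1],
}
text_query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the text "a dog"

results = search(text_query_vec, image_index)
```

The search function never needs to know that the query was text and the indexed items were images. Once both sides are vectors in the same space, ranking reduces to a similarity computation.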

Concurrent querying and indexing data

In this section, we will present a methodology for continuously serving your clients’ requests while still being able to update, delete, or add data in your database. This is a common requirement in the industry, but it is not trivial to achieve. The challenges are keeping the vector index up to date with the most recent data, updating that data atomically, and doing all of this in a scalable, containerized environment. The Jina framework provides the tools to meet each of these challenges.
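
One common way to reason about these challenges is to separate the store that receives writes from the snapshot that serves queries, and to publish a new snapshot in a single step. The sketch below illustrates that general pattern only; it is an assumption made for this example and does not describe how Jina or any of its Executors implement it. The class `LiveIndex` and its methods are hypothetical names, and a plain dictionary stands in for a durable database.

```python
import threading
from typing import Dict, List, Sequence


class LiveIndex:
    """Hypothetical sketch: writes go to a store, queries read a snapshot."""

    def __init__(self) -> None:
        self._store: Dict[str, List[float]] = {}     # stand-in for the database
        self._snapshot: Dict[str, List[float]] = {}  # what queries actually see
        self._write_lock = threading.Lock()

    def upsert(self, doc_id: str, vector: Sequence[float]) -> None:
        """Add or update a vector in the store (not yet visible to queries)."""
        with self._write_lock:
            self._store[doc_id] = list(vector)

    def delete(self, doc_id: str) -> None:
        """Remove a vector from the store (not yet visible to queries)."""
        with self._write_lock:
            self._store.pop(doc_id, None)

    def sync(self) -> None:
        """Copy the store and publish the copy by rebinding one reference."""
        with self._write_lock:
            fresh = {key: list(vec) for key, vec in self._store.items()}
        # A query holds either the old snapshot or the new one, never a mix.
        self._snapshot = fresh

    def query(self, vector: Sequence[float], top_k: int = 1) -> List[str]:
        """Return the ids of the nearest vectors in the current snapshot."""
        snapshot = self._snapshot  # take one consistent reference

        def squared_distance(doc_id: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(vector, snapshot[doc_id]))

        return sorted(snapshot, key=squared_distance)[:top_k]


index = LiveIndex()
index.upsert("a", [0.0, 0.0])
index.upsert("b", [1.0, 1.0])
index.sync()
before = index.query([0.9, 0.9])   # served from the published snapshot
index.upsert("c", [0.9, 0.9])      # written, but not visible until the next sync
pending = index.query([0.9, 0.9])
index.sync()
after = index.query([0.9, 0.9])
```

In this sketch, queries are never blocked by writes, at the cost of results lagging behind the store until the next `sync` call.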

By default, in a Jina Flow, you cannot both index data and search at the same time. This is due to the nature of the network protocol. In essence, each Executor is a single-threaded application. You can use sharding to extend the number of copies of an Executor that form an Executor group. However, this is only safe for purely parallel operations, such as encoding data...
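
To see why a stateless operation such as encoding parallelizes safely, consider the following sketch. It uses only the Python standard library rather than Jina, and every name in it (`encode`, `split_into_shards`, `encode_shard`) is hypothetical. The "encoder" is a toy function standing in for a real model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def encode(text: str) -> List[float]:
    """Toy stateless 'encoder': text length and vowel count."""
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(vowels)]


def split_into_shards(items: List[str], num_shards: int) -> List[List[str]]:
    """Assign items to shards in round-robin order."""
    return [items[i::num_shards] for i in range(num_shards)]


def encode_shard(shard: List[str]) -> List[Tuple[str, List[float]]]:
    """Encode every item in one shard; no state is shared with other shards."""
    return [(text, encode(text)) for text in shard]


texts = ["hello", "neural", "search", "jina"]
shards = split_into_shards(texts, num_shards=2)

# Each shard is processed independently, so the order of execution is irrelevant.
with ThreadPoolExecutor(max_workers=2) as pool:
    shard_results = list(pool.map(encode_shard, shards))

embeddings = dict(pair for result in shard_results for pair in result)
```

Each call to `encode` depends only on its input, so the shards never need to coordinate. Indexing and querying differ because they read and write shared state, which is why adding more copies is not enough on its own.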

Summary

In this chapter, we analyzed and practiced using Jina’s advanced features, such as chunking, modality, and the HNSWPostgreSQL Executor, to tackle some of the most difficult goals of neural search. We implemented solutions for representing data of arbitrary hierarchical depth, for cross-modal searching, and for non-blocking data updates. Chunking allowed us to reflect the fact that some data has multiple levels of semantic meaning, such as sentences in a paragraph or video clips in longer films. Cross-modal searching opens up one of the main advantages of neural search: its data universality. This means that you can search with any type of data for any other type of data, as long as you use the correct model for each data type. Finally, the HNSWPostgreSQL Executor allows us to build a live system where users can search and index at the same time, with the data kept in sync.
