
You're reading from Generative AI with LangChain

Product type: Book
Published in: Dec 2023
Publisher: Packt
ISBN-13: 9781835083468
Edition: 1st Edition
Author: Ben Auffarth

Ben Auffarth is a full-stack data scientist with more than 15 years of work experience. With a background and Ph.D. in computational and cognitive neuroscience, he has designed and conducted wet lab experiments on cell cultures, analyzed experiments with terabytes of data, run brain models on IBM supercomputers with up to 64k cores, built production systems processing hundreds of thousands of transactions per day, and trained language models on a large corpus of text documents. He co-founded and is the former president of Data Science Speakers, London.

Building a Chatbot like ChatGPT

Chatbots powered by LLMs have demonstrated impressive fluency in conversational tasks like customer service. However, their lack of world knowledge limits their usefulness for domain-specific question answering. In this chapter, we explore how to overcome these limitations through Retrieval-Augmented Generation (RAG). RAG enhances chatbots by grounding their responses in external evidence sources, leading to more accurate and informative answers. This is achieved by retrieving relevant passages from corpora to condition the language model’s generation process. The key steps involve encoding corpora into vector embeddings to enable rapid semantic search and integrating retrieval results into the chatbot’s prompt.

We will also provide foundations for representing documents as vectors, indexing methods for efficient similarity lookups, and vector databases for managing embeddings. Building on these core techniques, we will demonstrate...

What is a chatbot?

Chatbots are AI programs that simulate conversational interactions with users via text or voice. Early chatbots, like ELIZA (1966) and PARRY (1972), used pattern matching. Recent advances, like LLMs, allow more natural conversations, as seen in systems like ChatGPT (2022). However, challenges remain in achieving human-level discourse.

The Turing test, proposed in 1950, established a landmark for assessing machine intelligence by a computer’s ability to impersonate human conversation. Despite its limitations, it laid a philosophical foundation for AI. However, early systems like ELIZA could fool some users with scripted responses and no genuine understanding, calling into question the test’s validity as an evaluation of intelligence. The test also faced criticism for relying on deception and for format constraints that limited the complexity of questioning. Philosophers like John Searle argued that symbolic manipulation alone does not equate to human-level intelligence...

Understanding retrieval and vectors

Retrieval-augmented generation (RAG) is a technique that enhances text generation by retrieving and incorporating external knowledge. This grounds the output in factual information rather than relying solely on the knowledge encoded in the language model’s parameters. Retrieval-Augmented Language Models (RALMs) are language models that integrate retrieval directly into the training and inference process.

Traditional language models generate text autoregressively based only on the prompt. RALMs augment this by first retrieving relevant context from external corpora using semantic search algorithms. Semantic search typically involves indexing documents into vector embeddings, allowing fast similarity lookups via approximate nearest neighbor search.
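As a concrete illustration of this idea, the sketch below indexes a tiny corpus and answers a query by cosine similarity. The `embed` function is a deliberately crude bag-of-words stand-in for a real embedding model, and a production system would use an approximate nearest neighbor index rather than a linear scan:

```python
import math

def embed(text: str, dims: int = 64) -> list[float]:
    # Toy embedding: bucket word counts into a fixed-size vector.
    # A real system would use a learned embedding model instead.
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

corpus = [
    "The cat sat on the mat",
    "Stock markets fell sharply today",
    "A kitten is a young cat",
]
# "Index" the corpus: embed each document once, up front.
index = [(doc, embed(doc)) for doc in corpus]

# Retrieval: embed the query and return the most similar document.
query_vec = embed("a young kitten")
best_doc, _ = max(index, key=lambda pair: cosine(query_vec, pair[1]))
print(best_doc)
```

Even this crude scheme surfaces the semantically closest document; swapping in learned embeddings is what turns keyword overlap into genuine semantic search.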

The retrieved evidence then conditions the language model to produce more accurate, contextually relevant text. This cycle repeats, with RALMs formulating...

Loading and retrieving in LangChain

LangChain provides a set of building blocks for constructing retrieval systems. In this section, we’ll look at how to put them together into a pipeline for building a chatbot with RAG. The building blocks include data loaders, document transformers, embedding models, vector stores, and retrievers.

The relationship between them is illustrated in the diagram here (source: LangChain documentation):

Figure 5.5: Vector stores and data loaders

In LangChain, we first load documents through data loaders. We can then transform them and pass these documents to a vector store as embeddings. We can then query the vector store, or a retriever associated with it. Retrievers in LangChain can wrap the loading and vector storage into a single step. We’ll mostly skip transformations in this chapter; however, you’ll find explanations with examples of data loaders, embeddings, storage mechanisms, and retrievers.
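To make the relationship between these components concrete, here is a minimal in-memory mock of that flow in plain Python. None of the names below (`TinyVectorStore`, `load_texts`, and so on) are LangChain APIs; they simply mirror the loader → embedding → vector store → retriever shape described above:

```python
import math

def embed(text, dims=32):
    # Stand-in for an embedding model: bucketed word counts.
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def load_texts(raw_docs):
    # "Data loader": in LangChain this would read files, web pages, etc.
    return [doc.strip() for doc in raw_docs]

class TinyVectorStore:
    # "Vector store": holds (document, embedding) pairs.
    def __init__(self):
        self.entries = []

    def add_documents(self, docs):
        for doc in docs:
            self.entries.append((doc, embed(doc)))

    def as_retriever(self, k=1):
        # "Retriever": a query interface over the store.
        def retrieve(query):
            q = embed(query)
            ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
            return [doc for doc, _ in ranked[:k]]
        return retrieve

docs = load_texts(["  Refunds are processed in 5 days.  ", "Shipping is free over $50."])
store = TinyVectorStore()
store.add_documents(docs)
retriever = store.as_retriever()
print(retriever("how long do refunds take"))
```

In LangChain the same shape appears as `DocumentLoader.load()`, `VectorStore.add_documents()`, and `VectorStore.as_retriever()`, with real embedding models and persistent storage behind them.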

...

Implementing a chatbot

We’ll implement a chatbot now. We’ll assume you have the environment installed with the necessary libraries and the API keys as per the instructions in Chapter 3, Getting Started with LangChain.

To implement a simple chatbot in LangChain, you can follow this recipe:

  1. Set up a document loader.
  2. Store documents in a vector store.
  3. Set up a chatbot with retrieval from the vector storage.
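The three steps above can be sketched end to end. The snippet below is a plain-Python outline of the pattern, with a stubbed `fake_llm` in place of a real model and a trivial keyword retriever in place of a vector store; the LangChain equivalents would be a document loader, a vector store retriever, and an LLM chain:

```python
# Step 1: "load" documents (here, an in-memory list).
documents = [
    "Our support line is open 9am-5pm on weekdays.",
    "Premium members get free shipping on all orders.",
]

# Step 2: "store" them -- a trivial keyword index standing in for a vector store.
def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return scored[:k]

# Step 3: a chatbot that conditions the model on retrieved context.
def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call (e.g., via an API).
    return "Based on the provided context: " + prompt.split("Context:")[1].strip()

def answer(question: str) -> str:
    context = "\n".join(retrieve(question, documents))
    prompt = f"Answer the question using the context.\nQuestion: {question}\nContext: {context}"
    return fake_llm(prompt)

print(answer("when is the support line open"))
```

The essential move is the same in the full implementation: retrieved passages are injected into the prompt before the model generates its answer.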

We’ll generalize this to several document formats and make it available through a web browser interface built with Streamlit. You’ll be able to drop in your documents and start asking questions. In a production deployment for customer engagement, you can imagine these documents being pre-loaded, with the vector storage kept static.

Let’s start with the document loader.

Document loader

As mentioned, we want to be able to read different formats:

from typing import Any...
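As an illustration of what such a multi-format loader might look like, the sketch below dispatches on the file extension. Only plain-text formats are handled here; formats like PDF or EPUB need dedicated parsing libraries, and the function name `load_document` is our own, not a LangChain API:

```python
import tempfile
from pathlib import Path

def load_document(path: str) -> str:
    # Dispatch on the file extension; extend with PDF, EPUB, etc. as needed.
    suffix = Path(path).suffix.lower()
    if suffix in {".txt", ".md", ".py"}:
        return Path(path).read_text(encoding="utf-8")
    raise ValueError(f"Unsupported file format: {suffix}")

# Demo: write a small text file and load it back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as f:
    f.write("LangChain makes retrieval pipelines composable.")
    tmp_path = f.name

content = load_document(tmp_path)
print(content)
```

Raising on unknown extensions, rather than guessing, keeps failures visible when users drop in an unsupported format.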

Moderating responses

The role of moderation in chatbots is to ensure that the bot’s responses and conversations are appropriate, ethical, and respectful. It involves implementing mechanisms to filter out offensive or inappropriate content and discouraging abusive behavior from users. This is an important part of any application that we’d want to deploy for customers.

In the context of moderation, a constitution refers to a set of guidelines or rules that govern the behavior and responses of the chatbot. It outlines the standards and principles that the chatbot should adhere to, such as avoiding offensive language, promoting respectful interactions, and maintaining ethical standards. The constitution serves as a framework for ensuring that the chatbot operates within the desired boundaries and provides a positive user experience.
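As a toy illustration of the idea, the filter below checks a draft response against a small set of "constitutional" rules and substitutes a safe reply on violation. Real systems use far more sophisticated approaches, such as classifier-based moderation endpoints or LangChain's ConstitutionalChain, which critiques and revises outputs against stated principles; the rule list here is purely illustrative:

```python
# A deliberately simple, keyword-based stand-in for a moderation layer.
CONSTITUTION = [
    ("no insults", ("idiot", "stupid", "moron")),
    ("no threats", ("threaten", "hurt you")),
]

SAFE_REPLY = "I'm sorry, I can't respond that way. Let's keep the conversation respectful."

def moderate(draft: str) -> str:
    lowered = draft.lower()
    for principle, banned_terms in CONSTITUTION:
        if any(term in lowered for term in banned_terms):
            # A principle was violated: replace rather than emit the draft.
            return SAFE_REPLY
    return draft

print(moderate("You are an idiot."))
print(moderate("Happy to help with your order!"))
```

The key design point survives the simplification: moderation sits between the model's draft output and the user, so nothing reaches the user unchecked.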

Moderation and having a constitution are important in chatbots for several reasons:

  • Ensuring ethical behavior: Chatbots can...

Summary

In the previous chapter, we discussed tool-augmented LLMs, which involve the utilization of external tools or knowledge resources such as document corpora. In this chapter, we focused on retrieving relevant data from sources through vector search and injecting it into the context. This retrieved data serves as additional information to augment the prompts given to LLMs. We also introduced retrieval and vector mechanisms, and we discussed implementing a chatbot, memory mechanisms, and the importance of appropriate, moderated responses.

The chapter started with an overview of chatbots, their evolution, and their current state, highlighting the practical implications and enhanced capabilities of current technology. We discussed the importance of proactive communication. We explored retrieval mechanisms, including vector storage, with the goal of improving the accuracy of chatbot responses. We went into detail on methods for loading documents...

Questions

Please see if you can produce the answers to these questions from memory. I’d recommend you go back to the corresponding sections of this chapter if you are unsure about any of them:

  1. Please name 5 different chatbots!
  2. What are some important aspects of developing a chatbot?
  3. What does RAG stand for?
  4. What is an embedding?
  5. What is vector search?
  6. What is a vector database?
  7. Please name 5 different vector databases!
  8. What is a retriever in LangChain?
  9. What is memory and what are the memory options in LangChain?
  10. What is moderation, what’s a constitution, and how do they work?

Join our community on Discord

Join our community’s Discord space for discussions with the authors and other readers:

https://packt.link/lang
