
How-To Tutorials - LLM

81 Articles

Spark and LangChain for Data Analysis

Alan Bernardo Palacio
31 Aug 2023
12 min read
Introduction

In today's data-driven world, the demand for extracting insights from large datasets has led to the development of powerful tools and libraries. Apache Spark, a fast and general-purpose cluster computing system, has revolutionized big data processing. Coupled with LangChain, a cutting-edge library built atop advanced language models, Spark's analytical capabilities can now be combined seamlessly with the natural language interaction that LangChain facilitates. This article introduces Spark, explores the features of LangChain, and provides practical examples of using Spark with LangChain for data analysis.

Understanding Apache Spark

The processing and analysis of large datasets have become crucial for organizations and individuals alike. Apache Spark has emerged as a powerful framework that revolutionizes the way we handle big data. Spark is designed for speed, ease of use, and sophisticated analytics. It provides a unified platform for various data processing tasks, such as batch processing, interactive querying, machine learning, and real-time stream processing.

At its core, Apache Spark is an open-source, distributed computing system that excels at processing and analyzing large datasets in parallel. Unlike traditional MapReduce systems, Spark introduces the concept of Resilient Distributed Datasets (RDDs), which are immutable distributed collections of data. RDDs can be transformed and operated upon using a wide range of high-level APIs provided by Spark, making it possible to perform complex data manipulations with ease.

Key Components of Spark

Spark consists of several components that contribute to its versatility and efficiency:

- Spark Core: The foundation of Spark, responsible for tasks such as task scheduling, memory management, and fault recovery. It also provides APIs for creating and manipulating RDDs.
- Spark SQL: A module that allows Spark to work seamlessly with structured data using SQL-like queries. It enables users to interact with structured data through the familiar SQL language.
- Spark Streaming: Enables real-time stream processing, making it possible to process and analyze data in near real time as it arrives in the system.
- MLlib (Machine Learning Library): A scalable machine learning library built on top of Spark, offering a wide range of machine learning algorithms and tools.
- GraphX: A graph processing library that provides abstractions for efficiently manipulating graph-structured data.
- Spark DataFrame: A higher-level abstraction on top of RDDs, providing a structured and more optimized way to work with data. DataFrames offer optimization opportunities, enabling Spark's Catalyst optimizer to perform query optimization and code generation.

Spark's distributed computing architecture enables it to achieve high performance and scalability. It employs a master/worker architecture in which a central driver program coordinates tasks across multiple worker nodes. Data is distributed across these nodes, and tasks are executed in parallel on the distributed data.

We will be working with two ways of interacting with Spark: Spark SQL and the Spark DataFrame API. Apache Spark is a distributed computing framework, with Spark SQL as one of its modules for structured data processing. A Spark DataFrame is a distributed collection of data organized into named columns, offering a programming abstraction similar to data frames in R or Python but optimized for distributed processing. It provides a functional programming API, allowing operations like select(), filter(), and groupBy().
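To make the DataFrame API concrete, here is a minimal, self-contained PySpark sketch of the select(), filter(), and groupBy() operations mentioned above; the column names and toy rows are illustrative assumptions rather than part of the original tutorial.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Toy passenger data (hypothetical columns, for illustration only)
    df = spark.createDataFrame(
        [("Alice", 29, 1), ("Bob", 45, 0), ("Carol", 37, 1)],
        ["name", "age", "survived"],
    )

    # select(), filter(), and groupBy() chain into a single distributed query plan
    (df.select("name", "age", "survived")
       .filter(F.col("age") > 30)
       .groupBy("survived")
       .agg(F.avg("age").alias("avg_age"))
       .show())

The same aggregation could be expressed as a SQL query against a registered table, which is exactly the trade-off discussed next.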
On the other hand, Spark SQL allows users to run unmodified SQL queries on Spark data, integrating seamlessly with DataFrames and offering a bridge to BI tools through JDBC/ODBC. Both Spark DataFrames and Spark SQL leverage the Catalyst optimizer for efficient query execution. While DataFrames are preferred for programmatic APIs and functional capabilities, Spark SQL is ideal for ad hoc querying and for users already familiar with SQL. The choice between them often hinges on the specific use case and on the user's familiarity with either SQL or functional programming.

In the next sections, we will explore how LangChain complements Spark's capabilities by introducing natural language interactions through agents.

Introducing the Spark Agent to LangChain

LangChain, a dynamic library built upon the foundations of modern large language model (LLM) technologies, is a pivotal addition to the world of data analysis. It bridges the gap between the power of Spark and the ease of human language interaction.

LangChain harnesses the capabilities of advanced LLMs such as ChatGPT and Hugging Face-hosted models. These language models have proven their prowess in understanding and generating human-like text. LangChain capitalizes on this potential to enable users to interact with data and code through natural language queries.

Empowering Data Analysis

The introduction of the Spark Agent to LangChain brings about a transformative shift in data analysis workflows. Users can now tap into the immense analytical capabilities of Spark through plain, everyday language. This innovation opens doors for professionals from various domains to explore datasets, uncover insights, and derive value without the need for deep technical expertise.

LangChain acts as a bridge, connecting the technical realm of data processing with the non-technical world of language understanding. It empowers individuals who may not be well versed in coding or data manipulation to engage with data-driven tasks effectively. This accessibility democratizes data analysis and makes it inclusive for a broader audience.

The integration of LangChain with Spark involves a thoughtful orchestration of components that work in harmony to bring human-language interaction to the world of data analysis. At the heart of this integration lies the collaboration between ChatGPT, a sophisticated language model, and PythonREPL, a Python read-eval-print loop. The workflow is as follows:

1. ChatGPT receives user queries in natural language and generates a Python command as a solution.
2. The generated Python command is sent to PythonREPL for execution.
3. PythonREPL executes the command and produces a result.
4. ChatGPT takes the result from PythonREPL and translates it into a final answer in natural language.

This collaborative process can repeat multiple times, allowing users to engage in iterative conversations and deep dives into data analysis. Several key points ensure a seamless interaction between the language model and the code execution environment.

Initial prompt setup: The initial prompt given to ChatGPT defines its behavior and available tooling. This prompt guides ChatGPT on the desired actions and toolkits to employ.

Connection between ChatGPT and PythonREPL: Through predefined prompts, the format of the answer is established, and regular expressions (regex) are used to extract the specific command to execute from ChatGPT's response; a minimal sketch of this extraction step is shown below.
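The exact prompt format and parsing logic live inside LangChain, but the idea can be sketched in a few lines. The response text and the regex pattern below are illustrative assumptions, not LangChain's actual internals.

    import re

    # Hypothetical agent output following a ReAct-style "Action / Action Input" layout
    llm_response = (
        "Thought: I need to count the rows in the DataFrame.\n"
        "Action: python_repl\n"
        "Action Input: df.count()"
    )

    # Extract the command that should be handed to the Python REPL
    match = re.search(r"Action Input:\s*(.+)", llm_response)
    if match:
        command = match.group(1).strip()
        print(command)  # -> df.count()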
This establishes a clear flow of communication between ChatGPT and PythonREPL.

Memory and conversation history: ChatGPT does not possess a memory of past interactions. As a result, maintaining the conversation history locally and passing it with each new question is essential to preserving context and coherence in the interaction.

In the upcoming sections, we'll explore practical use cases that illustrate how this integration manifests in the real world, including interactions with Spark SQL and Spark DataFrames.

The Spark SQL Agent

In this section, we will walk you through how to interact with Spark SQL using natural language, unleashing the power of Spark for querying structured data. Let's walk through a few hands-on examples to illustrate the capabilities of the integration:

- Exploring data with the Spark SQL Agent: querying the dataset to understand its structure and metadata, calculating statistical metrics like average age and fare, and extracting specific information, such as the name of the oldest survivor.
- Analyzing DataFrames with the Spark DataFrame Agent: counting rows to understand the dataset size, analyzing the distribution of passengers with siblings, and computing statistics like the square root of the average age.

By interacting with the agents and experimenting with natural language queries, you'll witness firsthand the seamless fusion of advanced data processing with user-friendly language interactions. These examples demonstrate how Spark and LangChain can amplify your data analysis efforts, making insights more accessible and actionable.

Before diving into the magic of Spark SQL interactions, let's set up the necessary environment. We'll utilize LangChain's SparkSQLToolkit to seamlessly bridge between Spark and natural language interactions. First, make sure you have your API key for OpenAI ready. You'll need it to integrate the language model.

    from langchain.agents import create_spark_sql_agent
    from langchain.agents.agent_toolkits import SparkSQLToolkit
    from langchain.chat_models import ChatOpenAI
    from langchain.utilities.spark_sql import SparkSQL
    import os

    # Set up environment variables for API keys
    os.environ['OPENAI_API_KEY'] = 'your-key'

Now, let's get hands-on with Spark SQL. We'll work with a Titanic dataset, but you can replace it with your own data. First, create a Spark session, define a schema for the database, and load your data into a Spark DataFrame. We'll then create a table in Spark SQL to enable querying.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    schema = "langchain_example"
    spark.sql(f"CREATE DATABASE IF NOT EXISTS {schema}")
    spark.sql(f"USE {schema}")
    csv_file_path = "titanic.csv"
    table = "titanic"
    spark.read.csv(csv_file_path, header=True, inferSchema=True).write.saveAsTable(table)
    spark.table(table).show()

Now, let's initialize the Spark SQL Agent. This agent acts as your interactive companion, enabling you to query Spark SQL tables using natural language.
We'll create a toolkit that connects LangChain, the SparkSQL instance, and the chosen language model (in this case, ChatOpenAI).

    from langchain.agents import AgentType

    spark_sql = SparkSQL(schema=schema)
    llm = ChatOpenAI(temperature=0, model="gpt-4-0613")
    toolkit = SparkSQLToolkit(db=spark_sql, llm=llm, handle_parsing_errors="Check your output and make sure it conforms!")
    agent_executor = create_spark_sql_agent(
        llm=llm,
        toolkit=toolkit,
        agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
        verbose=True,
        handle_parsing_errors=True)

Now comes the exciting part: querying Spark SQL tables using natural language! With your Spark SQL Agent ready, you can ask questions about your data and receive insightful answers. Let's try a few examples:

    # Describe the Titanic table
    agent_executor.run("Describe the titanic table")

    # Calculate the square root of the average age
    agent_executor.run("whats the square root of the average age?")

    # Find the name of the oldest survived passenger
    agent_executor.run("What's the name of the oldest survived passenger?")

With these simple commands, you've tapped into the power of Spark SQL using natural language. The Spark SQL Agent makes data exploration and querying more intuitive and accessible than ever before.

The Spark DataFrame Agent

In this section, we'll dive into another facet of LangChain's integration with Spark: the Spark DataFrame Agent. This agent leverages the power of Spark DataFrames and natural language interactions to provide an engaging and insightful way to analyze data.

Before we begin, make sure you have a Spark session set up and your data loaded into a DataFrame. For this example, we'll use the Titanic dataset. Replace csv_file_path with the path to your own data if needed.

    from langchain.llms import OpenAI
    from pyspark.sql import SparkSession
    from langchain.agents import create_spark_dataframe_agent

    spark = SparkSession.builder.getOrCreate()
    csv_file_path = "titanic.csv"
    df = spark.read.csv(csv_file_path, header=True, inferSchema=True)
    df.show()

Initializing the Spark DataFrame Agent

Now, let's unleash the power of the Spark DataFrame Agent! This agent allows you to interact with Spark DataFrames using natural language queries. We'll initialize the agent by specifying the language model and the DataFrame you want to work with.

    # Initialize the Spark DataFrame Agent
    agent = create_spark_dataframe_agent(llm=OpenAI(temperature=0), df=df, verbose=True)

With the agent ready, you can explore your data using natural language queries. Let's dive into a few examples:

    # Count the number of rows in the DataFrame
    agent.run("how many rows are there?")

    # Find the number of people with more than 3 siblings
    agent.run("how many people have more than 3 siblings")

    # Calculate the square root of the average age
    agent.run("whats the square root of the average age?")

Remember that under the hood, the Spark DataFrame Agent uses generated Python code to interact with Spark. While it's a powerful tool for interactive analysis, make sure that the generated code is safe to execute, especially in a sensitive environment.

In this final section, let's tie everything together and showcase how Spark and LangChain work in harmony to unlock insights from data. We've covered the Spark SQL Agent and the Spark DataFrame Agent, so now it's time to put theory into practice. In conclusion, the combination of Spark and LangChain transcends the traditional boundaries of technical expertise, enabling data enthusiasts of all backgrounds to engage with data-driven tasks effectively.
Through the Spark SQL Agent and Spark DataFrame Agent, LangChain empowers users to interact, explore, and analyze data using the simplicity and familiarity of natural language. So why wait? Dive in and unlock the full potential of your data analysis journey with the synergy of Spark and LangChain.

Conclusion

In this article, we've delved into the world of Apache Spark and LangChain, two technologies that synergize to transform how we interact with and analyze data. By bridging the gap between technical data processing and human language understanding, Spark and LangChain enable users to derive meaningful insights from complex datasets through simple, natural language queries. The Spark SQL Agent and Spark DataFrame Agent presented here demonstrate the potential of this integration, making data analysis more accessible to a wider audience. As both technologies continue to evolve, we can expect even more powerful capabilities for unlocking the true potential of data-driven decision-making. So, whether you're a data scientist, analyst, or curious learner, harnessing the power of Spark and LangChain opens up a world of possibilities for exploring and understanding data in an intuitive and efficient manner.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Building a Containerized LLM Chatbot Application

Alan Bernardo Palacio
21 Aug 2023
19 min read
In this hands-on tutorial, we will build a containerized LLM-powered chatbot application that uses examples to create a custom chatbot capable of answering deep philosophical questions and responding with profound questions in return. We will use Streamlit as the web application framework, PostgreSQL as the database to store examples, and OpenAI's GPT-3.5 "text-davinci-003" model for language processing.

The application allows users to input philosophical questions, and the AI-powered chatbot will respond with insightful answers based on the provided examples. Additionally, the chatbot will ask thought-provoking questions in response to user input, simulating the behavior of philosophical minds like Socrates and Nietzsche.

We'll break down the implementation into several files, each serving a specific purpose:

- Dockerfile: Defines the Docker image for our application, specifying the required dependencies and configurations.
- docker-compose.yml: Orchestrates the Docker containers for our application, including the web application (Streamlit) and the PostgreSQL database.
- setup.sql: Contains the SQL commands to set up the PostgreSQL database and insert example data.
- streamlit_app.py: Defines the Streamlit web application and its user interface.
- utils.py: Contains utility functions to interact with the database, create the Da Vinci LLM model, and generate responses.
- requirements.txt: Lists the Python dependencies required for our application.

The Dockerfile

The Dockerfile is used to build the Docker image for our application. It specifies the base image, sets up the working directory, installs the required dependencies, and defines the command to run the Streamlit application:

    FROM python:3
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    CMD ["streamlit", "run", "streamlit_app.py"]

In the Dockerfile, we set the base image to Python 3 using FROM python:3, which enables us to use Python and its packages. Next, we specify the working directory inside the container as /app, where we will copy our application files. To ensure all required Python packages are installed, we copy the requirements.txt file, which lists the dependencies, into the container's working directory, and then we run pip install --no-cache-dir -r requirements.txt to install the Python dependencies. We proceed to copy all the files from the current directory (containing our application files) into the container's /app directory using COPY . .. Finally, we define the command to run the Streamlit application when the container starts using CMD ["streamlit", "run", "streamlit_app.py"].
This command starts the Streamlit app, enabling users to interact with the philosophical AI assistant through their web browsers once the container is up and running.

The requirements.txt file lists the Python dependencies required for our application:

    streamlit
    streamlit-chat
    streamlit-extras
    psycopg2-binary
    openai==0.27.8
    langchain==0.0.225

The requirements file uses the following packages:

- streamlit: The Streamlit library for creating web applications.
- streamlit-chat: Streamlit Chat library for adding chat interfaces to Streamlit apps.
- streamlit-extras: Streamlit Extras library for adding custom components to Streamlit apps.
- psycopg2-binary: PostgreSQL adapter for Python.
- openai==0.27.8: The OpenAI Python library for accessing the GPT-3.5 model.
- langchain==0.0.225: LangChain library for working with language models and prompts.

Next, we will define the Docker Compose file, which will also handle the deployment of the PostgreSQL database where we will store our examples.

Creating the docker-compose file

The docker-compose.yml file orchestrates the Docker containers for our application: the Streamlit web application and the PostgreSQL database:

    version: '3'
    services:
      app:
        build:
          context: ./app
        ports:
          - 8501:8501
        environment:
          - OPENAI_API_KEY=${OPENAI_API_KEY}
        depends_on:
          - db
      db:
        image: postgres:13
        environment:
          - POSTGRES_USER=your_username
          - POSTGRES_PASSWORD=your_password
          - POSTGRES_DB=chatbot_db
          - POSTGRES_HOST_AUTH_METHOD=trust
        volumes:
          - ./db/setup.sql:/docker-entrypoint-initdb.d/setup.sql

The docker-compose.yml file orchestrates the deployment of our LLM-powered chatbot application and defines the services, i.e., the containers, needed for our application.

In the services section, we have two distinct services defined: app and db. The app service corresponds to our Streamlit web application, which will serve as the user interface for interacting with the philosophical AI assistant. To build the Docker image for this service, we specify the build context as ./app, where the necessary application files, including the Dockerfile, reside.

To ensure seamless communication between the host machine and the app container, we use the ports option to map port 8501 from the host to the corresponding port inside the container. This allows users to access the web application through their web browsers.

For the application to function effectively, the environment variable OPENAI_API_KEY must be set, providing the necessary authentication for our LLM model to operate. This is done in the environment section, where we define this variable.

One of the critical components of our application is the integration of a PostgreSQL database to store the philosophical question-answer pairs. The db service sets up the PostgreSQL database using the postgres:13 image. We configure the required environment variables, such as the username, password, and database name, to establish the necessary connection.

To initialize the database with our predefined examples, we leverage the volumes option to mount the setup.sql file from the host machine into the container's /docker-entrypoint-initdb.d directory. This SQL script contains the commands to create the examples table and insert the example data.
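Once the stack is running, one quick way to confirm that the initialization script was executed is to query the table from inside the database container. This is an illustrative check that simply reuses the service name and credentials defined above:

    # Run from the project root while the containers are up
    docker-compose exec db psql -U your_username -d chatbot_db -c "SELECT COUNT(*) FROM examples;"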
With this initialization script in place, our PostgreSQL database is ready to handle the profound philosophical interactions with the AI assistant. In conclusion, the docker-compose.yml file provides a streamlined and efficient way to manage the deployment and integration of language model microservices with a PostgreSQL database, creating a cohesive environment for our philosophical AI assistant application.

Setting up examples

The setup.sql file contains the SQL commands to set up the PostgreSQL database and insert example data. We use this file in the volumes section of the docker-compose.yml file to initialize the database when the container starts:

    -- Create the examples table
    CREATE TABLE IF NOT EXISTS examples (
        id SERIAL PRIMARY KEY,
        query TEXT,
        answer TEXT
    );

    -- Insert the examples
    INSERT INTO examples (query, answer) VALUES
    ('What is the nature of truth?', 'Truth is a mirror reflecting the depths of our souls.'),
    ('Is there an objective reality?', 'Reality is an ever-shifting kaleidoscope, molded by our perceptions.'),
    ('What is the role of reason in human understanding?', 'Reason illuminates the path of knowledge, guiding us towards self-awareness.'),
    ('What is the nature of good and evil?', 'Good and evil are intertwined forces, dancing in the eternal cosmic tango.'),
    ('Is there a purpose to suffering?', 'Suffering unveils the canvas of resilience, painting a masterpiece of human spirit.'),
    ('What is the significance of morality?', 'Morality is the compass that navigates the vast ocean of human conscience.'),
    ('What is the essence of human existence?', 'Human existence is a riddle wrapped in the enigma of consciousness.'),
    ('How can we find meaning in a chaotic world?', 'Meaning sprouts from the fertile soil of introspection, blooming in the garden of wisdom.'),
    ('What is the nature of love and its transformative power?', 'Love is an alchemist, transmuting the mundane into the divine.'),
    ('What is the relationship between individuality and society?', 'Individuality dances in the grand symphony of society, playing a unique melody of self-expression.'),
    ('What is the pursuit of knowledge and its impact on the human journey?', 'Knowledge is the guiding star, illuminating the path of human evolution.'),
    ('What is the essence of human freedom?', 'Freedom is the soaring eagle, embracing the vast expanse of human potential.');

The setup.sql script plays a crucial role in setting up the PostgreSQL database for our LLM-powered chatbot application. The SQL commands within this script are responsible for creating the examples table with the necessary columns and adding the example data to this table.

In the context of our LLM application, these examples are of great importance, as they serve as the foundation for the assistant's responses. The examples table is a collection of question-answer pairs that the AI assistant has learned from past interactions. Each row in the table represents a specific question (query) and its corresponding insightful answer (answer).

When a user interacts with the chatbot and enters a new question, the application leverages these examples to create a custom prompt for the LLM model. By selecting a relevant example based on the length of the user's question, the application constructs a few-shot prompt that incorporates both the user's query and an example from the database.

The LLM model uses this customized prompt, containing the user's input and relevant examples, to generate a thoughtful and profound response that aligns with the philosophical nature of the AI assistant.
The inclusion of examples in the prompt ensures that the chatbot's responses resonate with the same level of wisdom and depth found in the example interactions stored in the database. By learning from past examples and incorporating them into the prompts, our LLM-powered chatbot can emulate the thought processes of philosophical giants like Socrates and Nietzsche. Ultimately, these examples become the building blocks that empower the AI assistant to engage in the profound realms of philosophical discourse with the users.

The Streamlit Application

The streamlit_app.py file defines the Streamlit web application and its user interface. It is the main file where we build the web app and interact with the LLM model:

    import streamlit as st
    from streamlit_chat import message
    from streamlit_extras.colored_header import colored_header
    from streamlit_extras.add_vertical_space import add_vertical_space
    from utils import *

    # Define database credentials here
    DB_HOST = "db"
    DB_PORT = 5432
    DB_NAME = "chatbot_db"
    DB_USER = "your_username"
    DB_PASSWORD = "your_password"

    # Connect to the PostgreSQL database and retrieve examples
    examples = get_database_examples(DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD)

    # Create the Da Vinci LLM model
    davinci = create_davinci_model()

    # Create the example selector and few shot prompt template
    example_selector = create_example_selector(examples)
    dynamic_prompt_template = create_few_shot_prompt_template(example_selector)

    # Now the Streamlit app
    # Sidebar contents
    with st.sidebar:
        st.title('The AI seeker of truth and wisdom')
        st.markdown('''
        ## About
        This app is an LLM-powered chatbot built using:
        - Streamlit
        - Open AI Davinci LLM Model
        - LangChain
        - Philosophy
        ''')
        add_vertical_space(5)
        st.write('Running in Docker!')

    # Generate empty lists for generated and past.
    ## generated stores AI generated responses
    if 'generated' not in st.session_state:
        st.session_state['generated'] = ["Hi, what questions do you have today?"]
    ## past stores User's questions
    if 'past' not in st.session_state:
        st.session_state['past'] = ['Hi!']

    # Layout of input/response containers
    input_container = st.container()
    colored_header(label='', description='', color_name='blue-30')
    response_container = st.container()

    # User input
    ## Function for taking user provided prompt as input
    def get_text():
        input_text = st.text_input("You: ", "", key="input")
        return input_text

    ## Applying the user input box
    with input_container:
        user_input = get_text()

    # Response output
    ## Function for taking user prompt as input followed by producing AI generated responses
    def generate_response(prompt):
        response = davinci(
            dynamic_prompt_template.format(query=prompt)
        )
        return response

    ## Conditional display of AI generated responses as a function of user provided prompts
    with response_container:
        if user_input:
            response = generate_response(user_input)
            st.session_state.past.append(user_input)
            st.session_state.generated.append(response)
        if st.session_state['generated']:
            for i in range(len(st.session_state['generated'])):
                message(st.session_state['past'][i], is_user=True, key=str(i) + '_user', avatar_style='identicon', seed=123)
                message(st.session_state["generated"][i], key=str(i), avatar_style='icons', seed=123)

In this part of the code, we set up the core components of our LLM-powered chatbot application.
We begin by importing the necessary libraries, including Streamlit, Streamlit Chat, and Streamlit Extras, along with the utility functions from the utils.py file. Next, we define the database credentials (DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD) required for connecting to the PostgreSQL database.

The application then establishes a connection to the database using the get_database_examples function from the utils.py file. This crucial step retrieves the profound philosophical question-answer pairs stored in the examples table. These examples are essential, as they serve as a knowledge base for the AI assistant and provide the context and wisdom needed to generate meaningful responses.

To leverage the OpenAI Da Vinci LLM model, we create the model instance using the create_davinci_model function from utils.py. This model acts as the core engine of our chatbot, enabling it to produce thoughtful and profound responses.

In order to create custom prompts for the LLM model, we utilize the create_example_selector and create_few_shot_prompt_template functions from the utils.py file. These functions help select relevant examples based on the length of the user's input and construct dynamic prompts that combine the user's query with relevant examples.

The Streamlit web app's sidebar is then set up, providing users with information about the application's purpose and inspiration. Within the application's session state, two lists (generated and past) are initialized to store AI-generated responses and user questions, respectively.

To ensure an organized layout, we define two containers (input_container and response_container). The input_container houses the text input box where users can enter their questions. The get_text function is responsible for capturing the user's input.

For generating AI responses, the generate_response function takes the user's prompt, processes it through the Da Vinci LLM model, and produces insightful replies. The AI-generated responses are displayed in the response_container using the message function from the Streamlit Chat library, allowing users to engage in profound philosophical dialogues with the AI assistant.
Overall, this setup lays the groundwork for an intellectually stimulating and philosophical chatbot experience.

Creating the utils file

The utils.py file contains utility functions for our application, including connecting to the database, creating the Da Vinci LLM model, and generating responses:

    from langchain import PromptTemplate, FewShotPromptTemplate, LLMChain
    from langchain.prompts.example_selector import LengthBasedExampleSelector
    from langchain.llms import OpenAI
    import psycopg2

    def get_database_examples(host, port, dbname, user, password):
        try:
            conn = psycopg2.connect(
                host=host,
                port=port,
                dbname=dbname,
                user=user,
                password=password
            )
            cursor = conn.cursor()
            cursor.execute("SELECT query, answer FROM examples")
            rows = cursor.fetchall()
            examples = [{"query": row[0], "answer": row[1]} for row in rows]
            cursor.close()
            conn.close()
            return examples
        except psycopg2.Error as e:
            raise Exception(f"Error connecting to the database: {e}")

    def create_davinci_model():
        return OpenAI(model_name='text-davinci-003')

    def create_example_selector(examples):
        example_template = """
        User: {query}
        AI: {answer}
        """
        example_prompt = PromptTemplate(
            input_variables=["query", "answer"],
            template=example_template
        )
        if not examples:
            raise Exception("No examples found in the database.")
        return LengthBasedExampleSelector(
            examples=examples,
            example_prompt=example_prompt,
            max_length=50
        )

    def create_few_shot_prompt_template(example_selector):
        prefix = """The following are excerpts from conversations with a philosophical AI assistant.
        The assistant is a seeker of truth and wisdom, responding with profound questions to know yourself
        in a way that Socrates, Nietzsche, and other great minds would do. Here are some examples:"""
        suffix = """
        User: {query}
        AI: """
        return FewShotPromptTemplate(
            example_selector=example_selector,
            example_prompt=example_selector.example_prompt,
            prefix=prefix,
            suffix=suffix,
            input_variables=["query"],
            example_separator="\n"
        )

    def generate_response(davinci, dynamic_prompt_template, prompt):
        response = davinci(dynamic_prompt_template.format(query=prompt))
        return response

The get_database_examples function is responsible for establishing a connection to the PostgreSQL database using the provided credentials (host, port, dbname, user, password). Through this connection, the function executes a query to retrieve the question-answer pairs stored in the examples table. The function then organizes this data into a list of dictionaries, with each dictionary representing an example containing the query (question) and its corresponding answer.

The create_davinci_model function is straightforward, as it initializes and returns the Da Vinci LLM model.

To handle the selection of relevant examples for constructing dynamic prompts, the create_example_selector function plays a crucial role. It takes the list of examples as input and creates an example selector. This selector helps choose relevant examples based on the length of the user's query.
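To see what the assembled few-shot prompt looks like before it reaches the model, you can render it directly with the helpers defined above; the sample question is made up for illustration.

    # Assumes the functions from utils.py above are available, and that this runs
    # inside the app container, where the hostname "db" resolves to the database service
    examples = get_database_examples("db", 5432, "chatbot_db", "your_username", "your_password")
    example_selector = create_example_selector(examples)
    dynamic_prompt_template = create_few_shot_prompt_template(example_selector)

    # Print the prompt string that would be passed to the Da Vinci model
    print(dynamic_prompt_template.format(query="What is the meaning of friendship?"))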
By using this selector, the AI assistant can incorporate diverse examples that align with the user's input, leading to more coherent and contextually appropriate responses.

The create_few_shot_prompt_template function is responsible for building the few-shot prompt template. This template includes a custom prefix and suffix to set the tone and style of the philosophical AI assistant. The prefix emphasizes the assistant's role as a "seeker of truth and wisdom," while the suffix provides the formatting for the user's query and AI-generated response. The custom template ensures that the AI assistant's interactions are profound and engaging, resembling the thought-provoking dialogues of historical philosophers like Socrates and Nietzsche.

Finally, the generate_response function is designed to generate the AI's response based on the user's prompt. It takes the Da Vinci LLM model, the dynamic prompt template, and the user's input as parameters. The function uses the LLM model to process the dynamic prompt, blending the user's query with the selected examples, and returns the AI-generated response.

Starting the application

To launch our philosophical AI assistant application with all its components integrated seamlessly, we can use Docker Compose. By executing the command docker-compose --env-file .env up, the Docker Compose tool will orchestrate the entire application deployment process.

The --env-file .env option allows us to specify the environment variables from the .env file, which holds sensitive credentials and configuration details. This ensures that the necessary environment variables, such as the OpenAI API key and database credentials, are accessible to the application without being explicitly exposed in the codebase; a minimal example .env is sketched at the end of this section.

When the docker-compose up command is initiated, Docker Compose will first build the application's Docker image using the Dockerfile defined in the ./app directory. This image will contain all the required dependencies and configurations for our Streamlit web application and the integration with the Da Vinci LLM model.

Next, Docker Compose will create two services: the app service, which represents our Streamlit web application, and the db service, representing the PostgreSQL database. The app service is configured to run on port 8501, making it accessible through http://localhost:8501 in the browser.

Once the services are up and running, the Streamlit web application will be fully operational, and users can interact with the philosophical AI assistant through the user-friendly interface. When a user enters a philosophical question, the application will use the Da Vinci LLM model, together with the selected examples, to generate insightful and profound responses in the style of great philosophers.

With Docker Compose, our entire application, including the web server, LLM model, and database, will be containerized, enabling seamless deployment across different environments. This approach ensures that the application is easily scalable and portable, allowing users to experience the intellectual exchange with the philosophical AI assistant effortlessly.
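As mentioned above, here is a minimal sketch of what the .env file might contain; the variable name matches the one consumed in docker-compose.yml, and the value is only a placeholder.

    # .env -- keep this file out of version control
    OPENAI_API_KEY=sk-your-openai-api-key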
Conclusion

In this tutorial, we've built a containerized LLM-powered chatbot application capable of answering deep philosophical questions and responding with profound questions, inspired by philosophers like Socrates and Nietzsche. We used Streamlit as the web application framework, PostgreSQL as the database, and OpenAI's GPT-3.5 model for language processing.

By combining Streamlit, PostgreSQL, and OpenAI's GPT-3.5 model, you've crafted an intellectually stimulating user experience. Your chatbot can answer philosophical inquiries with deep insights and thought-provoking questions, providing users with a unique and engaging interaction.

Feel free to experiment further with the chatbot, add more examples to the database, or explore different prompts for the LLM model to enrich the user experience. As you continue to develop your AI assistant, remember the immense potential these technologies hold for solving real-world challenges and fostering intelligent conversations.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as a founder in startups, and later earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Question Answering in LangChain

Mostafa Ibrahim
10 Oct 2023
8 min read
Introduction

Imagine seamlessly processing vast amounts of data, posing any question, and receiving eloquently crafted answers in return. While large language models like ChatGPT excel with general data, they falter when it comes to your private information: data you'd rather not broadcast to the world. Enter LangChain: it empowers us to harness any NLP model, refining it with our exclusive data.

In this article, we'll explore LangChain, a framework designed for building applications with language models. We'll guide you through training a model, specifically OpenAI's ChatGPT, using your selected private data. While we provide a structured tutorial, feel free to adapt the steps based on your dataset and model preferences, as variations are expected and encouraged. Additionally, we'll offer various feature alternatives that you can incorporate throughout the tutorial.

The Need for Privacy in Question Answering

Undoubtedly, the confidentiality of personalized data is of absolute importance. While companies amass vast amounts of data daily, which offers them invaluable insights, it's crucial to safeguard this information. Disclosing such proprietary information to external entities could jeopardize the company's competitive edge and overall business integrity.

How Does the Fine-Tuning LangChain Process Work?

Step 1: Identifying The Appropriate Data Source
Before selecting the right dataset, it's essential to ask a few preliminary questions. What specific topic are you aiming to inquire about? Is the dataset sufficient? And so on.

Step 2: Integrating The Data With LangChain
Depending on your dataset's file format, you'll need to adopt different methods to effectively import the data into LangChain.

Step 3: Splitting The Data Into Chunks
To ensure efficient data processing, it's crucial to divide the dataset into smaller segments, often referred to as chunks.

Step 4: Transforming The Data Into Embeddings
Embedding is a technique where words or phrases from the vocabulary are mapped to vectors of real numbers. The idea behind embeddings is to capture the semantic meaning and relationships of words in a lower-dimensional space than the original representation. (A minimal code sketch of steps 3 and 4 appears after the process overview below.)

Step 5: Asking Queries To Our Model
Finally, after training our model on the updated documentation, we can directly query it for any information we require.

Full LangChain Process
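To make steps 3 and 4 concrete, here is a minimal, illustrative sketch using LangChain's text splitter together with OpenAI embeddings. The chunk sizes and sample text are assumptions, and the tutorial below achieves the same effect with PyPDFLoader's load_and_split() plus a FAISS index.

    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings.openai import OpenAIEmbeddings

    # Step 3: split raw text into overlapping chunks (sizes are illustrative)
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_text("Giskard is an open-source testing framework for ML models. " * 40)

    # Step 4: map each chunk to an embedding vector (requires OPENAI_API_KEY to be set)
    embeddings = OpenAIEmbeddings()
    vectors = embeddings.embed_documents(chunks)
    print(len(chunks), len(vectors[0]))  # number of chunks and embedding dimensionality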
DataSet Used

LangChain's versatility stems from its ability to process varied datasets. For our demonstration, we utilize the "Giskard Documentation", a comprehensive guide on the Giskard framework. Giskard is an open-source testing framework for Machine Learning models, spanning various Python model types. It automatically detects vulnerabilities in ML models, generates domain-specific tests, and integrates open-source QA best practices.

Having said that, LangChain can seamlessly integrate with a myriad of other data sources, be they textual, tabular, or even multimedia, expanding its use-case horizons.

Setting Up and Using LangChain for Private Question Answering

Step 1: Installing The Necessary Libraries
As with the first step of building any machine learning model, we have to set up our environment by installing the necessary libraries:

    !pip install langchain
    !pip install openai
    !pip install pypdf
    !pip install tiktoken
    !pip install faiss-gpu

Step 2: Importing Necessary Libraries

    from langchain.llms import OpenAI
    from langchain.chat_models import ChatOpenAI
    from langchain.document_loaders import PyPDFLoader
    from langchain.vectorstores import FAISS
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.chains import LLMChain
    from langchain.prompts import PromptTemplate
    from langchain.chains.qa_with_sources import load_qa_with_sources_chain
    import openai
    import os

Step 3: Importing OpenAI API Key

    os.environ['OPENAI_API_KEY'] = "Insert your OpenAI key here"

Step 4: Loading Our Data Set
LangChain offers the capability to load data in various formats. In this article, we'll focus on loading data in PDF format but will also touch upon other popular formats such as CSV and File Directory. For details on other file formats, please refer to the LangChain Documentation.

Loading PDF Data
We've compiled the Giskard AI tool's documentation into a PDF and subsequently partitioned the data.

    loader = PyPDFLoader("/kaggle/input/giskard-documentation/Giskard Documentation.pdf")
    pages = loader.load_and_split()

Below are the code snippets if you prefer to work with either CSV or File Directory file formats.

Loading CSV Data

    from langchain.document_loaders.csv_loader import CSVLoader
    loader = CSVLoader("Insert the path to your CSV dataset here")
    data = loader.load()

Loading File Directory Data

    from langchain.document_loaders import DirectoryLoader
    loader = DirectoryLoader('../', glob="**/*.md")
    docs = loader.load()

Step 5: Indexing The Dataset
We will be creating an index using FAISS (Facebook AI Similarity Search), a library developed by Facebook AI for efficiently searching for similarities in large datasets, especially with vectors from machine learning models. We will be converting those documents into vector embeddings using OpenAIEmbeddings().
This indexed data can then be used for efficient similarity searches later on.

    faiss_index = FAISS.from_documents(pages, OpenAIEmbeddings())

Here are some alternative indexing options you might consider.

Indexing using Pinecone

    import os
    import pinecone
    from langchain.schema import Document
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import Pinecone

    pinecone.init(
        api_key=os.environ["PINECONE_API_KEY"], environment=os.environ["PINECONE_ENV"]
    )
    embeddings = OpenAIEmbeddings()
    pinecone.create_index("langchain-self-retriever-demo", dimension=1536)

Indexing using Chroma

    import os
    import getpass
    from langchain.schema import Document
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import Chroma

    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(docs, embeddings)

Step 6: Asking The Model Some Questions
There are multiple methods by which we can retrieve our data from our model.

Similarity Search
In the context of large language models (LLMs) and natural language processing, similarity search is often about finding sentences, paragraphs, or documents that are semantically similar to a given sentence or piece of text.

    query = "What is Giskard?"
    docs = faiss_index.similarity_search(query)
    print(docs[0].page_content)

Similarity Search Output:

Why Giskard? Giskard is an open-source testing framework dedicated to ML models, covering any Python model, from tabular to LLMs. Testing Machine Learning applications can be tedious. Since ML models depend on data, testing scenarios depend on the domain specificities and are often infinite. Where to start testing? Which tests to implement? What issues to cover? How to implement the tests? At Giskard, we believe that Machine Learning needs its own testing framework. Created by ML engineers for ML engineers, Giskard enables you to: Scan your model to find dozens of hidden vulnerabilities: The Giskard scan automatically detects vulnerability issues such as performance bias, data leakage, unrobustness, spurious correlation, overconfidence, underconfidence, unethical issue, etc. Instantaneously generate domain-specific tests: Giskard automatically generates relevant tests based on the vulnerabilities detected by the scan. You can easily customize the tests depending on your use case by defining domain-specific data slicers and transformers as fixtures of your test suites. Leverage the Quality Assurance best practices of the open-source community: The Giskard catalog enables you to easily contribute and load data slicing & transformation functions such as AI-based detectors (toxicity, hate, etc.), generators (typos, paraphraser, etc.), or evaluators. Inspired by the Hugging Face philosophy, the aim of Giskard is to become.

LLM Chains

    model = OpenAI(model_name="gpt-3.5-turbo")
    my_chain = load_qa_with_sources_chain(model, chain_type="refine")
    query = "What is Giskard?"
    documents = faiss_index.similarity_search(query)
    result = my_chain({"input_documents": pages, "question": query})

LLM Chain Output:

Based on the additional context provided, Giskard is a Python package or library that provides tools for wrapping machine learning models, testing, debugging, and inspection. It supports models from various machine learning libraries such as HuggingFace, PyTorch, TensorFlow, or Scikit-learn.
Giskard can handle classification, regression, and text generation tasks using tabular or text data. One notable feature of Giskard is the ability to upload models to the Giskard server. Uploading models to the server allows users to compare their models with others using a test suite, gather feedback from colleagues, debug models effectively in case of test failures, and develop new tests incorporating additional domain knowledge. This feature enables collaborative model evaluation and improvement. It is worth highlighting that the provided context mentions additional ML libraries, including Langchain, API REST, and LightGBM, but their specific integration with Giskard is not clearly defined.

Sources:
- Giskard Documentation.pdf
- API Reference (for Dataset methods)
- Kaggle: /kaggle/input/giskard-documentation/Giskard Documentation.pdf

Conclusion

LangChain effectively bridges the gap between advanced language models and the need for data privacy. Throughout this article, we have highlighted its capability to train models on private data, ensuring both insightful results and data security. One thing is for sure though, as AI continues to grow, tools like LangChain will be essential for balancing innovation with user trust.

Author Bio

Mostafa Ibrahim is a dedicated software engineer based in London, where he works in the dynamic field of Fintech. His professional journey is driven by a passion for cutting-edge technologies, particularly in the realms of machine learning and bioinformatics. When he's not immersed in coding or data analysis, Mostafa loves to travel.

Medium


AI_Distilled #28: Unveiling Innovations Reshaping Our World

Merlyn Shelley
11 Dec 2023
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello ,

"Generative AI has the potential to change the world in ways that we can't even imagine. It has the power to create new ideas, products, and services that will make our lives easier, more productive, and more creative. It also has the potential to solve some of the world's biggest problems, such as climate change, poverty, and disease."

-Bill Gates, Microsoft Co-Founder

Microsoft Bing's new Deep Search functionality is a case in point: Bing will now create AI prompts itself to provide detailed insights to user queries in ways traditional search engines can't even match. Who could have thought LLMs would progress so much they would eventually prompt themselves? Even Runway ML is onto something big with its groundbreaking technology that creates realistic AI generated videos that will find their way to Hollywood.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let's get started with the latest news and developments across the AI sector:

- Elon Musk's xAI Initiates $1 Billion Funding Drive in AI Race
- Bing's New Deep Search Expands Queries
- AI Takes Center Stage in 2023 Word of the Year Lists
- OpenAI Announces Delay in GPT Store Launch to Next Year
- ChatGPT Celebrates First Anniversary with 110M Installs and $30M Revenue Milestone
- Runway ML and Getty Images Collaborate on AI Video Models for Hollywood and Advertising

We've also curated the latest GPT and LLM resources, tutorials, and secret knowledge:

- Unlocking AI Magic: A Primer on 7 Essential Libraries for Developers
- Efficient LLM Fine-Tuning with QLoRA on a Laptop
- Rapid Deployment of Large Open Source LLMs with Runpod and vLLM's OpenAI Endpoint
- Understanding Strategies to Enhance Retrieval-Augmented Generation (RAG) Pipeline Performance
- Understanding and Mitigating Biases and Toxicity in LLMs

Finally, don't forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:

- A Step-by-Step Guide to Streamlining LLM Data Processing for Efficient Pipelines
- Fine-Tuning Mistral Instruct 7B on the MedMCQA Dataset Using QLoRA
- Accelerating Large-Scale Training: A Comprehensive Guide to Amazon SageMaker Data Parallel Library
- Enhancing LoRA-Based Inference Speed: A Guide to Efficient LoRA Decomposition

Looking for some inspiration? Here are some GitHub repositories to get your projects going!

- tacju/maxtron
- Tanuki/tanuki.py
- roboflow/multimodal-maestro
- 03axdov/muskie

Also, don't forget to check our expert insights column, which covers the interesting concepts of NLP from the book 'The Handbook of NLP with Gensim'. It's a must-read!

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition

Quick question: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?

Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation! 🗨️✨

As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF. Let's make AI_Distilled even more awesome! 🚀 Jump on in!

Share your thoughts and opinions here!
Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

🏐 Elon Musk's xAI Initiates $1 Billion Funding Drive in AI Race: xAI is on a quest to secure $1 billion in equity, aiming to stay competitive with tech giants like OpenAI, Microsoft, and Google in the dynamic AI landscape. Already amassing $135 million from investors, xAI's total funding goal is disclosed in a filing with the US Securities and Exchange Commission.

🏐 AI Alliance Launched by Tech Giants IBM and Meta: IBM and Meta have formed a new "AI Alliance" with over 50 partners to promote open and responsible AI development. Members include Dell, Intel, CERN, NASA, and Sony. The alliance envisions fostering an open AI community for researchers and developers and can help members make progress whether or not they openly share their models.

🏐 Bing's New Deep Search Expands Queries: Microsoft is testing a new Bing feature called Deep Search that uses GPT-4 to expand search queries before providing results. Deep Search displays the expanded topics in a panel for users to select the one that best fits what they want to know. It then tailors the search results to that description. Microsoft says the feature can take up to 30 seconds due to the AI generation.

🏐 AI Takes Center Stage in 2023 Word of the Year Lists: In 2023, AI dominates tech, influencing "word of the year" choices. Cambridge picks "hallucinate" for AI's tendency to invent information; Merriam-Webster chooses "authentic" to address AI's impact on reality. Oxford recognizes "prompt" for its evolved role in instructing generative AI, reflecting society's increased integration of AI into everyday language and culture.

🏐 OpenAI Announces Delay in GPT Store Launch to Next Year: OpenAI delays the GPT store release until next year, citing unexpected challenges and postponing the initial December launch plan. Despite recent challenges, including CEO changes and employee unrest, development continues, and updates for ChatGPT are expected. The GPT store aims to be a marketplace for users to sell and share custom GPTs, with creators compensated based on usage.

🏐 ChatGPT Celebrates First Anniversary with 110M Installs and $30M Revenue Milestone: ChatGPT's mobile apps, launched in May 2023 on iOS and later on Android, have exceeded 110 million installs, yielding nearly $30 million in revenue. The success is fueled by the ChatGPT Plus subscription, offering perks. Despite competition, downloads surge, with Android hitting 18 million in a week. The company expects continued growth by year-end 2023.

🏐 Runway ML and Getty Images Collaborate on AI Video Models for Hollywood and Advertising: NYC video AI startup Runway ML, backed by Google and NVIDIA, announces a partnership with Getty Images for the Runway <> Getty Images Model (RGM), a generative AI video model. Targeting Hollywood, advertising, media, and broadcasting, it enables customized content workflows for Runway enterprise customers.

🔮 Expert Insights from Packt Community

The Handbook of NLP with Gensim - By Chris Kuo

NLU + NLG = NLP

NLP is an umbrella term that covers natural language understanding (NLU) and NLG. We'll go through both in the next sections.

NLU

Many languages, such as English, German, and Chinese, have been developing for hundreds of years and continue to evolve. Humans can use languages artfully in various social contexts.
Now, we are asking a computer to understand human language. What’s very rudimentary to us may not be so apparent to a computer. Linguists have contributed much to the development of computers’ understanding in terms of syntax, semantics, phonology, morphology, and pragmatics. NLU focuses on understanding the meaning of human language. It extracts text or speech input and then analyzes the syntax, semantics, phonology, morphology, and pragmatics in the language. Let’s briefly go over each one: Syntax: This is about the study of how words are arranged to form phrases and clauses, as well as the use of punctuation, order of words, and sentences. Semantics: This is about the possible meanings of a sentence based on the interactions between words in the sentence. It is concerned with the interpretation of language, rather than its form or structure. For example, the word “table” as a noun can refer to “a piece of furniture having a smooth flat top that is usually supported by one or more vertical legs” or a data frame in a computer language. NLU can understand the two meanings of a word in such jokes through a technique called word embedding.  Phonology: This is about the study of the sound system of a language, including the sounds of speech (phonemes), how they are combined to form words (morphology), and how they are organized into larger units such as syllables and stress patterns. For example, the sounds represented by the letters “p” and “b” in English are distinct phonemes. A phoneme is the smallest unit of sound in a language that can change the meaning of a word. Consider the words “pat” and “bat.” The only difference between these two words is the initial sound, but their meanings are different. Morphology: This is the study of the structure of words, including the way in which they are formed from smaller units of meaning called morphemes. It originally comes from “morph,” the shape or form, and “ology,” the study of something. Morphology is important because it helps us understand how words are formed and how they relate to each other. It also helps us understand how words change over time and how they are related to other words in a language. For example, the word “unkindness” consists of three separate morphemes: the prefix “un-,” the root “kind,” and the suffix “-ness.” Pragmatics: This is the study of how language is used in a social context. Pragmatics is important because it helps us understand how language works in real-world situations, and how language can be used to convey meaning and achieve specific purposes. For example, if you offer to buy your friend a McDonald’s burger, a large fries, and a large drink, your friend may reply "no" because he is concerned about becoming fat. Your friend may simply mean the burger meal is high in calories, but the conversation can also imply he may be fat in a social context. Now, let’s understand NLG. NLG While NLU is concerned with reading for a computer to comprehend, NLG is about writing for a computer to write. The term generation in NLG refers to an NLP model generating meaningful words or even articles. Today, when you compose an email or type a sentence in an app, it presents possible words to complete your sentence or performs automatic correction. These are applications of NLG.  This content is from the book The Handbook of NLP with Gensim - By Chris Kuo (Oct 2023). Start reading a free chapter or access the entire Packt digital library free for 7 days by signing up now. To learn more, click on the button below. 
Read through the Chapter 1 unlocked here...  🌟 Secret Knowledge: AI/LLM Resources🏀 Unlocking AI Magic: A Primer on 7 Essential Libraries for Developers: Discover seven cutting-edge libraries to enhance development projects with advanced AI features. From CopilotTextarea for AI-driven writing in React apps to PrivateGPT for secure, locally processed document interactions, explore tools that elevate your projects and impress users. 🏀 Efficient LLM Fine-Tuning with QLoRA on a Laptop: Explore QLoRA, an efficient memory-saving method for fine-tuning large language models on ordinary CPUs. The QLoRA API supports NF4, FP4, INT4, and INT8 data types for quantization, utilizing methods like LoRA and gradient checkpointing to significantly reduce memory requirements. Learn to implement QLoRA on CPUs, leveraging Intel Extension for Transformers, with experiments showcasing its efficiency on consumer-level CPUs. 🏀 Rapid Deployment of Large Open Source LLMs with Runpod and vLLM’s OpenAI Endpoint: Learn to swiftly deploy open-source LLMs into applications with a tutorial, featuring the Llama-2 70B model and AutoGen framework. Utilize tools like Runpod and vLLM for computational resources and API endpoint creation, with a step-by-step guide and the option for non-gated models like Falcon-40B. 🏀 Understanding Strategies to Enhance Retrieval-Augmented Generation (RAG) Pipeline Performance: Learn optimization techniques for RAG applications by focusing on hyperparameters, tuning strategies, data ingestion, and pipeline preparation. Explore improvements in inferencing through query transformations, retrieval parameters, advanced strategies, re-ranking models, LLMs, and prompt engineering for enhanced retrieval and generation. 🏀 Understanding and Mitigating Biases and Toxicity in LLMs: Explore the impact of ethical guidelines on Large Language Model (LLM) development, examining measures adopted by companies like OpenAI and Google to address biases and toxicity. Research covers content generation, jailbreaking, and biases in diverse domains, revealing complexities and challenges in ensuring ethical LLMs.  🔛 Masterclass: AI/LLM Tutorials🎯 A Step-by-Step Guide to Streamlining LLM Data Processing for Efficient Pipelines: Learn to optimize the development loop for your LLM-powered recommendation system by addressing slow processing times in data pipelines. The solution involves implementing a Pipeline class to save inputs/outputs, enabling efficient error debugging. Enhance developer experience with individual pipeline stages as functions and consider future optimizations like error classes and concurrency. 🎯 Fine-Tuning Mistral Instruct 7B on the MedMCQA Dataset Using QLoRA: Explore fine-tuning Mistral Instruct 7B, an open-source LLM, for medical entrance exam questions using the MedMCQA dataset. Utilize Google Colab, GPTQ version, and LoRA technique for memory efficiency. The tutorial covers data loading, prompt creation, configuration, training setup, code snippets, and performance evaluation, offering a foundation for experimentation and enhancement. 🎯 Accelerating Large-Scale Training: A Comprehensive Guide to Amazon SageMaker Data Parallel Library: This guide details ways to boost Large Language Model (LLM) training speed with Amazon SageMaker's SMDDP. 
It addresses challenges in distributed training, emphasizing SMDDP's optimized AllGather for GPU communication bottleneck, exploring techniques like EFA network usage, GDRCopy coordination, and reduced GPU streaming multiprocessors for improved efficiency and cost-effectiveness on Amazon SageMaker. 🎯 Enhancing LoRA-Based Inference Speed: A Guide to Efficient LoRA Decomposition: The article highlights achieving three times faster inference for public LoRAs using the Diffusers library. It introduces LoRA, a parameter-efficient fine-tuning technique, detailing its decomposition process and benefits, including quick transitions and reduced warm-up and response times in the Inference API.  🚀 HackHub: Trending AI Tools⚽ tacju/maxtron: Unified meta-architecture for video segmentation, enhancing clip-level segmenters with within-clip and cross-clip tracking modules. ⚽ Tanuki/tanuki.py: Simplifies the creation of apps powered by LLMs in Python by seamlessly integrating well-typed, reliable, and stateless LLM-powered functions into applications. ⚽ roboflow/multimodal-maestro: Empowers developers with enhanced control over large multimodal models, enabling the achievement of diverse outputs through effective prompting tactics. ⚽ 03axdov/muskie: Python-based ML library that simplifies the process of dataset creation and model utilization, aiming to reduce code complexity. 

Google Bard: Everything You Need to Know to Get Started

Sangita Mahala
05 Oct 2023
6 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionGoogle Bard, a generative AI conversational chatbot. Initially from the LaMDA family of large language models and later the PaLM LLMs. It will follow your instructions and complete your requests in a better way. Bard uses its knowledge to answer your questions in an informative manner. It can generate different creative text formats, like poems, code, scripts, emails, letters, etc. Currently, it is available in 238 countries and 46 languages.Bard is a powerful tool which can be used for many different things, including:Writing and editing contentResearching and learning new thingsTranslating languagesGenerating new creative ideasAnswering questionsWhat is Google Bard and How does it work?Bard is a large language model, commonly referred to as a chatbot or conversational AI, that has been programmed to be extensive and informative. It is able to communicate and generate human-like text in response to a wide range of prompts and questions.Bard operates by using a method known as deep learning. Artificial neural networks are used in deep learning to learn from data. Deep learning is a subset of machine learning. The structure and operation of the human brain served as the inspiration for neural networks, which are capable of learning intricate patterns from data.The following illustrates how Bard works:You enter a command or query into Bard.The input or query is processed by Bard's neural network, which then produces a response.The reply from Bard is then shown to you.How to get started with Bard?It’s pretty simple and easy to get started using Google Bard. The following steps will let you know how you will be able to start your journey in Google Bard.Step-1Go to the Google Bard website by clicking this link: https://bard.google.com/Step-2Go to the top right corner of your screen and then click on the “Sign in” button.Step-3Once you are signed in, you can start typing your prompts or queries into the text box at the bottom of the screen.Step-4Your prompt or query will trigger Bard to produce an answer, which you may then read and evaluate.For Example:You can provide the prompt to Google Bard such as “Write a 'Hello, World!' program in the Rust programming language.”Prompt:Common Bard commands and how to use themGoogle Bard does not have any specific commands, but you can use certain keywords and phrases to get Bard to perform certain tasks. For example, you can use the following keywords and phrases to get Bard to:Create many creative text forms, such as "Write a script for...", "Write a poem about...", and "Write a code for...".Bard will perfectly answer your questions in a comprehensive and informative way: "What is the capital of India?", "How do I build a website?", "What are the benefits of using Bard?" Examples of what Bard can do and how to use it for specific tasksHere are some examples of what Bard can do and how to use it for specific tasks:Generate creative text formats: Bard allows you to generate a variety of unique text formats, including code, poems, emails, and letters. You have to simply input the required format, followed by a question or query, to get the things done. 
For example, to generate an email to your manager requesting a raise in salary, you would type “Write an email to your manager asking for a raise."Prompt:Answer your questions in a comprehensive and informative way: No matter how difficult or complex your query might be, Bard can help you to find the solution. So you have to simply enter your query into the text box, and Bard will generate a response likewise. For example, to ask Bard what is the National Flower of India is, you would type "What is the National Flower of India?".Prompt: Translate languages: Bard will allow to convert text between different languages. To do this, simply type the text that you want to translate into the text box, followed by the provided language from your side. For example, to translate the sentence: I am going to the store to buy some groceries into Hindi, you would type "I am going to the store to buy some groceries to Hindi".Prompt:How Bard Can Be Used for Different PurposesA writer can use Bard to generate new concepts for fresh stories or articles or to summarize research results.Students can utilize Bard to create essays and reports, get assistance with their assignments, and master new topics.A business owner can use Bard to produce marketing materials, develop chatbots for customer support, or perform data analysis.Software developers can use Bard to produce code, debug applications, or find solutions for technical issues.The future of Google BardGoogle Bard is probably going to get increasingly more capable and adaptable as it keeps developing and acquiring knowledge. It might be utilized to produce new works of art and entertainment, advance scientific research, and provide solutions to some of the most critical global problems.It's also crucial to remember that Google Bard is not the only significant language model being developed. Similar technologies are being created as well by various other companies, such as Microsoft and OpenAI. This competition is likely to drive innovation and lead to even more powerful and sophisticated language models in the future.Overall, the future of Google Bard and other substantial language models seems quite bright overall. These innovations have the power to completely transform the way we study, work, and produce. There is no doubt that these technologies have the potential to improve the world, but it is necessary to use them effectively.ConclusionGoogle Bard is a powerful AI tool that has the potential to be very beneficial, including writing and editing content, researching and learning new things, translating languages, generating new creative ideas, and answering questions. Being more productive and saving time are two of the greatest advantages of utilizing Google Bard. This can free up your time so that you can concentrate on other things, including developing new ideas or enhancing your skills. Bard can assist you in finding and understanding the data quickly and easily because it has access to a wide amount of information. It has the potential to be a game-changer for many people. If you are looking for a way to be more productive, I encourage you to try using Bard.Author BioSangita Mahala is a passionate IT professional with an outstanding track record, having an impressive array of certifications, including 12x Microsoft, 11x GCP, 2x Oracle, and LinkedIn Marketing Insider Certified. She is a Google Crowdsource Influencer and IBM champion learner gold. She also possesses extensive experience as a technical content writer and accomplished book blogger. 
She is always committed to staying up to date with emerging trends and technologies in the IT sector.

LLMs For Extractive Summarization in NLP

Mostafa Ibrahim
20 Nov 2023
7 min read
Introduction
In today's era, filtering out vital information from the overwhelming volume of data has become crucial. As we navigate vast amounts of information, the significance of adept text summarization becomes clear. This process not only conserves our time but also optimizes the use of resources, ensuring we focus on what truly matters.
In this article, we will delve into the intricacies of text summarization, particularly focusing on the role of Large Language Models (LLMs) in the process. We'll explore their foundational principles, their capabilities in extractive summarization, and the advanced techniques they deploy. Moreover, we'll shed light on the challenges they face and the innovative solutions proposed to overcome them. Without further ado, let's dive in!
What are LLMs?
LLMs, standing for Large Language Models, are intricate computational structures designed for the detailed analysis and understanding of text. They fall under the realm of Natural Language Processing, a domain dedicated to enabling machines to interpret human language. One of the distinguishing features of LLMs is their vast scale, equipped with an abundance of parameters that facilitate the storage of extensive linguistic data. In the context of summarization, two primary techniques emerge: extractive and abstractive. Extractive summarization involves selecting pertinent sentences or phrases directly from the source material, whereas abstractive summarization synthesizes new sentences that encapsulate the core message in a more condensed manner. With their advanced linguistic comprehension, LLMs are instrumental in both methods, but their proficiency in extractive summarization is notably prominent.
Why Utilize LLMs for Extractive Summarization?
Extractive summarization entails selecting crucial sentences or phrases from a source document to compose a concise summary. Achieving this demands an intricate and thorough grasp of the document's content, especially when it pertains to extensive and multifaceted texts.
The expansive architecture of LLMs, including state-of-the-art models like ChatGPT, grants them the capability to process and analyze substantial volumes of text, surpassing the limitations of smaller models like BERT, which can handle only 512 tokens. This considerable size and intricate design allow LLMs to produce richer and more detailed representations of content.
LLMs excel not only in recognizing the overt details but also in discerning the implicit or subtle nuances embedded within a text. Given their profound understanding, LLMs are uniquely positioned to identify and highlight the sentences or phrases that truly encapsulate the essence of any content, making them indispensable tools for high-quality extractive summarization.
Techniques and Approaches with LLMs
Within the realm of Natural Language Processing (NLP), the deployment of specific techniques to distill vast texts into concise summaries is of paramount importance. One such technique is sentence scoring. In this method, each sentence in a document is assigned a quantitative value, representing its relevance and importance.
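To make the idea concrete before moving on, here is a minimal, self-contained sketch of sentence scoring. It uses a simple TF-IDF similarity heuristic of our own, not the fine-tuned LLM scoring this article focuses on: each sentence is scored by its similarity to the document as a whole, and the top-scoring sentences form the extract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

document = (
    "Climate change is one of the biggest challenges of our time. "
    "Rising temperatures affect ecosystems and human societies alike. "
    "I had a sandwich for lunch. "
    "Cutting emissions and cooperating internationally are key responses."
)
sentences = [s.strip().rstrip(".") for s in document.split(". ") if s.strip()]

# Represent every sentence and the full document as TF-IDF vectors.
vectorizer = TfidfVectorizer(stop_words="english")
sentence_vectors = vectorizer.fit_transform(sentences)
document_vector = vectorizer.transform([" ".join(sentences)])

# Score = similarity between each sentence and the document as a whole.
scores = cosine_similarity(sentence_vectors, document_vector).ravel()

# Keep the two highest-scoring sentences, preserving their original order.
top_k = 2
selected = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:top_k])
print(". ".join(sentences[i] for i in selected) + ".")
Even this crude heuristic tends to keep the on-topic sentences and drop the irrelevant one, which is exactly the behavior a learned scorer refines.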
LLMs, owing to their extensive architectures, can be meticulously fine-tuned to carry out this scoring with high precision, ensuring that only the most pertinent content is selected for summarization.Next, we turn our attention to the attention visualization in LLMs. This technique provides a graphical representation of the segments of text to which the model allocates the most significance during processing. For extractive summarization, this visualization serves as a crucial tool, as it offers insights into which sections of the text the model deems most relevant.Lastly, the integration of hierarchical models enhances the capabilities of LLMs further. These models approach texts in a structured manner, segmenting them into defined chunks before processing each segment for summarization. The inherent capability of LLMs to process lengthy sequences means they can operate efficiently at both the segmentation and the summarization stages, ensuring a comprehensive analysis of extended documents.Practical Implementation of Extractive Summarization Using LLMsIn this section, we offer a hands-on experience by providing a sample code snippet that utilizes a pre-trained Large Language Model known as bert for text summarization. In order to specify extractive summarization we will be using the bert-extractive-summarizer package, which is an extension of the Hugging Face Transformers library. This package provides a simple way to use BERT for extractive summarization.Step 1: Install and Import Nesseccary Libraries!pip install bert-extractive-summarizer from summarizer import SummarizerStep 2: Load the Extractive Bert Summarization ModelIn our case, the LLM of choice is the t5 large model.model = Summarizer()Step 3:  Create a Sample Text to Summarizetext = """Climate change represents one of the most significant challenges facing the world today. It is characterized by changes in weather patterns, rising global temperatures, and increasing levels of greenhouse gases in the atmosphere. The impact of climate change is far-reaching, affecting ecosystems, biodiversity, and human societies across the globe. Scientists warn that immediate action is necessary to mitigate the most severe consequences of this global phenomenon. Strategies to address climate change include reducing carbon emissions, transitioning to renewable energy sources, and conserving natural habitats. International cooperation is crucial, as the effects of climate change transcend national borders, requiring a unified global response. The Paris Agreement, signed by 196 parties at the COP 21 in Paris on 12 December 2015, is one of the most comprehensive international efforts to combat climate change, aiming to limit global warming to well below 2 degrees Celsius."""Step 4: Performing Extractive SummarizationIn this step, we'll be performing extractive summarization, explicitly instructing the model to generate a summary consisting of the two sentences deemed most significant.summary = model(text, num_sentences=2)  # You can specify the number of sentences in the summary print("Extractive Summary:") print(summary)Output for Extractive Summary: Climate change represents one of the most significant challenges facing the world today. The impact of climate change is far-reaching, affecting ecosystems, biodiversity, and human societies across the globe.Challenges and Overcoming ThemThe journey of extractive summarization using LLMs is not without its bumps. A significant challenge is redundancy. 
Extractive models, in their quest to capture important sentences, might pick multiple sentences conveying similar information, leading to repetitive summaries.Then there's the issue of coherency. Unlike abstractive summarization, where models generate summaries, extractive methods merely extract. The outcome might not always flow logically, hindering a reader's understanding and detracting from the quality.To combat these challenges, refined training methods can be employed. Training data can be curated to include diverse sentence structures and content, pushing the model to discern nuances and reduce redundancy. Additionally, reinforcement learning techniques can be integrated, where the model is rewarded for producing non-redundant, coherent summaries and penalized for the opposite. Over time, through continuous feedback and iterative training, LLMs can be fine-tuned to generate crisp, non-redundant, and coherent extractive summaries.ConclusionIn conclusion, the realm of text summarization, enhanced by the capabilities of Large Language Models (LLMs), presents a dynamic and evolving landscape. Throughout this article, we've journeyed through the foundational aspects of LLMs, their prowess in extractive summarization, and the methodologies and techniques they adopt.While challenges persist, the continuous advancements in the field promise innovative solutions on the horizon. As we move forward, the relationship between LLMs and text summarization will undoubtedly shape the future of how we process and understand vast data volumes efficiently.Author BioMostafa Ibrahim is a dedicated software engineer based in London, where he works in the dynamic field of Fintech. His professional journey is driven by a passion for cutting-edge technologies, particularly in the realms of machine learning and bioinformatics. When he's not immersed in coding or data analysis, Mostafa loves to travel.Medium
Unleashing the Potential of GPUs for Training LLMs

Shankar Narayanan
22 Sep 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionThere is no doubt about Language Models being the true marvels in the arena of artificial intelligence. These sophisticated systems have the power to manipulate human language, understand, and even generate with astonishing accuracy.However, one can often complain about the immense computational challenges beyond these medical abilities. For instance, LLM training requires the incorporation of complex mathematical operations along with the processing of vast data. This is where the Graphics Processing Units (GPU) come into play. It serves as the engine that helps to power the language magic.Let me take you through the GPU advancement and innovations to support the Language Model. Parallely, we will explore how Nvidia helps revolutionize the enterprise LLM use cases.Role of GPUs in LLMs To understand the significance of GPU, let us first understand the concept of LLM.What is LLM?LLM or Large Language Models are AI systems that help generate human language. They have various applications, including translation services, sentiment analysis, chatbots, and content generation. Generative Pre-trained Transformer or GPT models, including BERT and GPT3, are popular among every LLM.These models require training, including vast data sets with billions of phrases and words. The model learns to predict while mastering the nuances and structure of language. It is like an intricate puzzle that requires enormous computational power.The need for GPUsThe Graphics Processing Units are specifically designed to undergo parallel processing. This characteristic makes them applicable to train the LLMs. The GPU can tackle thousands of tasks simultaneously, unlike the Central Processing Unit or CPU, which excels at handling sequential tasks.The training of a Large Language Model is like a massive jigsaw puzzle. Each puzzle piece represents a smaller portion of the model's language understanding. Using a CPU could only help one to work on one of these pieces at a simple time. But with GPU, one could work on various pieces parallelly while speeding up the whole process.Besides, GPU offers high computational throughput that one requires for complex mathematical operations. Their competency lies in metric multiplication, one of the fundamentals of neural network training. All these attributes make GPU indispensable for deep learning tasks like LLMs.Here is one of the practical example of how GPU works in LLM training: (Python)import time import torch # Create a large random dataset data = torch.randn(100000, 1000) # Training with CPU start_time = time.time() for _ in range(100):    model_output = data.matmul(data) cpu_training_time = time.time() - start_time print(f"CPU Training Time: {cpu_training_time:.2f} seconds") # Training with GPU if torch.cuda.is_available():    data = data.cuda()    start_time = time.time()    for _ in range(100):        model_output = data.matmul(data)    gpu_training_time = time.time() - start_time    print(f"GPU Training Time: {gpu_training_time:.2f} seconds") else:    print("GPU not available.")GPU Advancements and LLMDue to the rising demands of LLMs and AI, GPU technology is evolving rapidly. These advancements, however, play a significant role in constituting the development of sophisticated language models.One such advancement is the increase in GPU memory capacity. 
Technically, the larger model requires more excellent memory to process massive data sets. Hence, modern GPUs offer substantial memory capacity, allowing researchers to build and train more substantial large language models.One of the critical aspects of training a Large Language Model is its speed. Sometimes, it can take months to prepare and train a large language model. But with the advent of faster GPU, things have changed dramatically. The quicker GPU reduces the training time and accelerates research and development. Apart from that, it also reduces the energy consumption that is often associated with training these large models.Let us explore the memory capacity of the GPU using a code snippet.(Python)import torch # Check GPU memory capacity if torch.cuda.is_available():    gpu_memory = torch.cuda.get_device_properties(0).total_memory    print(f"GPU Memory Capacity: {gpu_memory / (1024**3):.2f} GB") else:    print("GPU not available.")For the record, Nvidia's Tensor Core technology has been one of the game changers in this aspect. It accelerates one of the core operations in deep learning, i.e., the matrix computation process, allowing the LLMs to train faster and more efficiently.Using matrix Python and PYTorh, you can showcase the speedup with GPU processing.import time import torch # Create large random matrices matrix_size = 1000 cpu_matrix = torch.randn(matrix_size, matrix_size) gpu_matrix = torch.randn(matrix_size, matrix_size).cuda()  # Move to GPU # Perform matrix multiplication with CPU start_time = time.time() result_cpu = torch.matmul(cpu_matrix, cpu_matrix) cpu_time = time.time() - start_time # Perform matrix multiplication with GPU start_time = time.time() result_gpu = torch.matmul(gpu_matrix, gpu_matrix) gpu_time = time.time() - start_time print(f"CPU Matrix Multiplication Time: {cpu_time:.4f} seconds") print(f"GPU Matrix Multiplication Time: {gpu_time:.4f} seconds")Nvidia's Contribution to GPU InnovationRegarding GPU innovation, the presence of Nvidia cannot be denied. It has a long-standing commitment to Machine Learning and advancing AI. Hence, it is a natural ally for the large language model community.Here is how Tensor Cores can be utilized with PYTorch.import torch # Enable Tensor Cores (requires a compatible GPU) if torch.cuda.is_available():    torch.backends.cuda.matmul.allow_tf32 = True # Create a tensor x = torch.randn(4096, 4096, device="cuda") # Perform matrix multiplication using Tensor Cores result = torch.matmul(x, x)It is interesting to know that Nvidia's graphics processing unit has powered several breakthroughs in LLM and AI models. BERT and GPT3 are known to harness the computational might of Nvidia's Graphics Processing Unit to achieve remarkable capabilities. Nvidia's dedication to the Artificial Intelligence world encompasses power and efficiency. The design of the graphics processing unit handles every AI workload with optimal performance per watt. It makes Nvidia one of the eco-friendly options for Large Language Model training procedures.As part of AI-focused hardware and architecture, the Tensor Core technology enables efficient and faster deep learning. This technology is instrumental in pushing the boundaries of LLM research.Supporting Enterprise LLM Use-caseThe application of LLM has a far-fetched reach, extending beyond research, labs, and academia. Indeed, they have entered the enterprise world with a bang. 
From analyzing massive datasets for insights to automating customer support through chatbots, large language models are transforming how businesses operate.Here, the Nvidia Graphics Processing Unit supports the enterprise LLM use cases. Enterprises often require LLM to handle vast amounts of data in real-time. With optimized AI performance and parallel processing power, Nvidia's GPU can provide the needed acceleration for these applications.Various companies across industries are harnessing the Nvidia GPU for developing LLM-based solutions to automate tasks, provide better customer experiences, and enhance productivity. From healthcare organizations analyzing medical records to financial institutions and predicting market trends, Nvidia drives enterprise LLM innovations.ConclusionNvidia continues to be the trailblazer in the captivating journey of training large language models. They are not only the hardware muscle for LLM but constantly innovate to make GPU capable and efficient with each generation.LLM is on the run to become integral to our daily lives. From business solutions to personal assistants, Nvidia's commitment to its GPU innovation ensures more power to the growth of language models. The synergy between AI and Nvidia GPU is constantly shaping the future of enterprise LLM use cases, helping organizations to achieve new heights in innovation and efficiency.Frequently Asked Questions1. How does the GPU accelerate the training process of large language models?The Graphics Processing Unit has parallel processing capabilities to allow the work of multiple tasks simultaneously. Such parallelism helps train Large Language Models by efficiently processing many components in understanding and generating human language.2. How does Nvidia contribute to GPU innovation for significant language and AI models?Nvidia has developed specialized hardware, including Tensor Core, optimized for AI workloads. The graphic processing unit of Nvidia powered numerous AI breakthroughs while providing efficient AI hardware to advance the development of Large Language Models.3. What are the expectations for the future of GPU innovation and launch language model?The future of GPU innovation promises efficient, specialized, and robust hardware tailored to the needs of AI applications and Large Language Models. It will continuously drive the development of sophisticated language models while opening up new possibilities for AI-power solutions.Author BioShankar Narayanan (aka Shanky) has worked on numerous different cloud and emerging technologies like Azure, AWS, Google Cloud, IoT, Industry 4.0, and DevOps to name a few. He has led the architecture design and implementation for many Enterprise customers and helped enable them to break the barrier and take the first step towards a long and successful cloud journey. He was one of the early adopters of Microsoft Azure and Snowflake Data Cloud. Shanky likes to contribute back to the community. He contributes to open source is a frequently sought-after speaker and has delivered numerous talks on Microsoft Technologies and Snowflake. He is recognized as a Data Superhero by Snowflake and SAP Community Topic leader by SAP.

Build an AI-based Personal Financial Advisor with LangChain

Louis Owen
09 Oct 2023
11 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionManaging personal finances is a cumbersome process. Receipts, bank statements, credit card bills, and expense records accumulate quickly. Despite our best intentions, many of us often struggle to maintain a clear overview of our financial health. It's not uncommon to overlook expenses or procrastinate on updating our budgets. Inevitably, this leads to financial surprises and missed opportunities for financial growth.Even when we diligently track our expenses, the data can be overwhelming. Analyzing every transaction to identify patterns, pinpoint areas for improvement, and set financial goals is no easy feat. It's challenging to answer questions like, "Am I spending too much on entertainment?" or "Is my investment portfolio well-balanced?" or "Should I cut back on dining out?" or "Do I need to limit socializing with friends?" and "Is it time to increase my investments?"Imagine having a personal assistant capable of automating these financial tasks, effortlessly transforming your transaction history into valuable insights. What if, at the end of each month, you received comprehensive financial analyses and actionable recommendations? Thanks to the rapid advancements in generative AI, this dream has become a reality, made possible by the incredible capabilities of LLM. No more endless hours of spreadsheet tinkering or agonizing over budgeting apps.In this article, we'll delve into the development of a personal financial advisor powered by LangChain. This virtual assistant will not only automate the tracking of your finances but also provide tailored recommendations based on your unique spending patterns and financial goals.Building an AI-powered personal financial advisor is an exciting endeavor. Here's an overview of how the personal financial advisor operates:Data Input: Users upload their personal transaction history, which includes details of income, expenses, investments, and savings.Data Processing: LangChain with LLM in the backend will process the data, categorize expenses, identify trends, and compare your financial activity with your goals and benchmarks.Financial Analysis: The advisor generates a detailed financial analysis report, highlighting key insights such as spending habits, saving potential, and investment performance.Actionable Recommendations: The advisor will also provide actionable recommendations for the user. It can suggest adjustments to your budget, recommend investment strategies, and even propose long-term financial plans.The benefits of having an AI-powered personal financial advisor are numerous:Time-Saving: No more tedious data entry and manual budget tracking. 
The advisor handles it all, giving you more time for what matters most in your life.Personalized Insights: The advisor tailors recommendations based on your unique financial situation, ensuring they align with your goals and aspirations.Financial Confidence: With regular updates and guidance, you gain a better understanding of your financial health and feel more in control of your money.Long-Term Planning: The advisor’s ability to provide insights into long-term financial planning ensures you're well-prepared for your future.Without wasting any more time, let’s take a deep breath, make yourselves comfortable, and be ready to learn how to build your AI-based personal financial advisor with LangChain!What is LangChain?LangChain is developed to harness the incredible potential of LLM, LangChain enables the creation of applications that are not only context-aware but also capable of reasoning, all while maintaining a user-friendly and modular approach.LangChain is more than just a framework; it's a paradigm shift in the world of language model-driven applications. Here's a closer look at what makes LangChain a transformative force:Context-Aware Applications: LangChain empowers applications to be context-aware. This means that these applications can connect to various sources of context, such as prompt instructions, few-shot examples, or existing content, to enhance the depth and relevance of their responses. Whether you're seeking answers or recommendations, LangChain ensures that responses are firmly grounded in context.Reasoning Abilities: One of LangChain's standout features is its ability to reason effectively. It leverages language models to not only understand context but also make informed decisions. These decisions can range from determining how to answer a given question based on the provided context to deciding what actions to take next. LangChain doesn't just provide answers; it understands the "why" behind them.Why LangChain?The power of LangChain lies in its value propositions, which make it an indispensable tool for developers and businesses looking to harness the potential of language models:Modular Components: LangChain offers a comprehensive set of abstractions for working with language models. These abstractions are not only powerful but also modular, allowing developers to work with them seamlessly, whether they're using the entire LangChain framework or not. This modularity simplifies the development process and promotes code reuse.Off-the-Shelf Chains: LangChain provides pre-built, structured assemblies of components known as "off-the-shelf chains." These chains are designed for specific high-level tasks, making it incredibly easy for developers to kickstart their projects. Whether you're a seasoned AI developer or a newcomer, these pre-configured chains save time and effort.Customization and Scalability: While off-the-shelf chains are fantastic for quick starts, LangChain doesn't restrict you. The framework allows for extensive customization, enabling developers to tailor existing chains to their unique requirements or even create entirely new ones. This flexibility ensures that LangChain can accommodate a wide range of applications, from simple chatbots to complex AI systems.LangChain isn't just a run-of-the-mill framework; it's a versatile toolkit designed to empower developers to create sophisticated language model-powered applications. At the heart of LangChain is a set of interconnected modules, each serving a unique purpose. 
These modules are the building blocks that make LangChain a powerhouse for AI application development.Model I/O: At the core of LangChain's capabilities is its ability to interact seamlessly with language models. This module facilitates communication with these models, enabling developers to leverage their natural language processing prowess effortlessly.Retrieval: LangChain recognizes that real-world applications require access to relevant data. The Retrieval module allows developers to integrate application-specific data sources into their projects, enhancing the context and richness of responses.Chains: Building upon the previous modules, Chains bring structure and order to the development process. Developers can create sequences of calls, orchestrating interactions with language models and data sources to achieve specific tasks or goals.Agents: Let chains choose which tools to use given high-level directives. Agents take the concept of automation to a new level. They allow chains to make intelligent decisions about which tools to employ based on high-level directives. This level of autonomy streamlines complex processes and enhances application efficiency.Memory: Memory is vital for continuity in applications. This module enables LangChain applications to remember and retain their state between runs of a chain, ensuring a seamless user experience and efficient data handling.Callbacks: Transparency and monitoring are critical aspects of application development. Callbacks provide a mechanism to log and stream intermediate steps of any chain, offering insights into the inner workings of the application and facilitating debugging.Building the Personal Financial AdvisorLet’s start building our personal financial advisor with LangChain! For the sake of simplicity, let’s consider only three data sources: monthly credit card statements, bank account statements, and cash expense logs. The following is an example of the data format for each of the sources. 
## Monthly Credit Card Statement Date: 2023-09-01 Description: Grocery Store Amount: $150.00 Balance: $2,850.00 Date: 2023-09-03 Description: Restaurant Dinner Amount: $50.00 Balance: $2,800.00 Date: 2023-09-10 Description: Gas Station Amount: $40.00 Balance: $2,760.00 Date: 2023-09-15 Description: Utility Bill Payment Amount: $100.00 Balance: $2,660.00 Date: 2023-09-20 Description: Salary Deposit Amount: $3,000.00 Balance: $5,660.00 Date: 2023-09-25 Description: Online Shopping Amount: $200.00 Balance: $5,460.00 Date: 2023-09-30 Description: Investment Portfolio Contribution Amount: $500.00 Balance: $4,960.00 ## Bank Account Statement Date: 2023-08-01 Description: Rent Payment Amount: $1,200.00 Balance: $2,800.00 Date: 2023-08-05 Description: Grocery Store Amount: $200.00 Balance: $2,600.00 Date: 2023-08-12 Description: Internet and Cable Bill Amount: $80.00 Balance: $2,520.00 Date: 2023-08-15 Description: Freelance Gig Income Amount: $700.00 Balance: $3,220.00 Date: 2023-08-21 Description: Dinner with Friends Amount: $80.00 Balance: $3,140.00 Date: 2023-08-25 Description: Savings Account Transfer Amount: $300.00 Balance: $3,440.00 Date: 2023-08-29 Description: Online Shopping Amount: $150.00 Balance: $3,290.00 ## Cash Expense Log Date: 2023-07-03 Description: Coffee Shop Amount: $5.00 Balance: $95.00 Date: 2023-07-10 Description: Movie Tickets Amount: $20.00 Balance: $75.00 Date: 2023-07-18 Description: Gym Membership Amount: $50.00 Balance: $25.00 Date: 2023-07-22 Description: Taxi Fare Amount: $30.00 Balance: -$5.00 (Negative balance indicates a debt) Date: 2023-07-28 Description: Bookstore Amount: $40.00 Balance: -$45.00 Date: 2023-07-30 Description: Cash Withdrawal Amount: $100.00 Balance: -$145.00To create our personal financial advisor, we’ll use the chat model interface provided by LangChain. There are several important components to build a chatbot with LangChain:`chat model`: Chat models are essential for creating conversational chatbots. These models are designed to generate human-like responses in a conversation. You can choose between chat models and LLMs (Large Language Models) depending on the tone and style you want for your chatbot. Chat models are well-suited for natural, interactive conversations.`prompt template`: Prompt templates help you construct prompts for your chatbot. They allow you to combine default messages, user input, chat history, and additional context to create meaningful and dynamic conversations. Using prompt templates makes it easier to generate responses that flow naturally in a conversation.`memory`: Memory in a chatbot context refers to the ability of the bot to remember information from previous parts of the conversation. This can be crucial for maintaining context and providing relevant responses. Memory types can vary depending on your use case, and they can include short-term and long-term memory.`retriever` (optional): Retrievers are components that help chatbots access domain-specific knowledge or retrieve information from external sources. If your chatbot needs to provide detailed, domain-specific information, a retriever can be a valuable addition to your system.First, we need to set the API key for our LLM. We’ll use OpenAI in this example.import os os.environ["OPENAI_API_KEY"] = “your openai key”Then, we can simply load the necessary chat modules from LangChain. 
from langchain.schema import ( AIMessage,    HumanMessage,    SystemMessage ) from langchain.chat_models import ChatOpenAIThe ChatOpenAI is the main class that connects with the OpenAI LLM. We can pass `HumanMessage` and `SystemMessage` to this class and it will return the response from the LLM in the type of `AIMessage`.chat = ChatOpenAI(model_name=”gpt-3.5-turbo”) messages = [SystemMessage(content=prompt),                    HumanMessage(content=data)] chat(messages)Let’s see the following example where we pass the prompt along with the data and the LLM returns the response via the ChatOpenAI object. Boom! We just got our first analysis and recommendation from our personal financial advisor. This is a very simple example of how to create our personal financial advisor. Of course, there’s still a lot of room for improvement. For example, currently, we need to pass manually the relevant data sources as the HumanMessage. However, as mentioned before, LangChain provides a built-in class to perform retrieval. This means that we can just create another script to automatically dump all of the relevant data into some sort of document or even database, and then LangChain can directly read the data directly from there. Hence, we can get automated reports every month without needing to manually input the relevant data.ConclusionCongratulations on keeping up to this point! Throughout this article, you have learned what is LangChain, what it is capable of, and how to build a personal financial advisor with LangChain. Hope the best for your experiment in creating your personal financial advisor and see you in the next article!Author BioLouis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects. Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.

Building an API for Language Model Inference using Rust and Hyper - Part 1

Alan Bernardo Palacio
31 Aug 2023
7 min read
IntroductionIn the landscape of artificial intelligence, the capacity to bring sophisticated Large Language Models (LLMs) to commonplace applications has always been a sought-after goal. Enter LLM, a groundbreaking Rust library crafted by Rustformers, designed to make this dream a tangible reality. By focusing on the intricate synergy between the LLM library and the foundational GGML project, this toolset pushes the boundaries of what's possible, enabling AI enthusiasts to harness the sheer might of LLMs on conventional CPUs. This shift in dynamics owes much to GGML's pioneering approach to model quantization, streamlining computational requirements without sacrificing performance.In this comprehensive guide, we'll embark on a journey that starts with understanding the essence of the llm crate and its seamless interaction with a myriad of LLMs. Delving into its intricacies, we'll illuminate how to integrate, interact, and infer using these models. And as a tantalizing glimpse into the realm of practical application, our expedition won't conclude here. In the subsequent installment, we'll rise to the challenge of crafting a web server in Rust—one that confidently runs inference directly on a CPU, making the awe-inspiring capabilities of AI not just accessible, but an integral part of our everyday digital experiences.This is a two-part article in the first section we will discuss the basic interaction with the library and in the following we build a server in Rust that allow us to build our own web applications using state-of-the-art LLMs. Let’s begin with it.Harnessing the Power of Large Language ModelsAt the very core of LLM's architecture resides the GGML project, a tensor library meticulously crafted in the C programming language. GGML, short for "General GPU Machine Learning," serves as the bedrock of LLM, enabling the intricate orchestration of large language models. Its quintessence lies in a potent technique known as model quantization.Model quantization, a pivotal process employed by GGML, involves the reduction of numerical precision within a machine-learning model. This entails transforming the conventional 32-bit floating-point numbers frequently used for calculations into more compact representations such as 16-bit or even 8-bit integers.Quantization can be considered as the act of chiseling away unnecessary complexities while sculpting a model. Model quantization adeptly streamlines resource utilization without inordinate compromises on performance. By default, models lean on 32-bit floating-point numbers for their arithmetic operations. With quantization, this intricacy is distilled into more frugal formats, such as 16-bit integers or even 8-bit integers. It's an artful equilibrium between computational efficiency and performance optimization.GGML's versatility can be seen through a spectrum of quantization strategies: spanning 4, 5, and 8-bit quantization. Each strategy allows for improvement in efficiency and execution in different ways. For instance, 4-bit quantization thrives in memory and computational frugality, although it could potentially induce a performance decrease compared to the broader 8-bit quantization.The Rustformers library allows to integration of different language models including Bloom, GPT-2, GPT-J, GPT-NeoX, Llama, and MPT. To use these models within the Rustformers library, they undergo a transformation to align with GGML's technical underpinnings. 
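To build intuition for what quantization does, the following is a tiny illustrative sketch in plain NumPy. It is a deliberately simplified symmetric 8-bit scheme of our own, not GGML's actual 4/5/8-bit formats: a float32 weight vector is mapped onto int8 values with a single scale factor, shrinking its memory footprint at the cost of a small reconstruction error.
import numpy as np

# Pretend these are model weights stored in 32-bit floating point.
weights = np.random.randn(4096).astype(np.float32)

# Symmetric 8-bit quantization: map values onto the int8 range [-127, 127]
# using one scale factor for the whole vector.
scale = np.abs(weights).max() / 127.0
q_weights = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize when the values are needed for computation.
dq_weights = q_weights.astype(np.float32) * scale

print(f"Memory: {weights.nbytes} bytes -> {q_weights.nbytes} bytes")
print(f"Mean absolute reconstruction error: {np.abs(weights - dq_weights).mean():.6f}")
In practice, schemes such as GGML's operate on small blocks of weights, each with its own scale factor, which keeps this error low even at 4 or 5 bits per weight.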
The authorship has generously provided pre-engineered models on the Hugging Face platform, facilitating seamless integration.In the next sections, we will use the llm crate to run inference on LLM models like Llama. The realm of AI innovation is beckoning, and Rustformers' LLM, fortified by GGML's techniques, forms an alluring gateway into its intricacies.Getting Started with LLM-CLIThe Rustformers group has the mission of amplifying access to the prowess of large language models (LLMs) at the forefront of AI evolution. The group focuses on harmonizing with the rapidly advancing GGML ecosystem – a C library harnessed for quantization, enabling the execution of LLMs on CPUs. The trajectory extends to supporting diverse backends, embracing GPUs, Wasm environments, and more.For Rust developers venturing into the realm of LLMs, the key to unlocking this potential is the llm crate – the gateway to Rustformers' innovation. Through this crate, Rust developers interface with LLMs effortlessly. The "llm" project also offers a streamlined CLI for interacting with LLMs and examples showcasing its integration into Rust projects. More insights can be gained from the GitHub repository or its official documentation for released versions.To embark on your LLM journey, initiate by installing the LLM-CLI package. This package materializes the model's essence onto your console, allowing for direct inference.Getting started is a streamlined process:Clone the repository.Install the llm-cli tool from the repository.Download your chosen models from Hugging Face. In our illustration, we employ the Llama model with 4-bit quantization.Run inference on the model using the CLI tool and reference the model and architecture of the model downloaded previously.So let’s start with it. First, let's install llm-cli using this command:cargo install llm-cli --git <https://github.com/rustformers/llm>Next, we proceed by fetching your desired model from Hugging Face:curl -LO <https://huggingface.co/rustformers/open-llama-ggml/resolve/main/open_llama_3b-f16.bin>Finally, we can initiate a dialogue with the model using a command akin to:llm infer -a llama -m open_llama_3b-f16.bin -p "Rust is a cool programming language because"We can see how the llm crate stands to facilitate seamless interactions with LLMs.This project empowers developers with streamlined CLI tools, exemplifying the LLM integration into Rust projects. With installation and model preparation effortlessly explained, the journey toward LLM proficiency commences. As we transition to the culmination of this exploration, the power of LLMs is within reach, ready to reshape the boundaries of AI engagement.Conclusion: The Dawn of Accessible AI with Rust and LLMIn this exploration, we've delved deep into the revolutionary Rust library, LLM, and its transformative potential to bring Large Language Models (LLMs) to the masses. No longer is the prowess of advanced AI models locked behind the gates of high-end GPU architectures. With the symbiotic relationship between the LLM library and the underlying GGML tensor architecture, we can seamlessly run language models on standard CPUs. This is made possible largely by the potent technique of model quantization, which GGML has incorporated. By optimizing the balance between computational efficiency and performance, models can now run in environments that were previously deemed infeasible.The Rustformers' dedication to the cause shines through their comprehensive toolset. 
Their offerings extend from pre-engineered models on Hugging Face, ensuring ease of integration, to a CLI tool that simplifies the very interaction with these models. For Rust developers, the horizon of AI integration has never seemed clearer or more accessible.As we wrap up this segment, it's evident that the paradigm of AI integration is rapidly shifting. With tools like the llm crate, developers are equipped with everything they need to harness the full might of LLMs in their Rust projects. But the journey doesn't stop here. In the next part of this series, we venture beyond the basics, and into the realm of practical application. Join us as we take a leap forward, constructing a web server in Rust that leverages the llm crate.Author BioAlan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn 

Detecting Anomalies Using LLM Sentence Embeddings

Alan Bernardo Palacio
21 Aug 2023
18 min read
Introduction

Text classification tasks such as natural language inference (NLI) are a central part of modern natural language processing (NLP). In this article, we present an application of unsupervised machine learning techniques to detect anomalies in the MultiNLI dataset. Our aim is to use unsupervised Large Language Models (LLMs) to create embeddings and discover patterns and relationships within the data. We'll preprocess the data, generate sentence pair embeddings, and use the Out-Of-Distribution (OOD) module from the cleanlab Python package to get outlier scores.

Importing Libraries and Setting Seeds

The following block of code is essentially the initial setup phase of our data processing and analysis script. Here, we import all the necessary libraries and packages that will be used throughout the code. First, we need to install some of the necessary libraries:

!pip install cleanlab datasets hdbscan nltk matplotlib numpy torch transformers umap-learn

It is highly recommended to use Google Colab with GPUs or TPUs to be able to create the embeddings in a proper amount of time. Now we can import the libraries and set a random seed for reproducibility:

import cleanlab
import datasets
import hdbscan
import nltk
import matplotlib.pyplot as plt
import numpy as np
import re
import torch
from cleanlab.outlier import OutOfDistribution
from datasets import load_dataset, concatenate_datasets
from IPython.display import display
from sklearn.metrics import precision_recall_curve
from torch.utils.data import DataLoader
from tqdm.auto import tqdm
from transformers import AutoTokenizer, AutoModel
from umap import UMAP

SEED = 42  # any fixed value works; it only needs to stay constant across runs

nltk.download('stopwords')
datasets.logging.set_verbosity_error()
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.cuda.manual_seed_all(SEED)

Here's what each imported library/package does:

cleanlab: A package used for finding label errors in datasets and learning with noisy labels.
datasets: Provides easy-to-use, high-level APIs for downloading and preparing datasets for modeling.
hdbscan: A clustering algorithm that combines the benefits of hierarchical clustering and density-based spatial clustering of applications with noise (DBSCAN).
nltk: Short for Natural Language Toolkit, a leading platform for building Python programs to work with human language data.
torch: PyTorch is an open-source machine learning library based on the Torch library, used for applications such as natural language processing.

This part of the code also downloads the NLTK (Natural Language Toolkit) stopwords. Stopwords are words like 'a', 'an', and 'the', which are not typically useful for modeling and are often removed during pre-processing. The datasets.logging.set_verbosity_error() call sets the logging level to error, meaning that only messages with the level error or above will be displayed. The code also sets some additional properties for CUDA operations (if a CUDA-compatible GPU is available), which help ensure consistency across different executions of the code.

Dataset Preprocessing and Loading

The following block of code represents the next major phase: preprocessing and loading the datasets.
This is where we clean and prepare our data so that it can be fed into our LLM models:def preprocess_datasets(    *datasets,    sample_sizes = [5000, 450, 450],    columns_to_remove = ['premise_binary_parse', 'premise_parse', 'hypothesis_binary_parse', 'hypothesis_parse', 'promptID', 'pairID', 'label'], ):    # Remove -1 labels (no gold label)    f = lambda ex: ex["label"] != -1    datasets = [dataset.filter(f) for dataset in datasets]    # Sample a subset of the data    assert len(sample_sizes) == len(datasets), "Number of datasets and sample sizes must match"    datasets = [        dataset.shuffle(seed=SEED).select([idx for idx in range(sample_size)])        for dataset, sample_size in zip(datasets, sample_sizes)    ]    # Remove columns    datasets = [data.remove_columns(columns_to_remove) for data in datasets]    return datasetsThis is a function definition for preprocess_datasets, which takes any number of datasets (with their sample sizes and columns to be removed specified as lists). The function does three main things:Filtering: Removes examples where the label is -1. A label of -1 means that there is no gold label for that example.Sampling: Shuffles the datasets and selects a specific number of examples based on the provided sample_sizes.Removing columns: Drops specific columns from the dataset as per the columns_to_remove list.train_data = load_dataset("multi_nli", split="train") val_matched_data = load_dataset("multi_nli", split="validation_matched") val_mismatched_data = load_dataset("multi_nli", split="validation_mismatched") train_data, val_matched_data, val_mismatched_data = preprocess_datasets(    train_data, val_matched_data, val_mismatched_data )The above lines load the train and validation datasets from multi_nli (a multi-genre natural language inference corpus) and then preprocess them using the function we just defined.Finally, we print the genres available in each dataset and display the first few records using the Pandas data frame. This is useful to confirm that our datasets have been loaded and preprocessed correctly:print("Training data") print(f"Genres: {np.unique(train_data['genre'])}") display(train_data.to_pandas().head()) print("Validation matched data") print(f"Genres: {np.unique(val_matched_data['genre'])}") display(val_matched_data.to_pandas().head()) print("Validation mismatched data") print(f"Genres: {np.unique(val_mismatched_data['genre'])}") display(val_mismatched_data.to_pandas().head())With the help of this block, we have our datasets loaded and preprocessed, ready to be transformed into vector embeddings.Sentence Embedding and TransformationNow, we proceed to the next crucial step, transforming our textual data into numerical vectors. This is where text or sentence embeddings come into play.In simple terms, sentence embeddings are the numerical representations of sentences. Just as words can be represented by dense vectors (a process known as word embeddings), entire sentences can also be encoded into vectors. This transformation process facilitates mathematical operations on text, making it possible for machine learning algorithms to perform tasks like text classification, sentence similarity, sentiment analysis, and more.To produce high-quality sentence embeddings, the context of each word in the sentence and the semantics should be considered. 
Transformer-based models, like BERT, DistilBERT, or RoBERTa, are very effective in creating these contextual sentence embeddings.Now, let's explain the next block of code:#Mean Pooling - Take attention mask into account for correct averaging def mean_pooling(model_output, attention_mask):    token_embeddings = model_output[0]    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)This function mean_pooling is used to calculate the mean of all token embeddings that belong to a single sentence. The function receives the model_output (containing the token embeddings) and an attention_mask (indicating where actual tokens are and where padding tokens are in the sentence). The mask is used to correctly compute the average over the length of each sentence, ignoring the padding tokens.The function embed_sentence_pairs processes the sentence pairs, creates their embeddings, and stores them. It uses a data loader (which loads data in batches), a tokenizer (to convert sentences into model-understandable format), and a pre-trained language model (to create the embeddings).The function is a vital part of the sentence embedding process. This function uses a language model to convert pairs of sentences into high-dimensional vectors that represent their combined semantics. Here's an annotated walkthrough:def embed_sentence_pairs(dataloader, tokenizer, model, disable_tqdm=False):    # Empty lists are created to store the embeddings of premises and hypotheses    premise_embeddings  = []    hypothesis_embeddings = []    feature_embeddings = []    # The device (CPU or GPU) to be used for computations is determined    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")    # The model is moved to the chosen device and set to evaluation mode    model.to(device)    model.eval()    # A loop is set up to iterate over the data in the dataloader    loop = tqdm(dataloader, desc=f"Embedding sentences...", disable=disable_tqdm)    for data in loop:        # The premise and hypothesis sentences are extracted from the data       premise, hypothesis = data['premise'], data['hypothesis']        # The premise and hypothesis sentences are encoded into a format that the model can understand        encoded_premise, encoded_hypothesis = (            tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')            for sentences in (premise, hypothesis)        )        # The model computes token embeddings for the encoded sentences        with torch.no_grad():            encoded_premise = encoded_premise.to(device)            encoded_hypothesis = encoded_hypothesis.to(device)            model_premise_output = model(**encoded_premise)            model_hypothesis_output = model(**encoded_hypothesis)        # Mean pooling is performed on the token embeddings to create sentence embeddings        pooled_premise = mean_pooling(model_premise_output, encoded_premise['attention_mask']).cpu().numpy()        pooled_hypothesis = mean_pooling(model_hypothesis_output, encoded_hypothesis['attention_mask']).cpu().numpy()        # The sentence embeddings are added to the corresponding lists        premise_embeddings.extend(pooled_premise)        hypothesis_embeddings.extend(pooled_hypothesis)    # The embeddings of the premises and hypotheses are concatenated along with their absolute difference    feature_embeddings = np.concatenate(        [     
       np.array(premise_embeddings),            np.array(hypothesis_embeddings),            np.abs(np.array(premise_embeddings) - np.array(hypothesis_embeddings))        ],        axis=1    )    return feature_embeddingsThis function does all the heavy lifting of turning raw textual data into dense vectors that machine learning algorithms can use. It takes in a dataloader, which feeds batches of sentence pairs into the function, a tokenizer to prepare the input for the language model, and the model itself to create the embeddings.The embedding process involves first tokenizing each sentence pair and then feeding the tokenized sentences into the language model. This yields a sequence of token embeddings for each sentence. To reduce these sequences to a single vector per sentence, we apply a mean pooling operation, which takes the mean of all token vectors in a sentence, weighted by their attention masks.Finally, the function concatenates the embeddings of the premise and hypothesis of each pair, along with the absolute difference between these two embeddings. This results in a single vector that represents both the individual meanings of the sentences and the semantic relationship between them. The absolute difference between the premise and hypothesis embeddings helps to capture the semantic contrast in the sentence pair.These concatenated embeddings, returned by the function, serve as the final input features for further machine-learning tasks.The function begins by setting the device to GPU if it's available. It sets the model to evaluation mode using model.eval(). Then, it loops over the data loader, retrieving batches of sentence pairs.For each sentence pair, it tokenizes the premise and hypothesis using the provided tokenizer. The tokenized sentences are then passed to the model to generate the model outputs. Using these outputs, mean pooling is performed to generate sentence-level embeddings.Finally, the premise and hypothesis embeddings are concatenated along with their absolute difference, resulting in our final sentence pair embeddings. These combined embeddings capture the information from both sentences and the relational information between them, which are stored in feature_embeddings.These feature embeddings are critical and are used as input features for the downstream tasks. Their high-dimensional nature contains valuable semantic information which can help in various NLP tasks such as text classification, information extraction, and more.Sentence Embedding and TokenizingThis block of code takes care of model loading, data preparation, and finally, the embedding process for each sentence pair in our datasets. 
Here's an annotated walkthrough:# Pretrained SentenceTransformers handle this task better than regular Transformers model_name = 'sentence-transformers/all-MiniLM-L6-v2' # Uncomment the following line to try a regular Transformers model trained on MultiNLI # model_name = 'sileod/roberta-base-mnli' # Instantiate the tokenizer and model from the pretrained transformers on the Hugging Face Hub tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModel.from_pretrained(model_name) batch_size = 128 # Prepare the PyTorch DataLoaders for each of the train, validation matched, and validation mismatched datasets trainloader = DataLoader(train_data, batch_size=batch_size, shuffle=False) valmatchedloader = DataLoader(val_matched_data, batch_size=batch_size, shuffle=False) valmismatchedloader = DataLoader(val_mismatched_data, batch_size=batch_size, shuffle=False) # Use the embed_sentence_pairs function to create embeddings for each dataset train_embeddings = embed_sentence_pairs(trainloader, tokenizer, model, disable_tqdm=True) val_matched_embeddings = embed_sentence_pairs(valmatchedloader, tokenizer, model, disable_tqdm=True) val_mismatched_embeddings = embed_sentence_pairs(valmismatchedloader, tokenizer, model, disable_tqdm=True)This block begins by setting the model_name variable to the identifier of a pretrained SentenceTransformers model available on the Hugging Face Model Hub. SentenceTransformers are transformer-based models specifically trained for generating sentence embeddings, so they are generally more suitable for this task than regular transformer models. The MiniLM model was chosen for its relatively small size and fast inference times, but provides performance comparable to much larger models. If you wish to experiment with a different model, you can simply change the identifier.Next, the tokenizer and model corresponding to the model_name are loaded using the from_pretrained method, which fetches the necessary components from the Hugging Face Model Hub and initializes them for use.The DataLoader utility from the PyTorch library is then used to wrap our Hugging Face datasets. The DataLoader handles the batching of the data and provides an iterable over the dataset, which will be used by our embed_sentence_pairs function. The batch size is set to 128, which means that the model processes 128 sentence pairs at a time.Finally, the embed_sentence_pairs function is called for each of our data loaders (train, validation matched, and validation mismatched), returning the corresponding embeddings for each sentence pair in these datasets. These embeddings will be used as input features for our downstream tasks.Outlier Detection in DatasetsIn the realm of machine learning, outliers often pose a significant challenge. These unusual or extreme values can cause the model to make erroneous decisions based on data points that don't represent the general trend or norm in the data. Therefore, an essential step in data preprocessing for machine learning is identifying and handling these outliers effectively.In our project, we make use of the OutOfDistribution object from the cleanlab Python package to conduct outlier detection. The OutOfDistribution method computes an outlier score for each data point based on how well it fits within the overall distribution of the data. 
The lower the outlier score, the more anomalous the data point is considered to be (lower scores indicate examples that fit the training distribution poorly). Let's take a detailed look at how this is achieved in the code:

ood = OutOfDistribution()
train_outlier_scores = ood.fit_score(features=train_embeddings)

In the first step, we instantiate the OutOfDistribution object. Then, we fit this object to our training data embeddings and calculate outlier scores for each data point in the training data:

top_train_outlier_idxs = (train_outlier_scores).argsort()[:15]
top_train_outlier_subset = train_data.select(top_train_outlier_idxs)
top_train_outlier_subset.to_pandas().head()

Next, we select the 15 training data points with the lowest outlier scores, i.e., the most anomalous ones. These data points are then displayed for manual inspection, helping us understand the nature of these outliers. We then apply a similar process to our validation data:

test_feature_embeddings = np.concatenate([val_matched_embeddings, val_mismatched_embeddings], axis=0)
test_outlier_scores = ood.score(features=test_feature_embeddings)
test_data = concatenate_datasets([val_matched_data, val_mismatched_data])

First, we concatenate the matched and mismatched validation embeddings. Then, we calculate the outlier scores for each data point in this combined validation dataset using the previously fitted OutOfDistribution object:

top_outlier_idxs = (test_outlier_scores).argsort()[:20]
top_outlier_subset = test_data.select(top_outlier_idxs)
top_outlier_subset.to_pandas()

Lastly, we identify the 20 validation data points with the lowest outlier scores, i.e., the strongest outlier candidates. Similar to our approach with the training data, these potential outliers are selected and visualized for inspection. By conducting this outlier analysis, we gain valuable insights into our data. These insights can inform our decisions on data preprocessing steps, such as outlier removal or modification, to potentially enhance the performance of our machine learning model.

Evaluating Outlier Scores and Setting a Threshold

Once we have determined the outlier scores for each data point, the next step is to set a threshold for what we will consider an "outlier." While there are various statistical methods to determine this threshold, one simple and commonly used approach is to use percentiles. In this project, we choose to set the threshold at the 2.5th percentile of the outlier scores in the training data. This choice implies that we consider the bottom 2.5% of our data (in terms of their fit to the overall distribution) as outliers. Let's look at how this is implemented in the code:

threshold = np.percentile(train_outlier_scores, 2.5)

The code above calculates the 2.5th percentile of the outlier scores in the training data and sets this value as our threshold for outliers. Next, we visualize the distribution of outlier scores for both the training and test data:

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 5))
plt_range = [min(train_outlier_scores.min(), test_outlier_scores.min()),
             max(train_outlier_scores.max(), test_outlier_scores.max())]
axes[0].hist(train_outlier_scores, range=plt_range, bins=50)
axes[0].set(title='train_outlier_scores distribution', ylabel='Frequency')
axes[0].axvline(x=threshold, color='red', linewidth=2)
axes[1].hist(test_outlier_scores, range=plt_range, bins=50)
axes[1].set(title='test_outlier_scores distribution', ylabel='Frequency')
axes[1].axvline(x=threshold, color='red', linewidth=2)

In the histogram, the red vertical line represents the threshold value.
By observing the distributions and where the threshold falls, we get a visual representation of what proportion of our data is considered "outlying.":Finally, we select the outliers from our test data based on this threshold:sorted_ids = test_outlier_scores.argsort() outlier_scores = test_outlier_scores[sorted_ids] outlier_ids = sorted_ids[outlier_scores < threshold] selected_outlier_subset = test_data.select(outlier_ids) selected_outlier_subset.to_pandas().tail(15)This piece of code arranges the outlier scores in ascending order, determines which data points fall below the threshold (hence are considered outliers), and selects these data points from our test data. The bottom 15 rows of this selected outlier subset are then displayed:By setting and applying this threshold, we can objectively identify and handle outliers in our data. This process helps improve the quality and reliability of our LLM models.ConclusionThis article focuses on detecting anomalies in multi-genre NLI datasets using advanced tools and techniques, from preprocessing with transformers to outlier detection. The MultiNLI dataset was streamlined using Hugging Face's datasets library, enhancing manageability. Exploring sentence embeddings, transformers library generated robust representations by averaging token embeddings with mean_pooling. Outliers were identified using cleanlab library and visualized via plots and tables, revealing data distribution and characteristics.A threshold was set based on the 2.5th percentile of outlier scores, aiding anomaly identification in the test dataset. The study showcases the potential of Large Language Models in NLP, offering efficient solutions to complex tasks. This exploration enriches dataset understanding and highlights LLM's impressive capabilities, underlining its impact on previously daunting challenges. The methods and libraries employed demonstrate the current LLM technology's prowess, providing potent solutions. By continuously advancing these approaches, NLP boundaries are pushed, paving the way for diverse research and applications in the future.Author Bio:Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder in startups, and later on earned a Master's degree from the faculty of Mathematics in the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn
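Because the mismatched validation split contains genres that do not appear in the training sample while the matched split's genres do, one optional way to sanity-check these outlier scores is to treat the mismatched examples as a rough proxy for out-of-distribution data and feed the scores to the precision_recall_curve import from earlier. The snippet below is only an illustrative sketch built on that assumption, reusing test_outlier_scores, val_matched_data, and val_mismatched_data from the steps above:

# Proxy labels: 0 = genre seen in training (matched split), 1 = unseen genre (mismatched split).
# This labelling is only a rough stand-in for "true" outliers.
proxy_ood_labels = np.concatenate([
    np.zeros(len(val_matched_data)),
    np.ones(len(val_mismatched_data)),
])

# Lower scores mean "more outlying", so negate them to obtain an anomaly score
# where higher values indicate the positive (out-of-distribution) class.
precision, recall, pr_thresholds = precision_recall_curve(proxy_ood_labels, -test_outlier_scores)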

Falcon LLM: The Dark Horse in Open Source LLM Race

Valentina Alto
07 Jun 2023
6 min read
Discover the ground-breaking capabilities of the Falcon Language Model (LLM) in natural language processing. This article presents an architectural overview of Falcon LLM, highlighting its transformer-based design and distinctive features. Gain practical guidance on leveraging Falcon LLM's power effectively, including fine-tuning techniques and optimization strategies. We also address ethical considerations and responsible AI deployment. Whether you're a researcher, developer, or simply curious about cutting-edge language models, this article provides valuable insights to harness the full potential of Falcon LLM.Foundation models and LLMsWhen we talk about Generative AI models, we are talking about a new generation of deep learning models called Foundation models. Foundation models are pre-trained AI models that can be fine-tuned for specific tasks.Foundational Models In the specific case of ChatGPT and similar models, we talk about Large language models (LLMs), a subset of Foundation models specifically designed for natural language processing tasks. Models like GPT-4 are examples of LLMs that can generate human-like text, answer questions, translate languages, and more.LLMs are characterized by huge training sets and a number of parameters of the network. To make an example, GPT-3 has been trained on almost 500 billion tokens and has 175 billion parameters. However, models with such a high number of parameters are heavy, both in the training phase and inference phase. This also implies a high computational cost, being needed GPU-powered hardware, and a lot of training time. That’s why a new trend has emerged lately, that is the one of building lighter models (with fewer parameters) focusing rather on the quality of the training dataset.Introducing Falcon LLMOne of the latest models of this new trend is Falcon LLM, an open-source model launched by Abu Dhabi’s Technology Innovation Institute (TII) that as of now (June 2023) ranks 1 globally in the latest Hugging Face independent verification of open-source AI models: Open LLM Leaderboard — a Hugging Face Space by HuggingFaceH4Falcon LLM has been trained on 1 trillion tokens and has 40 billion parameters (even though it has also been released a lighter version with 7 billion parameters). So the question might be: how can a model with “only” 40 billion parameters perform so well? In fact, the answer is in the quality of the dataset.Falcon was developed using specialized tools and incorporates a unique data pipeline capable of extracting valuable content from web data. The pipeline was designed to extract high-quality content by employing extensive filtering and deduplication techniques.The resulting dataset, called RefinedWeb, has been released by TII under the Apache-2.0 license and can be found here →https://huggingface.co/datasets/tiiuae/falcon-refinedweb.Plus, the architecture of Falcon was meticulously fine-tuned for optimal performance and efficiency. By combining superior data quality with these optimizations, Falcon achieves remarkable performance while utilizing around 75% of the training compute budget of the GPT-3. Furthermore, it requires only a fifth of the computing resources during inference.A decoder-only (Falcon LLM) architectureFalcon LLM is a decoder-only model, but what does it mean?Source: https://arxiv.org/abs/1706.03762 The Encoder-Decoder architecture was the original transformer architecture introduced in the Attention Is All You Need (https://arxiv.org/abs/1706.03762) paper in 2017. 
We have the “encoder”, which has the task to represent the input into a lower-dimensional space; on the right-hand side, we have the “decoder”, which has the task to translate back to the original data format the lower-dimensional data provided by the encoder.While the original transformer architecture was made of both the components — encoder and decoder, in last years, AI labs and companies shifted towards a new architecture made of a decoder-only framework. To name one example, the OpenAI’s GPT-3 is made of a decoder-only architecture.The key distinction between the Decoder-only architecture and the Encoder-Decoder architecture lies in the absence of a separate encoder responsible for summarizing the input information. Instead, in the Decoder-only architecture, the decoder’s hidden state implicitly encodes the relevant information and is continually updated at each step of the generation process.How to use Falcon LLMAs it is an open-source model, you can try Falcon LLM directly from the frontend provided on the Hugging Face site:Hugging face frontendPlus, you can download the model using Python:!pip install torch from transformers import AutoTokenizer, AutoModelForCausalLM import transformers import torch #model = "tiiuae/falcon-40b" model = "tiiuae/falcon-7b" tokenizer = AutoTokenizer.from_pretrained(model) pipeline = transformers.pipeline(    "text-generation",    model=model,    tokenizer=tokenizer,    torch_dtype=torch.bfloat16,    trust_remote_code=True,    device_map="auto", ) sequences = pipeline(   "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",    max_length=200,    do_sample=True,    top_k=10,    num_return_sequences=1,    eos_token_id=tokenizer.eos_token_id, ) for seq in sequences:    print(f"Result: {seq['generated_text']}")Depending on your hardware capacity, you can decide to use either the 40b or the 7b parameters model. Also, note that the 7b version of the model is trained in English and French only.ConclusionsLLMs are extremely powerful, and they have seen an exponential growth in their number of parameters in the last few years. Nevertheless, we are quickly approaching towards a hard cap that is the computational capacity needed. Henceforth, it is pivotal to start exploring new ways of making LLMs less “large” yet more accurate, as TII is achieving with Falcon LLM. This implies a major focus on the quality of the training set, which massively impacts on the performance of the model.Falcon LLM paper will be released soon, so stay tuned to learn more about this amazing model!Referenceshttps://huggingface.co/datasets/tiiuae/falcon-refinedwebhttps://falconllm.tii.ae/Open LLM Leaderboard — a Hugging Face Space by HuggingFaceH4Author BioValentina Alto graduated in 2021 in data science. Since 2020, she has been working at Microsoft as an Azure solution specialist, and since 2022, she has been focusing on data and AI workloads within the manufacturing and pharmaceutical industries. She has been working closely with system integrators on customer projects to deploy cloud architecture with a focus on modern data platforms, data mesh frameworks, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. 
Since commencing her academic journey, she has been writing tech articles on statistics, machine learning, deep learning, and AI in various publications and has authored a book on the fundamentals of machine learning with Python.Valentina is also the author of the book: Modern Generative AI with ChatGPT and OpenAI ModelsLinks - Medium LinkedIn  
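If your GPU cannot hold even the 7b checkpoint in bfloat16, one widely used workaround is 8-bit quantized loading through the bitsandbytes integration in transformers. The following is only an illustrative sketch: it assumes the bitsandbytes and accelerate packages are installed, and the exact flags may vary between transformers versions:

# pip install transformers accelerate bitsandbytes
import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# load_in_8bit roughly halves the memory footprint compared to bfloat16,
# at a small cost in generation quality and speed.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,
    device_map="auto",
    trust_remote_code=True,  # Falcon shipped custom modelling code at release time
)

pipeline = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipeline("Falcon LLM is", max_length=50, do_sample=True, top_k=10)[0]["generated_text"])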

Debugging and Monitoring LLMs With Weights & Biases

Mostafa Ibrahim
31 Oct 2023
6 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionLarge Language Models, or LLMs for short, are becoming a big deal in the world of technology. They're powerful and can do a lot, but they're not always easy to handle. Just like when building a big tower, you want to make sure everything goes right from the start to the finish. That's where Weights & Biases, often called W&B, comes in. It's a tool that helps people keep an eye on how their models are doing. In this article, we'll talk about why it's so important to watch over LLMs, how W&B helps with that, and how to use it. Let's dive in!Large Language Models (LLMs)Large Language Models (LLMs) are machine learning models trained on vast amounts of text data to understand and generate human-like text. They excel in processing and producing language, enabling various applications like translation, summarization, and conversation.LLMs, such as GPT-3 by OpenAI, utilize deep learning architectures to learn patterns and relationships in the data, making them capable of sophisticated language tasks. Through training on diverse datasets, they aim to comprehend context, semantics, and nuances akin to human communication.When discussing the forefront of natural language processing, several Large Language Models (LLMs) consistently emerge: The Need for Debugging & Monitoring LLMsUnderstanding and overseeing Large Language Models (LLMs) is much like supervising an intricate machine: they're powerful, and versatile, but require keen oversight.Firstly, think about the intricacy of LLMs. They far surpass the complexity of your typical day-to-day machine learning models. While they hold immense potential to revolutionize tasks involving language - think customer support, content creation, and translations - their intricate designs can sometimes misfire. If we're not careful, instead of a smooth conversation with a chatbot, users might encounter bewildering responses, leading to user frustration and diminished trust.Then there's the matter of resources. Training LLMs isn't just about the time; it's also financially demanding. Each hiccup, if not caught early, can translate to unnecessary expenditures. It's much like constructing a skyscraper; mid-way errors are costlier to rectify than those identified in the blueprint phase.Introduction to Weights & BiasesSourceWeights & Biases (W&B) is a cutting-edge platform tailored for machine learning practitioners. It offers a suite of tools designed to help streamline the model development process, from tracking experiments to visualizing results.With W&B, researchers and developers can efficiently monitor their LLM training progress, compare different model versions, and collaborate with team members. It's an invaluable asset for anyone looking to optimize and scale their machine-learning workflows.How to Use W&B for Debugging & Monitoring LLMsIn the hands-on section of this article, we will adhere to the following structured approach, illustrated in the diagram below. We will fine-tune our model and leverage Weights and biases to save critical metrics, tables, and visualizations. This will empower us with deeper insights, enabling efficient debugging and monitoring of our Large Language Models. 1. Setting up Weights and Biasesa. 
Importing Necessary Librariesimport torch import wandb from transformers import BertTokenizer, BertForSequenceClassification from torch.utils.data import DataLoader, random_split from datasets import load_datasetIntizailaizing W&B # Initialize W&B wandb.init(project='llm_monitoring', name='bert_example')b. Loading the BERT Model# Load tokenizer and model tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') model = BertForSequenceClassification.from_pretrained('bert-base-uncased')2. Fine-tuning your Modela. Loading your datasetdataset = load_dataset('Load your dataset')b. Fine-tuning the modelfor epoch in range(config.epochs):    model.train()    for batch in train_dataloader:       # ……….       # Continue training process here       # ………..3. Tracking Metrics# Log the validation metrics to W&B    wandb.log({        "Epoch": epoch,        "Validation Loss": avg_val_loss,        "Validation Accuracy": val_accuracy    })4. Graph Visualizationsa. Plotting and logging Training Loss Graphfig, ax = plt.subplots(figsize=(10,5)) ax.plot(train_losses, label="Training Loss", color='blue') ax.set(title="Training Losses", xlabel="Epoch", ylabel="Loss") wandb.log({"Training Loss Curve": wandb.Image(fig)})b. Plotting and logging Validation Loss Graphfig, ax = plt.subplots(figsize=(10,5)) ax.plot(val_losses, label="Validation Loss", color='orange') ax.set(title="Validation Losses", xlabel="Epoch", ylabel="Loss") wandb.log({"Validation Loss Curve": wandb.Image(fig)})c. Plotting and Log Validation Accuracy Graphfig, ax = plt.subplots(figsize=(10,5)) ax.plot(val_accuracies, label="Validation Accuracy", color='green') ax.set(title="Validation Accuracies", xlabel="Epoch", ylabel="Accuracy") wandb.log({"Validation Accuracy Curve": wandb.Image(fig)})d. Plotting and Log Training Accuracy Graphfig, ax = plt.subplots(figsize=(10,5)) ax.plot(train_accuracies, label="Training Accuracy", color='blue') ax.set(title="Training Accuracies", xlabel="Epoch", ylabel="Accuracy") wandb.log({"Training Accuracy Curve": wandb.Image(fig)})5. Manual Checkupsquestions = ["What's the weather like?", "Who won the world cup?", "How do you make an omelette?", "Why is the sky blue?", "When is the next holiday?"] old_model_responses = ["It's sunny.", "France won the last one.", "Mix eggs and fry them.", "Because of the atmosphere.", "It's on December 25th."] new_model_responses = ["The weather is clear and sunny.", "Brazil was the champion in the previous world cup.", "Whisk the eggs, add fillings, and cook in a pan.", "Due to Rayleigh scattering.", "The upcoming holiday is on New Year's Eve."] # Create a W&B Table table = wandb.Table(columns=["question", "old_model_response", "new_model_response"]) for q, old, new in zip(questions, old_model_responses, new_model_responses):    table.add_data(q, old, new) # Log the table to W&B wandb.log({"NLP Responses Comparison": table}) 6. Closing the W&B run after all logs are uploadedwandb.finish()ConclusionLarge Language Models have truly transformed the landscape of technology. Their vast capabilities are nothing short of amazing, but like all powerful tools, they require understanding and attention. Fortunately, with platforms like Weights & Biases, we have a handy toolkit to guide us. It reminds us that while LLMs are game-changers, they still need a bit of oversight.Author BioMostafa Ibrahim is a dedicated software engineer based in London, where he works in the dynamic field of Fintech. 
His professional journey is driven by a passion for cutting-edge technologies, particularly in the realms of machine learning and bioinformatics. When he's not immersed in coding or data analysis, Mostafa loves to travel.Medium
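The fine-tuning loop in step 2b above is left as a placeholder. Purely as an illustration, a minimal PyTorch loop for the BERT classifier that logs its average loss to W&B after each epoch could look like the sketch below; it assumes train_dataloader yields dictionaries with input_ids, attention_mask, and labels, and that model and wandb were initialized as shown earlier:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

epochs = 3  # assumed value; adjust to match your experiment configuration
for epoch in range(epochs):
    model.train()
    total_loss = 0.0
    for batch in train_dataloader:
        optimizer.zero_grad()
        # Batches are assumed to be dicts of tensors: input_ids, attention_mask, labels
        outputs = model(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
            labels=batch["labels"].to(device),
        )
        outputs.loss.backward()
        optimizer.step()
        total_loss += outputs.loss.item()

    # Log the average training loss for this epoch to Weights & Biases
    wandb.log({"Epoch": epoch, "Training Loss": total_loss / len(train_dataloader)})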

Set Up and Run Auto-GPT with Docker

Rohan Chikorde
04 Jun 2023
8 min read
Are you looking to get your hands dirty with Auto-GPT? Look no further! In this article, we'll guide you through the straightforward installation process, enabling you to effortlessly set up Auto-GPT and unlock its powerful capabilities. Say goodbye to complex setups and hello to enhanced language generation in just a few simple steps. To use Auto-GPT, users need to have Python installed on their computer, as well as an OpenAI API key. This key allows Auto-GPT to access the GPT-4 and GPT-3.5 APIs, as well as other resources such as internet search engines and popular websites. Once it is configured, users can interact with Auto-GPT using natural language commands, and the AI agent will automatically perform the requested task. We will show practically how to set up and run Auto-GPT using Docker. We will also be showing steps to other popular methods towards the end. Benefits of using Docker for running Auto-GPT  Docker is a containerization technology that allows developers to create, deploy, and run applications in a consistent and isolated environment. It enables the packaging of an application and all its dependencies into a single container, which can be easily distributed and run on any machine that has Docker installed. Using Docker to run Auto-GPT provides several benefits:It allows you to run Auto-GPT in an isolated and reproducible environment, which ensures that the dependencies and configurations required to run Auto-GPT are consistent across different machines. This can be especially useful when collaborating on a project or when deploying Auto-GPT to a production environment. Docker provides a secure sandboxed environment, which can help prevent any potential harm to your computer from continuous mode malfunctions or accidental damage from commands.  Docker simplifies the installation and configuration process of Auto-GPT by packaging it in a container that includes all the necessary dependencies and libraries. This means you don't have to manually install and configure these dependencies, which can be time-consuming and error prone. Overall, using Docker to run Auto-GPT provides a convenient and secure solution for developing and deploying Auto-GPT in a consistent and reproducible manner.Software Requirements Docker (recommended)  Python 3.10 or later  VSCode + devcontainer Getting an API key  Get your OpenAI API key from: https://platform.openai.com/account/api-keys   Fig 1. Creating API keySetting up Auto-GPT with DockerHere first we will showcase step by step by guide to set up Auto-GPT using docker.1.     Make sure you have Python and Docker are installed on your system and its daemon is running, see requirements Fig 2. Command Prompt  2.     Open CMD and Pull the latest image from Docker Hub using following command:docker pull significantgravitas/auto-gpt Fig 3. Pulling image from dockerhub Please note if docker daemon is not running it will throw an error. Fig 4. Docker Image Once pulled using above command, you can find the significantgravitas/auto-gpt image on your docker. 3.     Create a folder for Auto-GPT4.     
In the folder, create a file named docker-compose.yml with the following contents:

version: "3.9"
services:
  auto-gpt:
    image: significantgravitas/auto-gpt
    depends_on:
      - redis
    env_file:
      - .env
    environment:
      MEMORY_BACKEND: ${MEMORY_BACKEND:-redis}
      REDIS_HOST: ${REDIS_HOST:-redis}
    profiles: ["exclude-from-up"]
    volumes:
      - ./auto_gpt_workspace:/app/auto_gpt_workspace
      - ./data:/app/data
      ## allow auto-gpt to write logs to disk
      - ./logs:/app/logs
      ## uncomment following lines if you have / want to make use of these files
      #- ./azure.yaml:/app/azure.yaml
      #- ./ai_settings.yaml:/app/ai_settings.yaml
  redis:
    image: "redis/redis-stack-server:latest"

5.     Download Source code (zip) from the latest stable release
6.     Extract the zip-file into a folder. Fig 5. Source folder

Configuration using Docker
1.     After downloading and unzipping the folder, find the file named .env.template in the main Auto-GPT folder. This file may be hidden by default in some operating systems due to the dot prefix. To reveal hidden files, follow the instructions for your specific operating system: Windows, macOS
2.     Create a copy of .env.template and call it .env; if you're already in a command prompt/terminal window: use cp .env.template .env
3.     Now you should have only two files in your folder – docker-compose.yml and .env Fig 6. Docker-compose and .env files
4.     Open the .env file in a text editor
5.     Find the line that says OPENAI_API_KEY=
6.     After the =, enter your unique OpenAI API Key without any quotes or spaces.
7.     Extracting the API key is discussed in step 1 above.
8.     Save and close the .env file

Running Auto-GPT with Docker
The easiest way is to use docker-compose. Run the commands below in your Auto-GPT folder.
1.     Build the image. If you have pulled the image from Docker Hub, skip this step:
docker-compose build auto-gpt
2.     Run Auto-GPT:
docker-compose run --rm auto-gpt
3.     By default, this will also start and attach a Redis memory backend. If you do not want this, comment or remove the depends_on: - redis and redis: sections from docker-compose.yml
4.     You can pass extra arguments, e.g., running with --gpt3only and --continuous:
docker-compose run --rm auto-gpt --gpt3only --continuous
Fig 7. Auto-GPT Installed

Other methods without Docker

Setting up Auto-GPT with Git
1.     Make sure you have Git installed for your OS
2.     To execute the given commands, open a CMD, Bash, or PowerShell window. On Windows: press Win+X and select Terminal, or Win+R and enter cmd
3.     First, clone the repository using the following command:
git clone -b stable https://github.com/Significant-Gravitas/Auto-GPT.git
4.     Navigate to the directory where you downloaded the repository:
cd Auto-GPT

Manual Setup
1.     Download Source code (zip) from the latest stable release
2.     Extract the zip-file into a folder

Configuration
1.     Find the file named .env.template in the main Auto-GPT folder. This file may be hidden by default in some operating systems due to the dot prefix. To reveal hidden files, follow the instructions for your specific operating system: Windows, macOS
2.     Create a copy of .env.template and call it .env; if you're already in a command prompt/terminal window: cp .env.template .env
3.     Open the .env file in a text editor
4.     Find the line that says OPENAI_API_KEY=
5.     After the =, enter your unique OpenAI API Key without any quotes or spaces
6.
Save and close the .env file Run Auto-GPT without Docker Simply run the startup script in your terminal. This will install any necessary Python packages and launch Auto-GPT. Please note, if the above configuration is not properly setup, then it will throw an error, hence recommended and easiest way to run is using docker.On Linux/MacOS:./run.shOn Windows:.\run.batIf this gives errors, make sure you have a compatible Python version installed. ConclusionIn conclusion, if you're looking for a hassle-free way to install Auto-GPT, Docker is the recommended choice. By following our comprehensive guide, you can effortlessly set up Auto-GPT using Docker, ensuring a streamlined installation process, consistent environment configuration, and seamless deployment on different platforms. With Docker, bid farewell to compatibility concerns and embrace a straightforward and efficient Auto-GPT installation experience. Empower your language generation capabilities today with the power of Docker and Auto-GPT.Author BioRohan is an accomplished AI Architect professional with a post-graduate in Machine Learning and Artificial Intelligence. With almost a decade of experience, he has successfully developed deep learning and machine learning models for various business applications. Rohan's expertise spans multiple domains, and he excels in programming languages such as R and Python, as well as analytics techniques like regression analysis and data mining. In addition to his technical prowess, he is an effective communicator, mentor, and team leader. Rohan's passion lies in machine learning, deep learning, and computer vision.You can follow Rohan on LinkedIn

AI_Distilled #24: Google Invests $2 Billion in Anthropic, Perplexity's AI Search Engine, Biden's AI Executive Order, Data Mining with GPT-4, RL and AWS Deepracer

Merlyn Shelley
03 Nov 2023
13 min read
👋 Hello ,Welcome to another captivating edition of AI_Distilled, featuring recent advancements in training and fine-tuning LLMs, GPT and AI models for enhanced business outcomes.Let’s begin our news and analysis with an industry expert’s opinion.  “Artificial intelligence is the science of making machines do things that would require intelligence if done by humans” – John McCarthy, Computer Scientist and AI Visionary. AI does indeed make machines intelligent, so much so that industry titans are now waging a proxy AI war with billions in startup funding. Without a doubt, AI is onto something big! In this week, we’ll talk about Biden's AI Executive Order, which has been praised for scope but deemed insufficient without legislation, Perplexity's AI Search Engine, OpenAI launching new team and challenge to prepare for catastrophic risks of advanced AI, Google Invests $2 Billion in Anthropic, and updating its Bug Bounty program to address AI security concerns. Look out for your fresh dose of AI resources, secret knowledge, and tutorials on how to use custom AI models to enhance complex technical workflows, improving LLM understanding with user feedback, and essential text preprocessing for effective machine learning with Python. 📥 Feedback on the Weekly EditionWhat do you think of this issue and our newsletter?Please consider taking the short survey below to share your thoughts and you will get a free PDF of the “The Applied Artificial Intelligence Workshop” eBook upon completion. Complete the Survey. Get a Packt eBook for Free!Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & Analysis🔹 OpenAI Launches New Team and Challenge to Prepare for Catastrophic Risks of Advanced AI: The ChatGPT creator announced new efforts to prepare for potential catastrophic risks associated with highly advanced AI systems. The company is forming a new internal team called "Preparedness" to assess risks ranging from cybersecurity threats to autonomous biological replication. It is also launching an "AI Preparedness Challenge" with prize money to crowdsource ideas for preventing misuse of advanced AI. OpenAI says it aims to benefit humanity with cutting-edge AI while taking seriously the full spectrum of safety risks.🔹 Biden's AI Executive Order Praised for Scope but Deemed Insufficient Without Legislation: President Biden recently issued an executive order on AI that experts say covers important ground but lacks teeth without accompanying legislation from Congress. The order establishes guidelines and oversight for AI development and use, including in healthcare. However, many provisions simply codify voluntary industry practices. Stakeholders say Congress must pass more comprehensive AI regulations, but partisan disputes make near-term action unlikely.  🔹 Google Updates Bug Bounty Program to Address AI Security Concerns: Google has expanded its vulnerability rewards program to include incentives for discovering potential abuses of artificial intelligence systems. The update comes as worries grow over generative AI being exploited maliciously. Under the revised guidelines, security researchers can earn financial rewards for uncovering AI training data extraction that leaks private information. The move aligns with AI companies' recent White House pledge to better identify AI vulnerabilities.  
🔹 Perplexity's AI Search Engine Garners $500M Valuation After New Funding: The AI startup Perplexity recently secured additional funding led by venture capital firm IVP, garnering a $500 million valuation. Perplexity is developing a conversational search engine to challenge Google's dominance using artificial intelligence. The company's iOS app and website traffic have been growing steadily amid rising interest in AI like ChatGPT. With deep ties to Google researchers, Perplexity leverages LLMs and has attracted investments from major industry figures.  🔹 Tech Giants Wage Proxy AI War with Billions in Startup Funding As Google Invests $2 Billion in Anthropic: Major technology companies like Google, Microsoft, and Amazon are investing billions in AI startups like OpenAI and Anthropic as surrogates in the race to lead the AI space. Unable to quickly build their own capabilities in large language models, the tech giants are funneling massive sums into the AI leaders to gain ownership stakes and technology access. Anthropic's $2 billion funding from Google follows similar multibillion investments from Microsoft and Amazon, fueling an expensive AI innovation war by proxy.  🔹 Poe Unveils Monetization for Third-Party Conversational AI Developers: The AI chatbot platform Poe has introduced a new revenue sharing model to let creators’ profit from building specialized bots. Poe will split subscription fees and pay per-message charges to offset infrastructure costs. An open API also allows adding custom natural language models beyond Poe's defaults. The moves aim to spur innovation by empowering niche developers. Poe believes reducing barriers will increase diversity, not just competition.   🔮 Expert Insights from Packt Community Generative AI with Python and TensorFlow 2 - By Joseph Babcock , Raghav Bali  Kubeflow: an end-to-end machine learning lab As was described at the beginning of this chapter, there are many components of an end-to-end lab for machine learning research and development (Table 2.1), such as: A way to manage and version library dependencies, such as TensorFlow, and package them for a reproducible computing environment Interactive research environments where we can visualize data and experiment with different settings A systematic way to specify the steps of a pipeline – data processing, model tuning, evaluation, and deployment Provisioning of resources to run the modeling process in a distributed manner Robust mechanisms for snapshotting historical versions of the research process As we described earlier in this chapter, TensorFlow was designed to utilize distributed resources for training. To leverage this capability, we will use the Kubeflow projects. Built on top of Kubernetes, Kubeflow has several components that are useful in the end-to-end process of managing machine learning applications. Using Kubeflow Katib to optimize model hyperparameters Katib is a framework for running multiple instances of the same job with differing inputs, such as in neural architecture search (for determining the right number and size of layers in a neural network) and hyperparameter search (finding the right learning rate, for example, for an algorithm). 
Like the other Customize templates we have seen, the TensorFlow job specifies a generic TensorFlow job, with placeholders for the parameters: apiVersion: "kubeflow.org/v1alpha3" kind: Experiment metadata:  namespace: kubeflow  name: tfjob-example spec: parallelTrialCount: 3  maxTrialCount: 12  maxFailedTrialCount: 3  objective:    type: maximize    goal: 0.99    objectiveMetricName: accuracy_1  algorithm:    algorithmName: random  metricsCollectorSpec:    source:      fileSystemPath:        path: /train        kind: Directory    collector:      kind: TensorFlowEvent  parameters:    - name: --learning_rate      parameterType: double      feasibleSpace:        min: "0.01"        max: "0.05"    - name: --batch_size      parameterType: int      feasibleSpace:        min: "100"        max: "200"  trialTemplate:    goTemplate:        rawTemplate: |-          apiVersion: "kubeflow.org/v1"          kind: TFJob          metadata:            name: {{.Trial}}            namespace: {{.NameSpace}}          spec:           tfReplicaSpecs:            Worker:              replicas: 1               restartPolicy: OnFailure              template:                spec:                  containers:                    - name: tensorflow                       image: gcr.io/kubeflow-ci/tf-mnist-with-                             summaries:1.0                      imagePullPolicy: Always                      command:                        - "python"                        - "/var/tf_mnist/mnist_with_summaries.py"                        - "--log_dir=/train/metrics"                        {{- with .HyperParameters}}                        {{- range .}}                        - "{{.Name}}={{.Value}}"                        {{- end}}                        {{- end}}  which we can run using the familiar kubectl syntax: kubectl apply -fhttps://raw.githubusercontent.com/kubeflow/katib/master/examples/v1alpha3/tfjob-example.yaml This content is from the book “Generative AI with Python and TensorFlow 2” by Joseph Babcock , Raghav Bali (April 2021). Start reading a free chapter or access the entire Packt digital library free for 7 days by signing up now. To learn more, click on the button below. Read through the Chapter 1 unlocked here...  🌟 Secret Knowledge: AI/LLM Resources🔹 How to Use Custom AI Models to Enhance Complex Technical Workflows: In this post, you'll learn how Nvidia’s researchers leveraged customized LLMs to streamline intricate semiconductor chip design. The research demonstrates how to refine foundation models into customized assistants that understand industry-specific patterns. You'll see how careful data cleaning and selection enables high performance even with fewer parameters. The post explores step-by-step instructions on how researchers built a specialized AI that helps with writing code, improving documentation, and optimizing complex technical workflows.  🔹 How to Build Impactful LLM Applications: In this post, you'll explore lessons learned from creating Microsoft's Copilot products, such as Viva and PowerPoint. It discusses how combining LLMs with app context and other ML models can be a game-changer and demonstrates how parsing user queries and responses enables precise skill activation. By following their approach of utilizing multiple models to summarize insights without losing nuance, you can gain practical tips for your own LLM application development. 
🔹 Understanding Convolutional Neural Networks and Vision Transformers: A Mathematical Perspective: You'll learn about convolutional neural networks and vision transformers in this post. They're great for image classification but differ in math, especially for generative tasks. You'll see how their training budgets work and understand their unique math. We'll also discuss their differences in complexity and memory usage. Plus, you'll learn why convolutional nets handle spatial coherence naturally, while vision transformers might need some help. By the end, you'll know why transformers are better for generating sequential data.  🔹 Improving Large Language Model Understanding with User Feedback: The post focuses on improving user intent detection for LLMs by utilizing disambiguation, context, and MemPrompt. These techniques enhance LLM responses, enabling better understanding of user intent, offering real-time feedback, and enhancing LLM performance and utility. 🔹 The Power of High-Quality Data in Language Models: The article emphasizes the significance of high-quality data for Large Language Models (LLMs). It introduces the concept of alignment, discussing how it influences LLM behavior. The article stresses the vital role of data quality and diversity in optimizing LLM performance and capabilities.  💡 Masterclass: AI/LLM Tutorials🔹 Enhance Language Model Performance with Step-Back Prompting: This guide explores the use of Step-Back Prompting to enhance LLMs' performance in complex tasks, like knowledge-intensive QA and multi-hop reasoning. It offers a step-by-step tutorial, including package setup and data collection, to implement this approach, potentially improving AI model behavior and responses.  🔹 Boosting AI at Scale with Vectorized Databases: This guide explores how vectorized databases are transforming LLMs like GPT-3 by enhancing their capabilities and scalability. It explains the principles of LLMs and the role of vectorized databases in empowering them. It discusses efficient data retrieval, optimization of vector operations, and scaling for real-time responses. The guide highlights use cases, including content generation and recommendation systems, where vectorized databases excel, and addresses the challenges of adopting them for LLMs. 🔹 Mastering Data Mining with GPT-4: A Practical Guide Using Seattle Weather Data: This guide explores the use of GPT-4 for data mining using Seattle's weather dataset. It covers AI's potential in data mining, detailing the process from exploratory data analysis to clustering and anomaly detection. GPT-4 assists in data loading, EDA, data cleaning, feature engineering, and suggests clustering methods. The post highlights the collaborative aspect of AI-human interaction and how GPT-4 can improve data mining and data analysis in the field of data science. 🔹 Introduction to Reinforcement Learning and AWS Deepracer: This post introduces reinforcement learning, a machine learning approach focused on maximizing rewards through agent-environment interactions. It compares it to motivating students based on performance. It explores practical applications via AWS Deepracer for self-driving cars, explaining key components and mentioning the Deepracer Student League as a learning opportunity.  🔹 Essential Text Preprocessing for Effective Machine Learning with Python: This post highlights crucial text preprocessing techniques for machine learning. It emphasizes the need to clean text data to avoid interference and unintended word distinctions. 
🚀 HackHub: Trending AI Tools
🔹 Pythagora-io/gpt-pilot: Boosts app development speed 20x via requirement specification, oversight, and coding assistance through clarifications and reviews.
🔹 hkuds/rlmrec: PyTorch implementation of the RLMRec model, enhancing recommenders with LLMs for advanced representation learning in recommendation systems.
🔹 THUDM/AgentTuning: Empowers LLMs by instruction-tuning them with interaction trajectories from various agent tasks, enhancing their generalization and language abilities.
🔹 cpacker/MemGPT: Enhances LLMs by intelligently managing memory tiers, enabling extended context and perpetual conversations.

How Open-Source Language Models Could Reshape the Tech Industry

Julian Melanson
30 Jun 2023
5 min read
The world of technology, characterized by an incessant and rapid pace of evolution, is on the cusp of a seismic shift. Historically, the development and control of large language models—a key component in modern artificial intelligence systems—have been dominated by tech industry giants. However, emerging developments show that this might not be the status quo for much longer. The burgeoning field of open-source LLMs presents a potential disruption to the current balance of power in the tech industry, signaling a shift towards a more democratic and inclusive AI landscape.

Major tech firms like Microsoft and Google, armed with vast financial resources, have long held the reins of the LLM market. Their position seemed unassailable as recent earnings calls indicated a thriving business built around their AI services. Yet, a leaked internal document from Google has cast a shadow of uncertainty over this seemingly secure stronghold. The central idea gleaned from this document? No company has an unassailable fortress against competition in the realm of LLMs, not even the mighty OpenAI, the organization responsible for the groundbreaking GPT-3.

The story of GPT-3 is a pivotal chapter in the annals of AI history. Its 2020 release ignited a spark in the research community, illuminating the tantalizing promise of scale. With 175 billion parameters, GPT-3 showed capabilities that stretched beyond its initial training data. The success of this LLM prompted a surge of interest in the creation of larger, more complex models. This development led to an arms race among AI research labs, producing increasingly massive models such as Gopher, LaMDA, PaLM, and Megatron-Turing.

However, this race towards larger LLMs engendered a substantial increase in research and development costs. The staggering financial demands associated with training and running models like GPT-3 created an environment where LLM innovation was essentially confined to the wealthiest entities in tech. With this economic pressure to recoup their considerable investment, these companies began to commercialize their technology, leading to the erection of protective "moats" around their products. These mechanisms of defensibility safeguarded their investments against the competition, obscuring their research and constraining the sharing of intellectual resources.

Key elements of these moats included proprietary control over training data, model weights, and the costs associated with training and inference. With their deep pockets, big tech companies kept the upper hand in managing the expenses tied to training and running large LLMs. This dominance rendered even open-source alternatives such as BLOOM and OPT-175B largely inaccessible to organizations without the fiscal means to support the hefty demands of these advanced models.

The Coming of Open-Source Language Models

For a time, this state of affairs painted a bleak picture for the democratization of LLMs, with the field becoming increasingly exclusive and secretive. However, the ebb and flow of innovation and competition that define the tech industry were bound to respond. The open-source community rose to the challenge, their endeavors intensifying following the release of OpenAI's ChatGPT, an instruction-following language model that illustrated the vast potential of LLMs in a multitude of applications.

These open-source alternatives are changing the game by proving that performance is not solely a function of scale.
Small, nimble LLMs trained on expansive datasets have proven the ability to compete head-to-head with their larger counterparts. Moreover, the open-source models, often consisting of 7-13 billion parameters, can be fine-tuned to remarkable degrees on a modest budget and can run on consumer-grade GPUs.

One such example, the open-source LLM developed by Meta, known as LLaMA, sparked a wave of similar models like Alpaca and Vicuna. These models, constructed on top of LLaMA, displayed an impressive capability for instruction-following akin to ChatGPT. The subsequent release of Dolly 2.0 by Databricks and Open Assistant further enriched the field by providing commercially usable, instruction-following LLMs that organizations can tailor to their specific needs.

The impact of these open-source models is profound. They potentially democratize access to advanced AI systems, reducing the cost of training by using techniques like low-rank adaptation (LoRA) and allowing businesses to incorporate LLMs into their operations at an affordable price. This development poses a significant challenge to the established order, undermining the monopoly of tech giants on LLMs.

Nonetheless, the rise of open-source models does not spell the end of cloud-based language models. Despite the democratization they promise, open-source LLMs face significant hurdles, including the prohibitive costs of pre-training. Furthermore, they may not be the best choice for all businesses. Companies without in-house machine learning expertise may still prefer the convenience of out-of-the-box, serverless solutions provided by the likes of Microsoft and Google. The entrenched distribution channels of these tech behemoths also present a formidable barrier for open-source LLMs to overcome.

However, the broader implications of the open-source movement in LLMs are unmistakable. It expands the market, opens up novel applications, and puts pressure on tech giants to offer more competitive pricing. By democratizing access to advanced AI, it allows for broader participation in the AI revolution, reducing the concentration of power and innovation within a few wealthy tech companies. As the LLM landscape continues to evolve rapidly, the rise of open-source models will leave an indelible mark on the tech industry.

Author Bio

Julian Melanson is one of the founders of Leap Year Learning. Leap Year Learning is a cutting-edge online school that specializes in teaching creative disciplines and integrating AI tools. We believe that creativity and AI are the keys to a successful future and our courses help equip students with the skills they need to succeed in a continuously evolving world. Our seasoned instructors bring real-world experience to the virtual classroom and our interactive lessons help students reinforce their learning with hands-on activities.

No matter your background, from beginners to experts, hobbyists to professionals, Leap Year Learning is here to bring in the future of creativity, productivity, and learning!