
How-To Tutorials - AI Tools

89 Articles

ChatGPT on Your Hardware: GPT4All

Valentina Alto
20 Jun 2023
7 min read
We have all become familiar with conversational UIs powered by Large Language Models: ChatGPT has been the first and most powerful example of how LLMs can boost our productivity daily. To be this accurate, LLMs are, by design, "large", meaning they consist of billions of parameters. They are therefore hosted on powerful infrastructure, typically in the public cloud (OpenAI's models, including ChatGPT, are hosted in Microsoft Azure), and are only accessible with an internet connection.

But what if you could run those powerful models on your local PC and have a ChatGPT-like experience?

Introducing GPT4All

GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer-grade CPUs – no GPU is required. A GPT4All model is a 3 GB – 8 GB file that you can download and plug into the open-source GPT4All ecosystem software, which is optimized to host models of between 7 and 13 billion parameters.

To start working with the GPT4All Desktop app, you can download it from the official website here: https://gpt4all.io/index.html.

At the moment, there are three main LLM families you can use within GPT4All:

LLaMA – a collection of foundation language models ranging from 7B to 65B parameters. They were developed by Meta AI, Facebook's parent company, and trained on trillions of tokens from 20 languages that use Latin or Cyrillic scripts. LLaMA can generate human-like conversations and outperforms GPT-3 on most benchmarks despite being 10x smaller. LLaMA is designed to run on less computing power and to be versatile and adaptable to many different use cases. However, LLaMA also faces challenges such as bias, toxicity, and hallucinations, as do other large language models. Meta AI has released all the LLaMA models to the research community for open science.

GPT-J – an open-source language model developed by EleutherAI. It is a GPT-2-like causal language model trained on the Pile dataset, an open-source, 825-gigabyte language modeling dataset that is split into 22 smaller datasets. GPT-J has roughly 6 billion parameters and can generate creative text, solve mathematical theorems, predict protein structures, answer reading comprehension questions, and more. GPT-J performs very similarly to similarly sized OpenAI GPT-3 versions on various zero-shot downstream tasks and can even outperform them on code generation tasks. GPT-J is designed to run on less computing power and to be versatile and adaptable to many different use cases.

MPT – a series of open-source, commercially usable large language models developed by MosaicML. MPT-7B is a decoder-style transformer pretrained from scratch on 1T tokens of English text and code. MPT-7B is part of the family of MosaicPretrainedTransformer (MPT) models, which use a modified transformer architecture optimized for efficient training and inference. These architectural changes include performance-optimized layer implementations and the elimination of context length limits by replacing positional embeddings with Attention with Linear Biases (ALiBi). MPT-7B can handle very long inputs thanks to ALiBi (up to 84k tokens vs. 2k–4k for other open-source models). MPT-7B also has several finetuned variants for different tasks, such as story writing, instruction following, and dialogue generation.
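Since ALiBi is what gives MPT its long context window, a tiny numerical sketch may help make the idea concrete. The snippet below is purely illustrative and not taken from the MPT codebase: it builds the per-head linear penalty that ALiBi adds to attention scores in place of positional embeddings, using the geometric slope schedule described in the ALiBi paper.

import numpy as np

def alibi_bias(seq_len, num_heads):
    # Head-specific slopes form a geometric sequence, following the ALiBi paper.
    slopes = np.array([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    positions = np.arange(seq_len)
    # distance[i, j] = how many positions key j lies behind query i.
    distance = positions[:, None] - positions[None, :]
    # Future positions get an infinite penalty, which doubles as a causal mask.
    distance = np.where(distance < 0, np.inf, distance)
    # Shape (num_heads, seq_len, seq_len): added to raw attention scores before softmax.
    return -slopes[:, None, None] * distance

bias = alibi_bias(seq_len=8, num_heads=4)
# Per head h: attention_scores = (q @ k.T) / sqrt(d) + bias[h], then softmax as usual.

Because the penalty grows linearly with distance rather than being baked into learned positional embeddings, the same model can be run on sequences far longer than those seen during training.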
You can download various versions of these model families directly within the GPT4All app:

Image 1: Download section in GPT4All
Image 2: Download path

In my case, I downloaded a GPT-J model with 6 billion parameters. You can select the model you want to use from the menu at the top of the application. Once you have selected the model, you can use it via the well-known ChatGPT-style UI, with the difference that it is now running on your local PC:

Image 3: GPT4All response

As you can see, the user experience is almost identical to the well-known ChatGPT, with the difference that we are running it locally and with different underlying LLMs.

Chatting with your own data

Another great thing about GPT4All is its integration with your local documents via plugins (currently in beta). To do so, go to the settings (in the upper bar menu) and select the LocalDocs plugin. Here, you can browse to the folder path you want to connect and then "attach" it to the model's knowledge base via the database icon in the upper right. In this example, I used the SQL licensing documentation in PDF format.

Image 4: SQL documentation

In this case, the model answers taking into consideration also (but not only) the attached documentation, which is quoted whenever the answer is based on it:

Image 5 and Image 6: Model responses that draw on the attached documentation

The technique used to store and index the knowledge provided by our documents is called Retrieval-Augmented Generation (RAG), a type of language generation that combines two kinds of memory:

Pre-trained parametric memory – the memory stored in the model's parameters, derived from the training dataset.

Non-parametric memory – the memory derived from the attached knowledge, which takes the form of a vector database where the documents' embeddings are stored.

Finally, the LocalDocs plugin supports a variety of data formats, including txt, docx, pdf, html, and xlsx. For a comprehensive list of the supported formats, you can visit https://docs.gpt4all.io/gpt4all_chat.html#localdocs-capabilities.

Using GPT4All with API

In addition to the Desktop app mode, GPT4All comes with two additional ways of consumption:

Server mode – once you enable server mode in the settings of the Desktop app, you can start calling the GPT4All API at localhost on port 4891, embedding the following code in your app:

import openai

openai.api_base = "http://localhost:4891/v1"
openai.api_key = "not needed for a local LLM"

prompt = "What is AI?"

# model = "gpt-3.5-turbo"
# model = "mpt-7b-chat"
model = "gpt4all-j-v1.3-groovy"

# Make the API request
response = openai.Completion.create(
    model=model,
    prompt=prompt,
    max_tokens=50,
    temperature=0.28,
    top_p=0.95,
    n=1,
    echo=True,
    stream=False
)

Python API – GPT4All also comes with a lightweight SDK, which you can install via pip install gpt4all. Here you can find a sample notebook on how to use this SDK: https://colab.research.google.com/drive/1QRFHV5lj1Kb7_tGZZGZ-E6BfX6izpeMI?usp=sharing
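As a rough idea of what SDK usage looks like, here is a minimal sketch. It is an assumption-laden illustration rather than the contents of the linked notebook: the class and method names reflect the gpt4all package at the time of writing, and the model name is only an example of the weights the SDK can fetch.

from gpt4all import GPT4All  # installed with: pip install gpt4all

# Illustrative model name – the SDK downloads the weights locally if they are missing.
model = GPT4All("ggml-gpt4all-j-v1.3-groovy")

# Generation runs entirely on the local CPU; no API key or internet round trip is needed.
print(model.generate("Explain what a large language model is in one sentence.", max_tokens=64))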
Conclusions

Running LLMs on local hardware opens up a new spectrum of possibilities, especially when we think about disconnected scenarios. Plus, being able to chat with local documents through an easy-to-use interface adds custom non-parametric memory to the model, so that we can already use it as a sort of copilot. Hence, even though it is at an early stage, this ecosystem is paving the way for interesting new scenarios.

References

https://docs.gpt4all.io/
GPT4All Chat UI - GPT4All Documentation
[2005.11401] Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (arxiv.org)

Author Bio

Valentina Alto graduated in 2021 in Data Science. Since 2020 she has been working at Microsoft as an Azure Solution Specialist and, since 2022, she has focused on Data & AI workloads within the Manufacturing and Pharmaceutical industries. She has been working on customer projects closely with system integrators to deploy cloud architectures with a focus on data lakehouses and DWHs, data integration and engineering, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. She holds a BSc in Finance and an MSc in Data Science from Bocconi University, Milan, Italy. Since her academic journey, she has been writing tech articles about statistics, machine learning, deep learning, and AI in various publications. She has also written a book about the fundamentals of machine learning with Python.

LinkedIn | Medium


AI Distilled 35: Building LLMs, Rust Inference, SpeechGPT-Gen, and 3D

Merlyn Shelley
02 Feb 2024
12 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!👋 Hello ,“[AI healthcare systems] include understanding symptoms and conditions better, including simplifying explanations in local vernaculars…and acting as a valuable second opinion. This, in turn, potentially provides a pathway for medical AI towards superhuman diagnostic performance.” - Vivek Natrajan, Google AI Researcher AI is making rapid strides in healthcare with a major disruption underway. The recent study examined how current language models perform in diagnostic settings and it’s clear they’re making tremendous progress with each passing day. Step into AI_Distilled's latest installment, where we delve into the most recent developments in AI/ML, LLMs, NLP, GPT, and Gen AI. Join us as we kick off this edition, showcasing news and insights from various sectors: AI Industry Updates:  Apple Focuses on AI for Major iOS Update Tencent Chief Raises Concerns Over Gaming Business China Greenlights Over Four Dozen AI Systems New iPhone App Streamlines Web Searches with AI Assistance German Startup Aims to Supercharge AI Chips with Novel "Memcapacitor" Design AI Startup Anthropic Suffers Data Breach Amid Regulatory Scrutiny New AI Coding Tool Surpasses Previous Models New AI Model Launches:  Meta Unleashes Mighty Code Llama 70B Google Unveils New AI Model for Advanced Video Generation OpenAI Announces Major Model and API Updates AI in Healthcare: AI Helps Design Proteins for Improved Gene Therapy Delivery AI's Diagnostic Prowess Advancing Rapidly AI in Supply Chain Management: AI Transforming Global Supply Chain Management Edge AI's Potential in Logistics Faces Memory Limitations We’ve also got you your fresh dose of LLM, GPT, and Gen AI secret knowledge and tutorials: Assessing AI Accuracy: New Leaderboard Evaluates Language Models New Techniques to Boost AI Performance Neural Networks Learn Non-Linear Functions Through Rectified Activation Enhancing Conversations with Contextual Understanding We know how much you love hands-on tips and strategies from the community, so here they are: Fine-Tuning BERT for Long-Form Text Training Giant Language Models the Cost-Effective Way Optimizing AI Models for Speed and Scale Estimating Depth from a Single Image: Researchers Evaluate AI Models Don’t forget to review these GitHub repositories that have been doing rounds: rasbt/LLMs-from-scratch LaurentMazare/mamba.rs 0nutation/speechgpt naver/roma  📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." 📣 And here's the twist – we're tuning into YOUR frequency! Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box – because your journey of discovery is our blueprint.We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  🗝️ Unlock the Packt library for FREEDive into a world of endless knowledge with our 7-day FREE trial! 
Discover over 7,500 tech books and videos with your Packt subscription and stay ahead in your field.Plus, check out our ✨NEW feature: the AI Assistant (beta) ✨, available across eBook, print, and subscription formats.  Don't miss your chance to explore and innovate – start your free trial today and unlock your tech potential! Ready to Crush It? Start Upskilling! SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & Analysis💎 Apple Focuses on AI for Major iOS Update: Apple is planning a major iOS 18 update, aiming to enhance AI features, particularly Siri and messaging apps. It will include generative AI in various apps, with Apple in talks with publishers for content. New iPads and MacBooks with M3 chips may be released soon. 💎 Tencent Chief Raises Concerns Over Gaming Business: Tencent's CEO, Pony Ma, is concerned about competition in the gaming division as recent titles underperformed while rivals thrive. Despite 30% of revenue from games, Tencent feels left behind. They've caught up in AI and plan to integrate it across businesses, with a focus on evolving WeChat platform. 💎 China Greenlights Over Four Dozen AI Systems: Chinese regulators have recently granted approval to over 40 AI models for public use, with companies including Baidu, Alibaba, and ByteDance receiving the green light. This move reflects China's push to narrow the AI gap with the United States, driven by the success of ChatGPT. 💎 New iPhone App Streamlines Web Searches with AI Assistance: Arc Search is a mobile app improving search with "Browse for Me" creating concise web pages from multiple sources, AI summaries, tab switching, and reading mode. It's an early-stage tool designed to save users time in the digital era. 💎 German Startup Aims to Supercharge AI Chips with Novel "Memcapacitor" Design: Semron, a German startup, aims to revolutionize the computer chip industry by replacing transistors with "memcapacitors." Their 3D-stacked memcapacitor design promises improved AI performance and reduced energy consumption. With $10 million in funding, Semron aims to provide efficient chips for mobile and edge AI applications. 💎 AI Startup Anthropic Suffers Data Breach Amid Regulatory Scrutiny: Anthropic, the creator of LLM, disclosed an accidental data leak by a contractor to a third party. Concurrently, the FTC initiated an inquiry into Anthropic's affiliations with Amazon and Google, fueling concerns about data privacy in the expanding LLM landscape. 💎 New AI Coding Tool Surpasses Previous Models: AlphaCodium, an open-source AI model by CodiumAI, surpasses Google's AlphaCode and AlphaCode 2 in code generation efficiency. It employs a "flow engineering" approach, emphasizing code integrity through iterative code generation and adversarial testing, aiming to advance upon DeepMind's work and assist global developers.  New AI Model Launches: 💎 Google Unveils New AI Model for Advanced Video Generation: Google has unveiled Lumiere, an advanced AI model utilizing space-time diffusion to create lifelike videos from text or images. It addresses motion and consistency problems in AI video generation, yielding 5-second clips with smooth motion. While not yet available for testing, experts anticipate Lumiere could redefine AI video creation capabilities. 💎 Meta Unleashes Mighty Code Llama 70B: Meta has launched Code Llama 70B, a powerful AI model for automating software development. With 500 billion training examples, it excels in translating natural language instructions into functional code, surpassing previous models. 
Meta aims to boost global coding and expand its use in translation, documentation, and debugging. 💎 OpenAI Announces Major Model and API Updates: OpenAI launched new AI models and tools, including enhanced embeddings, GPT-4, and moderation models. They lowered GPT-3.5 Turbo pricing, introduced key management features, and per-key analytics to enhance user control and accessibility, bolstering their technology's capabilities. AI in Healthcare: 💎 AI Helps Design Proteins for Improved Gene Therapy Delivery: University of Toronto scientists created ProteinVAE, an AI model to design unique protein variants, aiming to evade immune responses in gene therapy. By re-engineering adenovirus hexon proteins, they generate novel sequences for improved safety and efficacy, with faster, cost-effective design. Successful experiments could enhance gene therapy. 💎 AI's Diagnostic Prowess Advancing Rapidly: A study assessed GPT-3.5 and GPT-4's ability to mimic doctors' diagnostic reasoning. AI gave mostly accurate responses to various prompts, suggesting potential for assisting physicians, provided clinicians grasp the AI's response generation process.  AI in Supply Chain Management: 💎 AI Transforming Global Supply Chain Management: A recent study revealed that 98% of executives consider AI crucial for enhancing supply chain operations. AI aids in cost reduction through inventory optimization, transportation, and trade expense management. It particularly benefits inventory management, a vital cost-cutting aspect, as businesses increasingly adopt AI and data-driven solutions to overcome ongoing supply chain challenges and achieve strategic objectives like cost reduction and revenue growth. 💎 Edge AI's Potential in Logistics Faces Memory Limitations: AI and edge computing offer potential for efficient supply chain management through real-time decision-making. However, data processing at the edge strains memory. MRAM tech may alleviate limitations. As data increases, novel storage solutions are essential for maximizing edge AI's logistics benefits, hinging on memory improvements.  🔮 Expert Insights from Packt Community Complete Python Course with 10 Real-World Projects [Video] - By Ardit Sulce This Python course benefits both beginners and experienced AI developers by thoroughly covering Python's versatility in supporting different programming paradigms. It begins with a concise introduction and explores fundamental to advanced Python techniques. For beginners, the initial 12 sections provide a strong foundation in Python basics. Experienced developers can sharpen their skills by exploring intermediate and advanced concepts such as OOPS, classes, lists, modules, functions, and JSON. Additionally, the course extends into practical application by teaching the use of essential libraries like Matplotlib and NumPy, web development with Flask, and even Android APK file manipulation.  Database handling and geographical app development are also included, enriching the skill set of both novices and experts. The course culminates with the creation of ten practical applications, ranging from a volcano web map generator to data analysis dashboards, mobile apps, web scraping tools, and more. Ultimately, you will gain the ability to independently create executable Python programs, master coding syntax, and achieve comprehensive proficiency in Python programming, making this course highly recommended for students at all levels of experience. 
Plus, you can access project resources at: https://github.com/PacktPublishing/Complete-Python-Course-with-10-Real-World-Projects.Discover the "Complete Python Course with 10 Real-World Projects [Video]" by Ardit Sulce, published in February 2023. Get a comprehensive overview of the course content through a complete chapter preview video. Alternatively, access the entire Packt digital library, including this course and more, with a 7-day free trial. To access more insights and additional resources, click the button below.Watch Here 🌟 Secret Knowledge: AI/LLM Resources💎 Assessing AI Accuracy: New Leaderboard Evaluates Language Models: The Hallucinations Leaderboard assesses Language Models (LLMs) for generating incorrect information. It includes diverse tasks to test accuracy. Initial findings reveal variations in model performance, aiding in the pursuit of more reliable and less hallucination-prone LLMs. 💎 New Techniques to Boost AI Performance: Google researchers have created innovative software techniques for optimizing mixed-input matrix multiplication, a critical AI computation. They utilize register shuffling and efficient data type conversions to map mixed-input tasks onto hardware-accelerated Tensor Cores with minimal overhead. On NVIDIA GPUs, their approach matches the performance of hardware-native operations, offering potential solutions to computational challenges in advancing AI technology. 💎 Neural Networks Learn Non-Linear Functions Through Rectified Activation: The article explains how neural networks approximate complex functions, especially with Rectified Linear Unit (ReLU) activation. ReLU enables multiple neurons to represent linear and curved functions, emphasizing proper architecture over more layers for accuracy without overfitting, revealing neural networks' expressive capabilities. 💎 Enhancing Conversations with Contextual Understanding: The article explores enhancing conversational agents' contextual understanding in multi-turn conversations. It utilizes models like FLAN and Llama to assess question relationships, incorporates context from prior questions, and reevaluates responses to provide more comprehensive and successful conversation outcomes. These methods showed promising results in testing. 🔛 Masterclass: AI/LLM Tutorials💎 Fine-Tuning BERT for Long-Form Text: This article discusses how to use powerful NLP models like BERT to analyze long passages. BERT works by splitting lengthy reviews into chunks that fit its 512-token limit. It first pre-trains on general text, then fine-tunes on a task. To classify a long movie review, it is chopped into pieces, tagged with IDs, and stacked into the model. Results are pooled to get an overall sentiment.  💎 Training Giant Language Models the Cost-Effective Way: This guide explains how to enhance AI model training on AWS using Trainium and EKS. Set up a Kubernetes cluster with Trainium chips, preprocess data with Llama2's tokenizer, and optimize the model with compilation jobs. Monitor training across nodes, checking logs, utilization, and using Tensorboard for cost-effective training of large models. 💎 Optimizing AI Models for Speed and Scale: This blog advises on efficiently setting up and optimizing large language models for speed and scalability. It emphasizes hardware and configuration choices tailored to each model's requirements, balancing factors like response time and concurrent user capacity. Benchmarks demonstrate how GPU usage, model distribution, and request volume affect performance. 
Proper tuning enables AI assistants to engage in concurrent conversations with many users without excessive delays. 💎 Estimating Depth from a Single Image: Researchers Evaluate AI Models: Researchers evaluated neural networks' monocular depth estimation capabilities using single photos and RGB-depth map datasets. DPT and Marigold models were trained and compared, with DPT outperforming in accuracy based on RMSE and SSIM metrics. While promising, further enhancements are needed in this area.  🚀 HackHub: Trending AI Tools💎 rasbt/LLMs-from-scratch: Incrementally build a ChatGPT-like LLM from scratch for educational purposes.💎 LaurentMazare/mamba.rs: Pure Rust version of the Mamba inference algorithm with minimal dependencies for efficient sequence modeling. 💎 0nutation/speechgpt: Allows multi-modal conversations and SpeechGPT-Gen for scaling chain-of-information speech generation.💎 naver/roma: PyTorch library offering useful tools for working with 3D rotations, including conversions between rotation representations, metrics, interpolation, and differentiation capabilities.


BI-Pro#49: Microsoft Fabric Lifecycle Management, Data Factory Adds CI/CD to Fabric Data Pipelines, Database Mirroring, AWS Well-Architected Data Analytics Lens

Merlyn Shelley
04 Apr 2024
11 min read
Subscribe to our BI Pro newsletter for the latest insights. Don't miss out – sign up today!👋 Hello,Welcome to BI-Pro #49, your ultimate guide to data and BI insights! 🚀 ⏩ What's Inside? Python Simplified: Master data validation with Pydantic. Visualize Like a Pro: 30+ tools for stunning data visuals. R for Bioinformatics: Custom visuals for bio data. Interactive Data: JavaScript meets Handsontable. Seaborn Stories: Craft data tales with line plots. MetaGPT Insights: Next-gen data solutions unveiled. 🏭 Industry Scoop: Power BI’s Latest: March's must-know features. Fabric Innovations: Updates and new tools from Microsoft Fabric. AWS Well-Architected Data Analytics Lens: Analytics strategies for the real world. Google Cloud Savings: Cut costs on ETL workflows. Tableau Journeys: From student to BI analyst. 💎 Expert Takes: Deep Dive into Python Deep Learning: The latest from Packt. 👉 Community Buzz: Twitch Chat Analysis, Graph Networks, LLM Data Quality, and Ethical AI: Key conversations this week! Dive into the trends shaping data and BI today! 📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition."📣 And here's the twist – we're tuning into YOUR frequency! Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box – because your journey of discovery is our blueprint.We appreciate your input and hope you enjoy the book!Share your thoughts and opinions here! Cheers,Merlyn ShelleyEditor-in-Chief, PacktThanks for reading Packt BI-Pro! Subscribe for free to receive new posts and support our work.Pledge your supportSign Up | Advertise | Archives🚀 GitHub's Most Sought-After Repos🌀 man-group/ArcticDB: ArcticDB is a high-performance DataFrame database designed for Python Data Science, with a Python-centric API for Pandas DataFrames. 🌀 gradio-app/gradio: Gradio is an open-source Python package for building demos or web apps for ML models or Python functions, with easy sharing via built-in features. 🌀 Sinaptik-AI/pandas-ai: PandasAI is a Python library using generative AI to explore, clean, and analyze data with natural language queries.🌀 OpenRefine/OpenRefine: OpenRefine is a powerful Java-based tool for loading, understanding, cleaning, reconciling, and augmenting data, accessible from a web browser. 🌀 Kanaries/pygwalker: PyGWalker simplifies Jupyter Notebook workflows by converting pandas dataframes into interactive user interfaces for data analysis and visualization. 🌀 cleanlab/cleanlab: cleanlab aids in data and label cleaning by identifying issues in ML datasets automatically, enabling better model training with real-world data.Email Forwarded? Join BI-Pro Here!🔮 Data Viz with Python Libraries  🌀 Pydantic Tutorial: Data Validation in Python Made Simple. This blog tutorial explains how to use Pydantic, a data validation and serialization library in Python, to validate and serialize data classes, offering support for custom validators and Python's type hints for field validation. 🌀 30+ Data Visualization Libraries, Frameworks and Apps, Mastering Data Presentation: Explore over 30 data visualization tools like Metabase, Gephi, and Grafana, offering a range of features to transform raw data into meaningful visualizations for better decision-making in industries like tech, healthcare, finance, and marketing. 
🌀 Mastering Data Visualization in R for Bioinformatics:  The article delves into data visualization in R for bioinformatics, stressing its role in understanding complex biological data, communicating findings, hypothesis generation, and decision-making. It also discusses Anscombe's Quartet, highlighting the importance of visualizing data before analysis and the limitations of summary statistics. 🌀 Integrating JavaScript charting libraries with Handsontable: The article guides developers on integrating Highcharts, Recharts, and Chart.js with Handsontable for data visualization. It explains the features of each library and provides demos for creating a stock portfolio with interactive charts. 🌀 Data Visualization with Seaborn Line Plot: The article introduces Seaborn, a Python library for data visualization, built on top of Matplotlib. It covers installation and demonstrates creating single line plots and customizing styles for better presentation of data. 🌀 MetaGPT’s Data Interpreter: SOTA Open Source LLM-based Data Solutions. MetaGPT introduces its Data Interpreter, a new agent for streamlined data interpretation and analysis. The Data Interpreter employs advanced techniques for real-time data adaptability, tool integration, and logical inconsistency identification, showcasing superior performance in machine learning tasks. ⚡Stay Informed with Industry HighlightsPower BI 🌀 Power BI March 2024 Feature Summary: The Power BI update introduces visual calculation editing, data model editing in the Power BI Service, and report subscription delivery to OneDrive SharePoint. A new Microsoft Fabric certification exam, DP-600, is also available, with free certification opportunities through the Fabric AI Skills Challenge. 🌀 Announcing the Public Preview of Database Mirroring in Microsoft Fabric: Mirroring, now in Public Preview, allows seamless integration of databases into Microsoft Fabric's OneLake, providing real-time insights without ETL. It simplifies data replication and warehousing, enabling easy data access and analysis across different sources, including data lakes and warehouses. 🌀 Get data with Power Query available in Power BI Report Builder (Preview): Power BI Report Builder now allows connecting to 100+ data sources like Snowflake, Databricks, and AWS Redshift. You can transform data using M-Query for paginated reports. Install the latest version and connect from the "Data" tab. Microsoft Fabric🌀 Microsoft Fabric March 2024 Update: This update brings new features like OneLake File Explorer, Autotune Query Tuning, and Test Framework for Power Query SDK in VS Code to Power BI, enhancing reporting, modeling, service, mobile, and developer experiences. 🌀 Data Factory Adds CI/CD to Fabric Data Pipelines: Fabric engineers with Azure Synapse Analytics and Azure Data Factory experience can now utilize Git integration and built-in Deployment Pipelines in Data Factory data pipelines in Fabric. This public preview offers source control, CI/CD features, and collaborative development environments, enhancing data analytics projects. 🌀 Microsoft Fabric Lifecycle Management – Getting started with Git Integration and Deployment Pipelines: Microsoft Fabric makes Lifecycle Management easy, enabling continuous releases through Git and Deployment Pipelines. Git allows reliable updates for supported items like Lakehouse, Notebooks, and Reports, while Deployment Pipelines clone content between stages like DEV, TEST, UAT, and PROD. 
AWS BI  🌀 Announcing the AWS Well-Architected Data Analytics Lens: The Data Analytics Lens helps assess and improve analytics platforms on AWS. It offers best practices, such as building ACID-compliant data lakes and leveraging Serverless for data pipelines, aligned with the AWS Well-Architected Framework's pillars for secure, efficient, and cost-effective solutions. 🌀 Improve healthcare services through patient 360: A zero-ETL approach to enable near real-time data analytics. The post discusses how healthcare providers can improve patient care by leveraging AWS services for real-time analytics and personalized healthcare, focusing on a zero-ETL approach to data integration.Google Cloud Dat🌀 Enrich streaming data in Bigtable with Dataflow: The post discusses the importance of event stream processing in data engineering and introduces Apache Beam's Enrichment transform, which simplifies the process of enriching streaming data with Bigtable, improving data context and enabling more meaningful analysis.🌀 Dataflow at-least-once vs. exactly-once streaming modes: The post compares exactly-once and at-least-once processing modes in Dataflow Streaming Engine for streaming jobs. It explains the trade-offs between the two modes and provides guidance on choosing the right mode based on use case requirements. Tableau🌀 Data is both art and science - My Tableau Story: Andy Cotgreave. The post highlights Andy Cotgreave's journey from a data analyst at Oxford to becoming a Senior Technical Evangelist at Tableau. It emphasizes the importance of community engagement, innovation, building a portfolio, and having fun in data visualization. 🌀 Student to BI Analyst, How Tableau Can Lead to a Successful Data Career: This blog discusses Karolina Grodzinska's data visualization journey, from discovering Tableau to winning Iron Viz: Student Edition and becoming a Business Intelligence Analyst at Schneider Electric. Karolina emphasizes the importance of an active Tableau Public profile in career development and shares tips for building a strong portfolio and networking with the Tableau Community. ✨ Expert Insights from Packt CommunityPython Deep Learning - Third Edition - By Ivan VasilevDeveloping NN models for edge devices with TF Lite TF Lite is a TF-derived set of tools that allows us to run models on mobile, embedded, and edge devices. Its versatility is part of TF’s appeal for industrial applications (as opposed to research applications, where PyTorch dominates).The key paradigm of TF Lite is that the models run on-device, contrary to client-server architecture, where the model is deployed on remote, more powerful, hardware. This organization has the following implications (both good and bad): Low-latency execution: The lack of server-round trip significantly reduces the model inference time and allows us to run real-time applications. Privacy: The user data never leaves the device. Internet connectivity: Internet connectivity is not required. Small model size: The devices have limited computational ability, hence the need for small and computationally efficient models. More specifically, TF Lite models are stored in the FlatBuffers (https://flatbuffers.dev/) special efficient portable format, identified by the .tflite file extension. Besides its small size, it allows us to access data directly without parsing/unpacking it first. TF Lite models support a subset of the TF Core operations and allow us to define custom ones: Low power consumption: The devices often run on battery. 
Divergent training and inference: NN training is a lot more computationally intensive compared to inference. Because of this, the model training runs on a different, more powerful, piece of hardware than the actual devices, where the models will run inference. In addition, TF Lite has the following key features: Multi-platform and multi-language support, including Android (Java), iOS (Objective-C and Swift) devices, web (JavaScript), and Python for all other environments. Google provides a TF Lite wrapper API called MediaPipe Solutions (https://developers.google.com/mediapipe, https://github.com/google/mediapipe/), which supersedes the previous TF Lite API. Optimized for performance. It has end-to-end solution pipelines. TF Lite is oriented toward practical applications, rather than research. Because of this, it includes different pipelines for common ML tasks such as image classification, object detection, text classification, and question answering among others. The computer vision pipelines use modified versions of EfficientNet or MobileNet, and the natural language processing pipelines use BERT-based models. So, how does TF Lite model development work? First, we’ll select a model in one of the following ways:  An existing pre-trained .tflite model (https://tfhub.dev/s?deployment-format=lite). Use MediaPipe Model Maker (https://developers.google.com/mediapipe/solutions/model_maker) to apply feature engineering transfer learning on an existing .tflite model with a custom training dataset. Model Maker only works with Python. Convert a full-fledged TF model into .tflite format. Discover more insights from 'Python Deep Learning - Third Edition' by Ivan Vasilev. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today! Read Here💡 What's the Latest Scoop from the BI Community? 🌀 Real-Time Twitch Chat Sentiment Analysis with Apache Flink: This blog explores building a real-time sentiment analysis application for Twitch chat using Apache Flink. It covers setting up the project, reading Twitch chat messages, performing sentiment analysis, and concludes with a demo. 🌀 Entity Type Prediction with Relational Graph Convolutional Network (PyTorch): This post discusses a Python setup for predicting entity types on heterogeneous graphs using the Relational Graph Convolutional Network (R-GCN) and the RGCNConv module from PyTorch. It explains knowledge graphs, entity type prediction, and the R-GCN model. 🌀 Data Quality Error Detection powered by LLMs: This article explores automating the identification of data errors in tabular datasets using Large Language Models (LLMs). It discusses the Data Dirtiness Score, challenges in data cleaning, and the potential of LLMs in detecting data quality issues. 🌀 Building Ethical AI Starts with the Data Team — Here’s Why: This article discusses the ethical considerations of AI, focusing on model bias, AI usage, and data responsibility. It emphasizes the role of data teams in ensuring ethical AI and suggests steps for data teams to take towards a more ethical future. See you next time!


Implementing a RAG-enhanced CookBot - Part 1

Bahaaldine Azarmi, Jeff Vestal
15 Jan 2024
10 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book Vector Search for Practitioners with Elastic, by Bahaaldine Azarmi and Jeff Vestal. Optimize your search capabilities in Elastic by operationalizing and fine-tuning vector search, and enhance your search relevance while improving overall search performance.

Introduction

Everyone knows the age-old culinary dilemma, "What can I cook with the ingredients I have?" Many people need help when faced with an array of ingredients but a lack of inspiration or knowledge to whip up a dish. This everyday issue was the spark for our idea – CookBot.

CookBot is not just any AI. It's conceived as an advanced culinary assistant that not only suggests recipes based on the available ingredients but also understands the nuances of user queries, adapts to individual dietary preferences and restrictions, and generates insightful culinary recommendations.

Our objective was to infuse CookBot with RAG, ELSER, and RRF technologies. These technologies are designed to enhance the semantic understanding of queries, optimize information retrieval, and generate relevant, personalized responses. By harnessing the capabilities of these advanced tools, we aimed for CookBot to be able to provide seamless, context-aware culinary assistance tailored to each user's unique needs.

Figure: CookBot powered by Elastic

Dataset overview – an introduction to the Allrecipes.com dataset

The Allrecipes.com dataset, in its raw CSV format, is a treasure trove of diverse and detailed culinary information. Thus, it is the perfect foundation to train our CookBot. It houses an extensive range of recipes, each encapsulated in a unique entry brimming with an array of information.

You can find and download the dataset here, as it will be used later in the chapter: https://www.kaggle.com/datasets/nguyentuongquang/all-recipes

To illustrate the richness of this dataset, let's consider a single entry:

"group","name","rating","n_rater","n_reviewer","summary","process","ingredient"
"breakfast-and-brunch.eggs.breakfast-burritos","Ham and Cheese Breakfast Tortillas",0,0,44,"This is great for a special brunch or even a quick and easy dinner. Other breakfast meats can be used, but the deli ham is the easiest since it is already fully cooked.","prep: 30 mins,total: 30 mins,Servings: 4,Yield: 4 servings","12 eggs + ⅓ cup milk + 3 slices cooked ham, diced + 2 green onions, minced + salt and pepper to taste + 4 ounces Cheddar cheese, shredded + 4 (10 inch) flour tortillas + cup salsa"

Each entry in the dataset represents a unique recipe and encompasses various fields:

group: This is the categorization of the recipe. It provides a general idea about the type and nature of the dish.

name: This is the title or name of the recipe. This field straightforwardly tells us what the dish is.
rating and n_rater: These fields indicate the popularity and approval of the dish among users.

n_reviewer: This is the number of users who have reviewed the recipe.

summary: This is a brief overview or description of the dish, often providing valuable context about its taste, usage, or preparation.

process: This field outlines crucial details such as preparation time, total cooking time, servings, and yield.

ingredient: This is a comprehensive list of all the ingredients required for the recipe, along with their quantities.

The detailed information offered by each field gives us a broad and varied information space, aiding the retriever in navigating the data and ensuring the generator can accurately respond to a diverse range of culinary queries. As we move forward, we will discuss how we indexed this dataset using Elasticsearch, the role of ELSER and RRF in effectively retrieving data, and how the GPT-4 model generates relevant, personalized responses based on the retrieved data.

Preparing data for RAG-enhanced search

To transform the Allrecipes.com data into a searchable database, we first need to parse the CSV file and subsequently create an Elasticsearch index where data will be stored and queried. Let's walk through this process as implemented in the Python code.

Connecting to Elasticsearch

First, we need to establish a connection with our Elasticsearch instance. This connection is handled by the Elasticsearch object from the elasticsearch Python module:

from elasticsearch import Elasticsearch

es = Elasticsearch()

In this case, we assume that our Elasticsearch instance runs locally with default settings. If it doesn't, we will need to provide the appropriate host and port information to the Elasticsearch class.

Defining the index

The next step is to define an index where our recipes will be stored. An index in Elasticsearch is like a database in traditional database systems. In this case, we'll call our index recipes:

index_name = 'recipes'

Creating the mapping

Now, we need to create a mapping for our index. A mapping is like a schema in a SQL database and defines the types of each field in the documents that will be stored in the index. We will define the mapping as a Python dictionary:

mapping = {
    "mappings": {
        "properties": {
            "group": {"type": "text"},
            "name": {"type": "text"},
            "rating": {"type": "text"},
            "n_rater": {"type": "text"},
            "n_reviewer": {"type": "text"},
            "summary": {
                "type": "text",
                "analyzer": "english"
            },
            "process": {"type": "text"},
            "ingredient": {"type": "text"},
            "ml.tokens": {"type": "rank_features"}
        }
    }
}

Here, all fields are defined as text, which means they are full-text searchable. We also specify that the summary field should be analyzed using the English analyzer, which helps optimize searches over English text by taking into account things such as stemming and stop words. Finally, we create the field that ELSER will use to store the token set, which is the result of the expansion ELSER performs on the terms passed to it.

Creating the index

Once we've defined our mapping, we can create the index in Elasticsearch with the following:

es.indices.create(index=index_name, body=mapping)

This sends a request to Elasticsearch to create an index with the specified name and mapping.
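One small practical note that is not part of the excerpt: if you re-run the script, the create call fails because the index already exists. A guard along these lines keeps the script re-runnable, using the same es, index_name, and mapping defined above:

# Create the index only if it is not already there, so repeated runs don't fail.
if not es.indices.exists(index=index_name):
    es.indices.create(index=index_name, body=mapping)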
Reading the CSV file

With our index ready, we can now read our dataset from the CSV file. We'll use pandas, a powerful data manipulation library in Python, to do this:

import pandas as pd

with open('recipe_dataset.csv', 'r', encoding='utf-8', errors='ignore') as file:
    df = pd.read_csv(file)

This code opens the CSV file and reads it into a pandas DataFrame, a two-dimensional tabular data structure that's perfect for manipulating structured data.

Converting to dictionaries

To index the data into Elasticsearch, we need to convert our DataFrame into a list of dictionaries, where each dictionary corresponds to a row (i.e., a document or recipe) in the DataFrame:

recipes = df.to_dict('records')
print(f"Number of documents: {len(recipes)}")

At this point, we have our dataset ready to index in Elasticsearch. However, considering the size of the dataset, it is advisable to use the bulk indexing feature for efficient data ingestion. This will be covered in the next section.

Bulk indexing the data

Let's look at the step-by-step process of bulk indexing your dataset in Elasticsearch.

Defining the preprocessing pipeline

Before we proceed to bulk indexing, we need to set up a pipeline to preprocess the documents. Here, we will use the elser-v1-recipes pipeline, which utilizes the ELSER model for semantic indexing. The pipeline is defined as follows:

[
  {
    "inference": {
      "model_id": ".elser_model_1",
      "target_field": "ml",
      "field_map": {
        "ingredient": "text_field"
      },
      "inference_config": {
        "text_expansion": {
          "results_field": "tokens"
        }
      }
    }
  }
]

The pipeline includes an inference processor that uses the pre-trained ELSER model to perform semantic indexing. It maps the ingredient field from the recipe data to the text_field input of the ELSER model. The output (the expanded tokens from the ELSER model) is stored in the tokens field under the ml field in the document.
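The excerpt shows the pipeline definition but not the call that registers it in the cluster. If you are following along with the Python client, registering it might look roughly like the sketch below; the pipeline name and processor body match the excerpt, while the exact keyword arguments can vary with your elasticsearch-py version:

# Register the ingest pipeline under the name used later by the bulk indexing requests.
es.ingest.put_pipeline(
    id="elser-v1-recipes",
    description="Expand the ingredient field into ELSER tokens",
    processors=[
        {
            "inference": {
                "model_id": ".elser_model_1",
                "target_field": "ml",
                "field_map": {"ingredient": "text_field"},
                "inference_config": {"text_expansion": {"results_field": "tokens"}}
            }
        }
    ]
)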
Creating a bulk indexing sequence

Given the size of the Allrecipes.com dataset, it's impractical to index each document individually. Instead, we can utilize Elasticsearch's bulk API, which allows us to index multiple documents in a single request. First, we need to generate a list of dictionaries, where each dictionary corresponds to a bulk index operation:

bulk_index_body = []
for index, recipe in enumerate(recipes):
    document = {
        "_index": "recipes",
        "pipeline": "elser-v1-recipes",
        "_source": recipe
    }
    bulk_index_body.append(document)

In this loop, we iterate over each recipe (a dictionary) in our recipes list and construct a new dictionary with the necessary information for the bulk index operation. This dictionary specifies the name of the index where the document will be stored (recipes), the ingest pipeline to be used to process the document (elser-v1-recipes), and the document source itself (recipe).

Performing the bulk index operation

With our bulk_index_body array ready, we can now perform the bulk index operation. Note that the bulk helper and its error type need to be imported from the Elasticsearch library:

from elasticsearch import helpers
from elasticsearch.helpers import BulkIndexError

try:
    response = helpers.bulk(es, bulk_index_body, chunk_size=500, request_timeout=60*3)
    print("\nRESPONSE:", response)
except BulkIndexError as e:
    for error in e.errors:
        print(f"Document ID: {error['index']['_id']}")
        print(f"Error Type: {error['index']['error']['type']}")
        print(f"Error Reason: {error['index']['error']['reason']}")

We use the helpers.bulk() function from the Elasticsearch library, passing our Elasticsearch connection (es) and the bulk_index_body array we just created, with a chunk_size value of 500 (which specifies that we want to send 500 documents per request) and a request_timeout value of 180 seconds, which allows each request to take up to 3 minutes before timing out, because the indexing can take a long time with ELSER.

The helpers.bulk() function returns a response indicating the number of operations attempted and the number of errors, if any.

If any errors occur during the bulk index operation, they are raised as a BulkIndexError. We can catch this exception and iterate over its errors attribute to get information about each individual error, including the ID of the document that caused the error, the type of error, and the reason for it.

At the end of this process, you will have successfully indexed your entire Allrecipes.com dataset in Elasticsearch, ready for it to be retrieved and processed by your RAG-enhanced CookBot.

Conclusion

In closing, the infusion of RAG, ELSER, and RRF technologies into CookBot elevates culinary exploration. With Elasticsearch indexing and the Allrecipes.com dataset, CookBot transcends traditional kitchen boundaries, offering personalized, context-aware assistance. This journey signifies the convergence of cutting-edge AI and the rich tapestry of culinary possibilities. As CookBot orchestrates flavor symphonies, the future of cooking is redefined, promising a delightful harmony of technology and gastronomy for every user. Embrace the evolution – where CookBot's intelligence transforms mere ingredients into a canvas for culinary innovation.

Author Bio

Bahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.

Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.


Leveraging Google Cloud for Custom Endpoint with OpenAI

Henry Habib
20 Feb 2024
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book OpenAI API Cookbook, by Henry Habib. Integrate the ChatGPT API into various domains, ranging from simple wrappers to knowledge-based assistants, multi-model, and conversational applications.

Introduction

In the realm of application development, the integration of OpenAI's capabilities through custom backend endpoints on Google Cloud is a pivotal step towards unlocking intelligent solutions. This article explores the process of creating such endpoints using Google Cloud's Cloud Functions, allowing for control, customization, and the seamless integration of OpenAI's features. Through this fusion of technologies, developers can craft innovative applications that leverage the power of artificial intelligence to enhance user experiences and drive creativity in their projects.

Creating a public endpoint server that calls the OpenAI API

There are many important benefits to creating your own public endpoint server that calls the OpenAI API, instead of connecting to the OpenAI API directly – the biggest being control and customization, which we will explore in this recipe and the next one.

In this recipe, we will use GCP to host our public endpoint. When this endpoint is called, it will make a request to OpenAI for a slogan for an ice cream company and then return the answer to the user. This sounds simple and almost unnecessary for a public endpoint, but it is the final step we need to build a truly intelligent application that leverages OpenAI.

To do this, we will create a GCP resource called a Cloud Function, which we will explore later in the How it works… section of the recipe.

Getting ready

Ensure you have an OpenAI platform account with available usage credits. If you don't, please follow the Setting up your OpenAI API Playground environment recipe in Chapter 1. Furthermore, ensure you have created a GCP account. To do this, navigate to https://cloud.google.com/, then select Start Free from the top right, and follow the instructions that you see. You may need to provide a billing profile as well to create any GCP resources. Note that GCP does have a free tier, and in this recipe, we will not go above the free tier (so, essentially, you should not be billed for anything).

You may need to create a project if this is your first time logging into Google Cloud Platform. After you log in, select Select a project from the top left and then select New Project. Provide a project name and then select Create.

The next recipe in this chapter has this same requirement.

How to do it…

1. Navigate to https://console.cloud.google.com/. In the Search field at the top of the page, type in Cloud Functions and select the top choice from the drop-down menu, Cloud Functions.

Figure – Cloud Functions in the dropdown

2. Select Create Function from the top of the page. This will begin to create our custom backend endpoint and start the configuration steps. On the Configuration page, fill in the following:

Environment: Select 2nd gen from the drop-down menu.

Function name: Since we're creating a backend endpoint that will produce company slogans, the function name will be slogan_creator.

Region: Choose the environment location nearest you.

In the Trigger menu, choose HTTPS. In the Authentication sub-menu, select Allow unauthenticated invocation.
We need to check this, as we are going to create a public endpoint that will be accessible from our frontend services.

Figure – Sample configuration settings of a Google Cloud Function

3. Select the Next button at the bottom of the page to move on to the Code section.

4. From the Runtime dropdown, select Python 3.12. This ensures that our backend endpoint will be coded using the Python programming language.

5. For the Entry point option, type in create_slogan. This refers to the name of the Python function that is called when the public endpoint is reached and triggered.

6. On the left-hand side menu, you will see two files: main.py and requirements.txt. Select the requirements.txt file. This lists all the Python packages that need to be installed for our Cloud Function to operate.

7. In the center of the screen, where the contents of requirements.txt are displayed, enter a new line and type in openai. This ensures that the latest openai library package is installed. Your screen should look like what's displayed in the figure below.

Figure – Snapshot of the requirements.txt file

8. From the left-hand side menu, select main.py. Copy and paste the following code into the center of the screen (where the content of that file is displayed). These are the instructions that the public endpoint will run when it is triggered:

import functions_framework
from openai import OpenAI

@functions_framework.http
def create_slogan(request):
    client = OpenAI(api_key='<API Key here>')
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "You are an AI assistant that creates one slogan based on company descriptions"
            },
            {
                "role": "user",
                "content": "A company that sells ice cream"
            }
        ],
        temperature=1,
        max_tokens=256,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0
    )
    return response.choices[0].message.content

As you can see, it simply calls the OpenAI endpoint, requests a chat completion, and then returns the output to the user. You will also need your OpenAI API key.

9. Next, deploy the function by selecting the Deploy button at the bottom of your page.

10. Wait for your function to be fully deployed, which typically takes two minutes. You can verify whether the function has been deployed by observing the progress in the top-left section of the page (shown in the figure below). Once it is green and checkmarked, the build is successful and your function has been deployed.

Figure – The Cloud Function deployment page

11. Now, let's verify that our function works. Select the endpoint URL, found at the top of the page near URL. It's typically in the form https://[location]-[project-name].cloudfunctions.net/[function-name]. It is also highlighted in the figure above.

12. This will open a new web page that triggers our custom public endpoint and returns a chat completion, which, in this case, is the slogan for an ice cream business. Note that this is a public endpoint – it will work on your computer, phone, or any device connected to the internet.

Figure – Output of a Google Cloud Function
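Besides opening the URL in a browser, you can also call the endpoint from code as a quick sanity check. This is not part of the recipe itself, and the URL below is only a placeholder following the format shown above – substitute your own function's URL:

import requests

# Placeholder URL – replace it with the URL displayed on your Cloud Function's page.
url = "https://us-central1-your-project.cloudfunctions.net/slogan_creator"

response = requests.get(url, timeout=30)
print(response.status_code)  # 200 means the function executed successfully
print(response.text)         # the generated ice cream slogan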
How it works…
In this recipe, we created a public endpoint. This endpoint can be accessed by anyone (including your application in future recipes). The logic of the endpoint is simple and something we have covered prior: return a slogan for a company that sells ice cream. What’s new, however, is that this is our very own public endpoint that is hosted in Google Cloud, using the Cloud Function resource.
Note that we used the free tier of Google Cloud Functions, which does have limitations, such as a cap on the number of function invocations per month, limited execution time, and constrained computational resources. However, for our current purposes, these limitations are not a hindrance, allowing us to deploy and test our functions effectively without incurring costs. This setup is ideal for small-scale applications or for learning and experimentation purposes, providing a practical way to understand cloud functionalities and serverless architecture in a cost-effective manner.
Conclusion
In conclusion, the synergy between Google Cloud's infrastructure and OpenAI's capabilities offers developers a powerful platform for creating intelligent applications. By leveraging Cloud Functions to build custom backend endpoints, developers can unlock a world of possibilities for innovation and creativity. This article has provided a comprehensive guide to integrating OpenAI into Google Cloud, empowering developers to craft intelligent solutions that enhance user experiences and drive the evolution of application development. With this knowledge, developers are well-equipped to embark on their journey of building intelligent applications that push the boundaries of what is possible in the digital landscape.
Author Bio
Henry Habib is a Manager at one of the world's top management consulting firms, advising F500 companies on analytics and operations, with a particular focus on building intelligent AI-driven solutions and tools to create impact. He is a passionate online instructor and educator, amassing a following of more than 150K paid students and facilitating technical programs at large banks and governmental organizations. A proponent of the no-code and LLM revolution, he believes that anyone can now create powerful and intelligent applications without any deep technical skills. Henry resides in Toronto, Canada with his wife, and enjoys reading AI research papers and playing tennis in his free time.

AI_Distilled 37: Cutting-Edge Updates and Expert Guidance

Merlyn Shelley
16 Feb 2024
10 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!👋 Hello ,“[Sovereign AI] codifies your culture, your society's intelligence, your common sense, your history – you own your own data…[Use AI to] activate your industry, build the infrastructure, as fast as you can.” - Jensen Huang, NVIDIA founder and CEO Huang strongly advocated for countries to rapidly develop their own national AI capabilities and systems. NVIDIA's dominance in GPUs positions it to be a major beneficiary of the AI revolution as its technologies are fundamental for running advanced AI applications. No wonder NVIDIA's market value recently surpassed Amazon’s. Embark on a new AI journey with AI_Distilled, a curated digest of the most recent developments in AI/ML, LLMs, NLP, GPT, and Gen AI. We’ll kick things off by tapping into the latest news and developments in the AI sector: OpenAI updates ChatGPT with memory retention Microsoft unveils new Copilot features Google updates Gemini and unveils mobile app Apple's new AI model called MGIE New open-source AI model Aya converses in 100+ languages Reka AI introduces two new state-of-the-art AI models DeepMind and USC develop new technique to improve LLMs’ reasoning abilities New open-source AI model Smaug-72B achieves top spot NVIDIA unveils new chatbot Chat with RTX AI helps identify birds and conserve an English wetland USPTO issues new guidance stating AI alone can't be named as an inventor We’ve also handpicked GPT and LLM resources and secret knowledge that’ll come in handy for your next project:  Building a Scalable Foundation Model Platform for Your Enterprise Making Bridges Between AI and Business Evaluating Large Language Models: A Guide to Benchmark Tests Code Generation Gets Smarter with Context Looking for hands-on tips and strategies straight from the developer community? We’ve got you covered with some incredible tutorials to get you started: Building a Question Answering Bot from Scratch Creating SMS Apps with Next.js and AI Assistants Harness the Power of LLMs Without GPUs Making the Switch to Open-Source Models Finally, feel free to check out our curated list of smoking hot GitHub repositories. arplaboratory/learning-to-fly time-series-foundation-models/lag-llama noahfarr/rlx uclaml/SPIN phidatahq/phidata  📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." 📣 And here's the twist – we're tuning into YOUR frequency! Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box – because your journey of discovery is our blueprint.We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & AnalysisOpenAI (whose annual revenue hit $2 billion in December) has updated its ChatGPT chatbot to retain information from past conversations, allowing it to apply context from previous discussions to new chats. This could help the bot respond with more relevant, personalized replies. 
Microsoft's AI assistant Copilot has also received upgrades like improved models and image editing tools. Google updated its conversational AI too with a new mobile app and advanced model (now called Gemini). A new paid tier for Gemini Ultra provides developers and users access to more advanced features and capabilities for $20 per month.  Courtesy: OpenAI Apple's new AI model called MGIE allows editing images through natural language commands, performing tasks from color adjustments to complex manipulations. Interestingly, a new open-source AI model called Aya can converse in over 100 languages, potentially increasing access for many. Reka AI has introduced two new state-of-the-art AI models, Flash and Edge, which achieve top performance on language and vision tasks while Edge maintains a smaller size. Researchers from DeepMind and USC have developed a new technique called SELF-DISCOVER to improve LLMs’ reasoning abilities. A new open-source AI model (available for all) called Smaug-72B has achieved the top spot on the leaderboard for language models, demonstrating skills that surpass proprietary competitors.Courtesy: Apple NVIDIA's market value surpassed Amazon's for the first time since 2002 thanks to strong demand for its AI chips. The company also released a new chatbot called Chat with RTX allowing users to run personalized generative AI models locally on PCs. Chatbots aside, AI is making waves across fields. Conservationists are using AI to identify birds by their songs to help restore an English wetland. Scientists are speeding discoveries and tackling climate change better as new multimodal and smaller language models enhance technologies. That said, the USPTO has issued new guidance stating that while AI alone can't be named as an inventor on patents, humans can use AI in the invention process as long as they make a significant creative contribution.   🔮 Expert Insights from Packt Community LLMs Under the Hood – Building Models for Your Unique Use Cases [Video] - By Maxime Labonne, Denis Rothman, Abi Aryan This video course is an invaluable resource for AI developers looking to master the art of building enterprise-grade Large Language Models (LLMs). Here's why it's a must-watch:  Key Takeaways: 1. Expert-Led Guidance: Learn from industry experts Maxime Labonne, Dennis Rothman, and Abi Aryan, who bring a wealth of experience in LLM development. 2. End-to-End Coverage: Gain comprehensive insights into the entire LLM lifecycle, from architecture to deployment. 3. Advanced Skills Development: Acquire advanced skills to architect high-performing LLMs tailored to your specific business needs. 4. Hands-On Learning: Engage in practical exercises that reinforce key concepts and techniques for building, refining, and deploying LLMs. 5. Real-World Impact: Learn how to create LLMs that deliver tangible business value and solve complex organizational challenges. Course Highlights: - Making informed architecture decisions for optimal performance. - Selecting the right model types and configuring hyperparameters effectively. - Curating high-quality training data for better model outcomes. - Mastering pre-training, fine-tuning, and rigorous model evaluation techniques. - Strategies for smooth productionization, proactive monitoring, and post-deployment maintenance. By the end of this masterclass, you'll be equipped with the practical knowledge and skills needed to develop and deploy LLMs that drive real-world impact for your organization. 
Watch Here 🌟 Secret Knowledge: AI/LLM Resources🌀 Building a Scalable Foundation Model Platform for Your Enterprise: This post outlines how enterprises can provide different teams governed access to powerful foundation models through a centralized API layer. The solution described captures model usage and costs for each team to enable chargebacks. It also allows controlling access and throttling usage on a per-team basis. Building on serverless AWS services ensures the solution scales to meet demand. Whether you need transparent access for innovation or just want to understand how teams are leveraging AI, implementing a solution like this can help unlock the potential of generative AI for your whole organization.  🌀 Making Bridges Between AI and Business: This article discusses how businesses can develop an AI platform to integrate generative technologies like RAG and CRAG safely. It covers collecting data, querying knowledge bases, and using prompt engineering to guide AI models. The goal is to leverage AI's potential while avoiding risks through a blended strategy of retrieval and generation. This overview provides a solid foundation for aligning cutting-edge models with your organization's needs. 🌀 Evaluating Large Language Models: A Guide to Benchmark Tests: As AI language models become more advanced, it's important we have proper ways to assess their abilities. This article outlines several benchmark tests that evaluate language models on tasks like reasoning, code generation and more. Tests like WinoGrande, Hellaswag and GLUE provide insights into models' strengths and weaknesses. The benchmarks also allow for comparisons between different models. They give us a more complete picture of a model's skills. 🌀 Code Generation Gets Smarter with Context: Google's Codey APIs now enhance code completion and generation using Retrieval Augmented Generation, which retrieves relevant code from repositories to provide more accurate responses. This "RAG" technique allows LLMs to leverage external context. The blog post explores how RAG works and demonstrates its ability to inject appropriate code snippets. While not perfect, RAG is a useful tool to explore coding variations and adapt to custom styles when used with Codey on Vertex AI.   Partnering with Notion Ever tried Notion? It's a workspace that helps you do things better and faster.You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge.Give it a Try! 🔛 Masterclass: AI/LLM Tutorials🌀 Building a Question Answering Bot from Scratch: This tutorial shows you how to create a basic question answering bot by processing text from Wikipedia, generating embeddings with OpenAI, and storing the data in Momento Vector Index. It covers initializing clients, loading and chunking data, generating embeddings, indexing the embeddings, and searching to return answers. The bot is enhanced by using GPT-3 to provide concise responses instead of raw text. Following these steps will give you hands-on experience constructing a QA system from the ground up. 🌀 Creating SMS Apps with Next.js and AI Assistants: This article shows you how to build a texting app that uses Next.js for the frontend and backend. OpenAI's GPT-3 is utilized to generate meeting invite messages. Twilio handles sending the texts. React components collect invite details. API routes fetch GPT-3 responses and send data to Twilio. It's a clever way to enhance workflows with AI. 
🌀 Harness the Power of LLMs Without GPUs: Google's new localllm tool allows developers to run large language models locally using just a CPU, eliminating the need for expensive GPUs. With localllm and Cloud Workstations, you can build AI-powered apps right in your browser-based development environment. Quantized models optimize performance on CPUs while localllm handles downloading and running the models. The post provides instructions for setting up a Cloud Workstation with localllm pre-installed to get started with this new way to develop with LLMs. 🌀 Making the Switch to Open-Source Models: Hugging Face's new Messages API allows developers to easily transition chatbots and conversational models from OpenAI to open-source options like Mixtral. The API maintains compatibility with OpenAI libraries so your code doesn't need updating. You can also deploy these models to Hugging Face's Inference Endpoints and use them with frameworks like LangChain and LlamaIndex. This unlocks greater control, lower costs and more transparency compared to closed-source options. 🚀 HackHub: Trending AI Tools🌀 arplaboratory/learning-to-fly: Train end-to-end quadrotor control policies using deep reinforcement learning on a laptop in seconds 🌀 time-series-foundation-models/lag-llama: Open-source foundation model for probabilistic time series forecasting to perform zero-shot predictions on new time series data and eventually fine-tune for their specific forecasting needs 🌀 noahfarr/rlx: Implements reinforcement learning algorithms using Apple's MLX framework, making the code optimized to run on M-series chips 🌀 uclaml/SPIN: Implement Self-Play Fine-Tuning to enhance language models through self-supervised learning 🌀 phidatahq/phidata: Open-source toolkit for building AI assistants using function calling  Affiliate Disclosure: This newsletter contains affiliate links. If you buy through them, we may earn a small commission at no extra cost to you. This supports our work and helps us keep providing useful content. We only recommend products and services we think will benefit our readers. Thanks for your support! 

The Transformative Potential of Google Bard in Healthcare

Julian Melanson
20 Jun 2023
6 min read
The integration of artificial intelligence and cloud computing in healthcare has revolutionized various aspects of patient care. Telemedicine, in particular, has emerged as a valuable resource for remote care, and the Google Bard platform is at the forefront of making telemedicine more accessible and efficient. With its comprehensive suite of services, secure connections, and advanced AI capabilities, Google Bard is transforming healthcare delivery. This article explores the applications of AI in healthcare, the benefits of Google Bard in expanding telemedicine, its potential to reduce costs and improve access to care, and its impact on clinical decision-making and healthcare quality.AI in HealthcareArtificial intelligence has made significant strides in revolutionizing healthcare. It has proven to be instrumental in disease diagnosis, treatment protocol development, new drug discovery, personalized medicine, and healthcare analytics. AI algorithms can analyze vast amounts of medical data, detect patterns, and identify potential risks or effective treatments that may go unnoticed by human experts. This technology holds immense promise for improving patient outcomes and enhancing healthcare delivery.Understanding Google BardGoogle's AI chat service, Google Bard, now employs its latest large language model, PaLM 2, launched at Google I/O 2023. As a progression from the original PaLM released in April 2022, it elevates Bard's efficacy and performance. Initially, Bard operated on a scaled-down version of LaMDA for broader user accessibility with minimal computational needs. Leveraging Google's extensive database, Bard aims to provide precise text responses to a variety of inquiries, poised to become a dominant player in the AI chatbot arena. Despite its later launch than ChatGPT, Google's broader data access and Bard's lower computational requirements potentially position it as a more efficient and user-friendly contender.Expanding Telemedicine with Google Bard:Google Bard has the ability to play a pivotal role in expanding telemedicine and remote care. Its secure connections enable doctors to diagnose and treat patients without being physically present. Additionally, it provides easy access to patient records, and medical history, and facilitates seamless communication through appointment scheduling, messaging, and sharing medical images. These features empower doctors to provide better-informed care and enhance patient engagement.The Impact on Healthcare Efficiency and Patient Outcomes:Google Bard's integration of AI and machine learning capabilities elevate healthcare efficiency and patient outcomes. The platform's AI system quickly and accurately analyzes patient records, identifies patterns and trends, and aids medical professionals in developing effective treatment plans.For example, here is a prompt based on a sample HPI text:Image 1: Diagnosis from this clinical HPIHere is Google Bard’s response:Image 2: main problems of the Clinical HPIGoogle Bard effectively identified key diagnoses and the primary issue from the HPI sample text. 
Beyond simply listing diagnoses, it provided comprehensive details on the patient's primary concerns, aiding healthcare professionals in grasping the depth and complexity of the condition.When prompted further, Google Bard also suggested a treatment strategy:Image 3: Suggesting treatment planThis can ultimately assist medical practitioners in developing fitting interventions.Google Bard also serves as a safety net by detecting potential medical errors and alerting healthcare providers. Furthermore, Google Bard's user-friendly interface expedites access to patient records, enabling faster and more effective care delivery. The platform also grants medical professionals access to the latest medical research and clinical trials, ensuring they stay updated with advancements in healthcare. Ultimately, Google Bard's secure platform and powerful analytics tools contribute to better patient outcomes and informed decision-making.Reducing Healthcare Costs and Improving Access to CareOne of the key advantages of Google Bard lies in its potential to reduce healthcare costs and improve access to care. The platform's AI-based technology identifies cost-efficient treatment options, optimizes resource allocation, and enhances care coordination. By reducing wait times and unnecessary visits, Google Bard minimizes missed appointments, repeat visits, and missed diagnoses, resulting in lower costs and improved access to care. Additionally, the platform's comprehensive view of patient health, derived from aggregated data, enables more informed treatment decisions and fewer misdiagnoses. This integrated approach ensures better care outcomes while controlling costs.Supporting Clinical Decision-Making and Healthcare QualityGoogle Bard's launch signifies a milestone in the use of AI to improve healthcare. The platform provides healthcare providers with a suite of computational tools, empowering them to make more informed decisions and enhance the quality of care. Its ability to quickly analyze vast amounts of patient data enables the identification of risk factors, potential diagnoses, and treatment recommendations. Moreover, Google Bard supports collaboration and comparisons of treatment options among healthcare teams. By leveraging this technology, healthcare professionals can provide personalized care plans, improving outcomes and reducing medical errors. The platform's data-driven insights and analytics also support research and development efforts, allowing researchers to identify trends and patterns and develop innovative treatments.Google Bard, with its AI-driven capabilities and secure cloud-based platform, holds immense potential to revolutionize healthcare delivery. By enhancing efficiency, accessibility, and patient outcomes, it is poised to make a significant impact on the healthcare industry. The integration of AI, machine learning, and cloud computing in telemedicine enables more accurate diagnoses, faster treatments, and improved care coordination. Moreover, Google Bard's ability to reduce healthcare costs and improve access to care reinforces its value as a transformative tool. As the platform continues to evolve, it promises to shape the future of healthcare, empowering medical professionals and benefiting patients worldwide.SummaryGoogle Bard is revolutionizing healthcare with its practical applications. For instance, it enhances healthcare efficiency and improves patient outcomes through streamlined data management and analysis. 
Reducing administrative burdens and optimizing workflows, it reduces healthcare costs and ensures better access to care. Furthermore, it supports clinical decision-making by providing real-time insights, aiding healthcare professionals in delivering higher-quality care. Overall, Google Bard's transformative technology is reshaping healthcare, benefiting patients, providers, and the industry as a whole.Author BioJulian Melanson is one of the founders of Leap Year Learning. Leap Year Learning is a cutting-edge online school that specializes in teaching creative disciplines and integrating AI tools. We believe that creativity and AI are the keys to a successful future and our courses help equip students with the skills they need to succeed in a continuously evolving world. Our seasoned instructors bring real-world experience to the virtual classroom and our interactive lessons help students reinforce their learning with hands-on activities.No matter your background, from beginners to experts, hobbyists to professionals, Leap Year Learning is here to bring in the future of creativity, productivity, and learning!

Writing improvised unit test cases with GitHub Copilot

Mr. Avinash Navlani
08 Jun 2023
6 min read
In the last couple of years, AI and ML have provided various tools and methodologies to transform the working styles of software developers. GitHub Copilot is an AI-powered tool that offers code suggestions for rapid software development. GitHub Copilot can be a game changer for developers across the world, increasing productivity by recommending code snippets and auto-completions. In this tutorial, we will discuss unit tests, GitHub Copilot features, account creation, setting up VS Code, a JavaScript example, a Python example, a unit test example, and GitHub Copilot's advantages and disadvantages.
Set up your account at GitHub Copilot
Let's first create an account with GitHub Copilot and then use that account in an IDE such as VS Code. We can set up the account from the GitHub Copilot · Your AI pair programmer web page.
Image 1: GitHub Copilot · Your AI pair programmer
Click on the Get Copilot button and start your 30-day free trial; note that we need to enter credit/debit card or PayPal account details. After this, activate your GitHub Copilot account by following the instructions mentioned in the GitHub quickstart guide for individual or organization access.
Image 2: Activate GitHub Copilot in Settings
Setup in Visual Studio Code
The first step is to download and install Visual Studio Code from the official download page. Install the GitHub Copilot extension by following the instructions in the quickstart guide. We can directly search for ext install GitHub.copilot in Quick file navigation (shortcut key: Ctrl + P).
Image 3: Visual Studio Home
Log in with your GitHub credentials as mentioned in the quickstart guide.
Create a Python Function is_prime
In this section, we create the function is_prime with Copilot recommendations and see how it can help software developers in their coding journey. For executing any Python code, we need a Python file (with a .py extension). This file will help the interpreter identify that this is a Python script. First, we create a Python file (prime_number.py) from the File tab by selecting the new file option, or we can press Alt + Ctrl + N. After creating the Python script file, we will create the is_prime(number) function. The is_prime() function will check whether a number is prime or not. We can see in the below snapshot that Copilot also suggests the docstring for the function as per the name of the function:
Image 4: Doc-string suggestions from Copilot
After creating the file, we added the is_prime function, and while writing the logic for the is_prime() function, Copilot suggests an internal if condition inside the for loop of the is_prime() function. We can see that suggested if statement in the below snapshot.
Image 5: GitHub Copilot suggests an if statement
After suggesting the if statement, Copilot suggests the return value True, as shown in the below snapshot:
Image 6: GitHub Copilot suggests a return value
This is how we can write the Python function is_prime. We can also write similar functions as per your project requirements. Writing code with GitHub Copilot makes development much faster and more productive by providing code snippet suggestions and auto-completions.
Write Unit Tests for Python Function is_prime
Unit testing is a very important component of the coding module that builds confidence and faith in software developers and ensures bug-free and quality code. The main objective of this testing is to ensure the expected functionality at an individual and isolated component level.
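For readers following along without the snapshots, a representative reconstruction of the is_prime() function assembled from Copilot's suggestions is shown below. This is only an illustration of what the completed function typically looks like; your own Copilot suggestions may differ from session to session.
def is_prime(number):
    """Check whether a number is prime."""
    if number < 2:
        return False
    # Copilot typically suggests checking divisors inside a for loop
    for i in range(2, int(number ** 0.5) + 1):
        if number % i == 0:
            return False
    return True
The matching unit test function for this code is what we will build next.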
Let’s see how we can define them using the AI-powered code suggestion platform GitHub Copilot: First, we will create a function unit_test_prime_number()  for unit testing the is_prime() function. Here, we can see how Copilot suggests the doc-string in the below snapshot:  Image 7: Github Copilot suggests a doc-string for the unit_test method After writing the doc-string, we will add the assertion condition for the unit test. As we can see Copilot is auto-suggesting the assertion condition for the test in the below snapshot:  Image 8: Github Copilot suggests assertion condition Similarly, Copilot is suggesting the other assertion conditions with different possible inputs for testing the is_prime() function as shown below snapshot:  Image 9: Github Copilot suggests other assertion conditions This is how GitHub Copilot can help us in writing faster unit-test. It speeds up development and increases developer productivity through code suggestions and auto-completions.  Let’s see the final output in the below coding snapshot:  Image 10: A final program created with GitHub Copliot suggestions In the above snapshot, we can see the complete code for the is_prime() function and unit_test_prime_number(). Also, we can see the output of the unit_test_prime_number() function in the Terminal. Advantages and Disadvantages of the Copilot Unit Test  GitHub Copilot offers the following advantages and disadvantages: Improve productivity and reduce development time by offering efficient test cases. Developers can focus on more significant problems rather focus on small code snippets for unit testing. A novice coder can also easily utilize its capability to ensure quality development. Sometimes it may suggest irrelevant or vulnerable code because it doesn’t have proper context and user behavior. As a developer, we also need to consider how edge cases are covered in unit tests.  Sometimes it may lead to syntax errors, so we must verify the syntax. Summary GitHub Copilot revolutionized the way programmers write code. It offers AI-based code suggestions and auto-completions. Copilot boosts developer productivity and rapid application development by providing accurate, relevant, robust, and quality code. Also, sometimes it may cause problems in covering the edge cases. In the above tutorial, I have shared examples from JavaScript and Python languages. We can also try other languages such as TypeScript, Ruby, C#, C++, and Go. Reference GitHub Copilot Quick Start Guide: https://docs.github.com/en/copilot/quickstart Unit tests https://www.freecodecamp.org/news/how-to-write-unit-tests-for-python-functions/ Author BioAvinash Navlani has over 8 years of experience working in data science and AI. Currently, he is working as Sr. Data scientist, Improving products and services for customers by using advanced analytics, deploying big data analytical tools, creating and maintaining models, and onboarding compelling new datasets. Previously, he was a Lecturer at the university level, where he trained and educated people in Data science subjects such as python for analytics, data mining, machine learning, Database Management, and NoSQL. Avinash has been involved in research activities in Data science and has been a keynote speaker at many conferences in India.  LinkedIn   Python Data Analysis, Third Edition    

Streamlining Insights with Microsoft Copilot

Gus Frazer
04 Mar 2024
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Data Cleaning with Power BI, by Gus Frazer. Unlock the full potential of your data by mastering the art of cleaning, preparing, and transforming data with Power BI for smarter insights and data visualizationsIntroductionFor those who have never heard of Microsoft  Copilot, it is a new technology that Microsoft has released across a number of its platforms that combines generative AI with your data to enhance productivity. Copilot for Power BI harnesses cutting-edge generative AI alongside your dataset, revolutionizing the process of uncovering and disseminating insights with unprecedented speed. Seamlessly integrated into your workflow, Copilot offers an array of functionalities aimed at streamlining your reporting experience.When it comes to report creation, Copilot streamlines the process by allowing users to effortlessly generate reports by articulating the insights they seek or posing questions regarding their dataset using NLP. Copilot then analyzes the data, pulling together relevant information to craft visually striking reports, thereby transforming raw data into actionable insights instantaneously. Moreover, Copilot has the ability to read your data and suggest the best position to begin your analysis, which can then be tailored to suit the direction you want to take the analysis in.This is great, but how can it help you clean and prepare data for analysis? Well, Copilot can be leveraged on multiple data tools from within the Microsoft  Fabric platform. For those who are not aware, Power BI has now become part of the Fabric platform. Depending on what type of license you have for Power BI, you might already have access to this. Any customers with Premium capacity licensing for the Power BI service would have automatically been given access to Microsoft  Fabric, and more importantly, Copilot.That being said, currently, Copilot has only been made available to customers with a P1 (or above) Premium capacity or a Fabric license of F64 (or above), which is the equivalent licensing available directly from the Azure portal.If you would like to follow along with the next example, you will need to set up a Fabric capacity within your Azure portal. Don’t worry, you can pause this service when it’s not being used to ensure you are only charged for the time you’re using it. Alternatively, follow the steps to see the outcome:1. Log in to the Azure portal that you set up in the previous section of this chapter.2. Select the search bar at the top of the page and type in Microsoft Fabric. Select the service in the menu that appears below the search bar, which should take you to the page where you can manage your capacities.3. Select Create a Fabric capacity. Note that you will need to use an organizational account in order to create a Fabric capacity as opposed to a personal account. You can sign up for a Microsoft  Fabric trial for your organization within the window. Further details on how to do this are provided here: https://learn.microsoft.com/en-us/power-bi/ enterprise/service-admin-signing-up-for-power-bi-with-a-newoffice-365-trial.4. Select the subscription and resource group you would like to use for this Fabric capacity.5. Then, under capacity details, you can enter your capacity name. In this example, you can call it cleaningdata.6. 
The Region field should populate with the region of your tenant, but you can change this if you like. However, this may have implications on performance, which it should warn you about with a message.
7. Set the capacity to F64.
8. Then, select Review + create.
9. Review the terms and then click on Create, which will begin the deployment of your capacity.
10. Once deployed, select Go to resource to view your Fabric capacity. Take note that this will be active once deployed. Make sure to return here after testing to pause or delete your Fabric capacity to prevent yourself from getting charged for this service.
Now you will need to ensure you have activated the Copilot settings from within your Fabric capacity. To do this, go to https://app.powerbi.com/admin-portal/ to log in and access the admin portal.
Important tip
If you can't see the Tenant settings tab, then you will need to ensure you have been set up as an admin within your Microsoft 365 admin center. If you have just created a new account, then you will need to set this up. Follow the next links to assign roles:
• https://learn.microsoft.com/en-us/microsoft-365/admin/add-users/assign-admin-roles
• https://learn.microsoft.com/en-us/fabric/admin/microsoft-fabric-admin
11. Scroll to the bottom of Tenant settings until you see the Copilot and Azure OpenAI service (preview) section as shown:
Figure – The tenant settings from within Power BI
12. Ensure both settings are set to Enabled and then click on Apply.
Now that you have created your Fabric capacity, let's jump into an example of how we can use Copilot to help with the cleaning of data. As we have created a new capacity, you will have to create a new workspace that uses this new capacity:
1. Navigate back to Workspaces using the left navigation bar. Then, select New Workspace.
2. Name your workspace CleaningData(Copilot), then select the dropdown for advanced configuration settings.
3. Ensure you have selected Fabric capacity in the license mode, which in turn will have selected your capacity below, and then select Apply. You have now created your workspace!
4. Now let's use Fabric to create a new dataflow using the latest update of Dataflow Gen2. Select New from within the workspace and then select More options.
5. This will navigate you to a page with all the possible actions to create items within your Fabric workspace. Under Data Factory, select Dataflow Gen2.
6. This will load a Dataflow Gen2 instance called Dataflow 1. On the top row, you should now see the Copilot logo within the Home ribbon as highlighted:
Figure – The ribbon within a Dataflow Gen2 instance
7. Select Copilot to open the Copilot window on the right-hand side of the page. As you have not connected to any data, it will prompt you to select Get data.
8. Select Text/CSV and then enter the following into the File path or URL box:
https://raw.githubusercontent.com/PacktPublishing/Data-Cleaning-with-Power-BI/main/Retail%20Store%20Sales%20Data.csv
9. Leave the rest of the settings as their defaults and click on Next.
10. This will then open a preview of the file data. Click on Create to load this data into your Dataflow Gen2 instance. You will see that the Copilot window will have now changed to prompt you as to what you would like to do (if it hasn't, then simply close the Copilot window and reopen):
Figure – Data loaded into Dataflow Gen2
11. In this example, we can see that the data includes a column called Order Date, but we don't have a field for the fiscal year.
Enter the following prompt to ask Copilot to help with the transformation:There's a column in the data named Order Date, which shows when an order was placed. However, I need to create a new column from this that shows the Fiscal Year. Can you extract the year from the date and call this Fiscal Year? Set this new column to type number also.12. Proceed using the arrow key or press Enter. Copilot will then begin working on your request. As you will see in the resulting output, the model has added a function (or step) called Custom to the query that we had selected.13. Scroll to the far side and you will see that this has added a new column called Fiscal Year.14. Now add the following prompt to narrow down our data and press Enter:Can you now remove all columns leaving me with just Order ID, Order Date, Fiscal year, category, and Sales?15. This will then add another function or step called Choose columns. Finally, add the following prompt to aggregate this data and press Enter:Can you now group this data by Category, Fiscal year, and aggregated by Sum of Sales?As you can see, Copilot has now added another function called Custom 1 to the applied steps in this query, resulting in this table:Figure – The results from asking Copilot to transform the dataTo view the M query that  Copilot has added, select Advanced editor, which will show the functions that Copilot has added for you:Figure – The resulting M query created by Copilot to carry out the request transformations to clean the dataIn this example, you explored the new technologies available with Copilot and how they help to transform the data using tools such as Datafl ow Gen2.While it’s great to understand the amazing possibilities AI brings to data, it’s also crucially important that you understand the challenges it presents.ConclusionIn conclusion, Microsoft Copilot offers a groundbreaking approach to enhancing productivity and efficiency in data analysis and report generation within Power BI. By seamlessly integrating generative AI technology, Copilot revolutionizes the way insights are discovered and data is prepared, providing users with unprecedented speed and accuracy. Whether streamlining report creation or optimizing data management tasks, Copilot empowers users to unlock the full potential of their data, paving the way for more informed decision-making and actionable insights.Author BioGus Frazer is a seasoned Analytics Consultant focused on Business Intelligence solutions. With over 7 years of experience working for the two market-leading platforms, Power BI & Tableau, has amassed a wealth of knowledge and expertise. Gus has helped hundreds of customers to drive their digital and data transformations, scope data requirements, drive actionable insights, and most important of all, cleanse data ready for analysis. Most recently helping to set up, organize and run the Power BI UK community at Microsoft. He holds 6 Azure and Power BI certifications, including the PL-300 and DP-500 certifications. In this book, Gus offers readers invaluable guidance on ingesting, preparing, and cleansing data for analysis in Power BI. --This text refers to an out of print or unavailable edition of this title.

Get Started with Fabric: Create Your Workspace & Reports

Arshad Ali, Bradley Schacht
14 Mar 2024
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Learn Microsoft Fabric, by Arshad Ali, Bradley Schacht. Harness the power of Microsoft Fabric to develop data analytics solutions for various use cases guided by step-by-step instructionsIntroductionEmbark on a journey to harness the full potential of Microsoft Fabric within Power BI. This article serves as your comprehensive guide, walking you through the essential steps to create your first Fabric workspace seamlessly. From understanding the fundamentals to practical implementation, we'll equip you with the knowledge and tools needed to optimize your data management and reporting processes. Get ready to elevate your Power BI experience and unlock new possibilities with Fabric-enabled workspaces.Creating your first Fabric-enabled workspaceOnce you have confirmed that Fabric is enabled in your tenant and you have access to it, the next step is to create your Fabric workspace. You can think of a Fabric workspace as a logical container that will contain items such as lakehouses, warehouses, notebooks, and pipelines. Follow these steps to create your first Fabric workspace:1. Sign into Power BI (https://app.powerbi.com/).2. Select Workspaces | + New workspace:Figure 2.5 – Creating a new workspace3. Fill out the Create a workspace form, as follows:Name: Enter Learn Microsoft Fabric and some characters for uniqueness.Description: Optionally, enter a description for the workspace:Figure 2.6 – Create a workspace – detailsAdvanced: Select Fabric capacity under License mode and then choose a capacity you have access to. If not, you can start a trial license, as described earlier, and use it here.4. Select Apply. Th e workspace will be created and opened.5. You can click on Workspaces again and then search for your workspace by typing its name in the search box. You can also pin the selected workspace so that it always appears at the top:Figure 2.7 – Searching for a workspace6. Clicking on the name of the workspace will open that workspace. A link to it will become available in the left-hand side navigation bar, allowing you to switch from one item to another quickly. Since we haven’t created anything yet, there is nothing here. You can click on +New to start creating Fabric items:Figure 2.8 – Switching to a workspaceWith a Microsoft  Fabric workspace set up, let’s review the different workloads that are available.Copilot in Power BIPower BI has several key components, including data transformation and data modeling, culminating in a visual report that end users will consume. The Copilot experience is centered around the visual storytelling and reporting aspects of Power BI. This materializes in three ways: report page creation, narrative generation, and improving Q&A.Let’s look at each of these Copilot capabilities.Creating reports with the Power BI CopilotThe most common use for Copilot with Power BI is likely to be for creating reports. There are two features that come together to build reports. The first analyzes the dataset to suggest content for your report by using table relationships and column names, while the second one helps you create intuitive reports quickly. 
Figure 11.30 shows an example where Copilot has suggested several report pages, each with a short description of what would be displayed:Figure 11.30 – The Power BI Copilot page suggestionsIf you like the page suggestions, simply click on the Create button and the report page will appear.While a suggested set of report content is a good starting point, analysts often have a specific need to meet. You can have Copilot create a report from the criteria you provide using prompts as well. These can be as simple as “create a page that shows customer analysis” or more specific, such as “create a page to show the impact of each sales territory on profit and quantity sold.”Figure 11.31 – Sales impact report created by CopilotOnce the report page is generated, Copilot cannot update the report, but you can interact with and modify the report as necessary. This is a great way to reduce the time to get started building reports.A couple of other important things to note are that in addition to not being able to modify reports, Copilot will not allow you to specify specific visual types, apply filters, or change the report layout. All of these can be changed manually after the initial report generation. It is worth noting that users should not expect Copilot to filter results to a specific time period based on their prompt as an example.Next, let’s look at the smart narrative.Creating a narrative using CopilotVisuals are a wonderful way to tell a story and give users the ability to explore data on their own. However, sometimes a narrative that summarizes what is being displayed in a report can be useful. It can not only tell a story but also provide some additional context and information for users.To get started, open a report and add a narrative visualization to the report as shown in Figure 11.32. You will see two options; click on Copilot. Choose the type of summary you wish to produce and optionally select specific pages or visuals to include in the summary. Then click on Create.Figure 11.32 – The report narrative generated by CopilotAfter the narrative is generated, remember to always review the narrative for accuracy and adjust the prompt, if necessary, to produce more accurate results. In addition to summaries, you can ask it to highlight key information, customize the order in which the data is described to help convey importance, specify specific data points to include in the summary, and even generate impact analysis showing how different factors affect metrics on the report.Report, page, and visual narratives are a great way to guide users through a report, especially if there isn’t a subject matter expert there to explain all the data.Finally, let’s look at using Copilot to improve the Q&A visual.Generating synonyms with CopilotThe Q&A visual has been dazzling users for years at this point. It is impressive to build a model, walk into the room, and tell users that they can use natural language to query their data without needing to build any visuals. This may not be as impressive as the Copilot functionality that we have today, but it is still a very useful tool in your Power BI visualization toolbelt.One piece of important information for the success of Q&A is something called a synonym. These are end-user-specific ways to reference data. 
For example, a table in the data model may be called Dim Person, but you know that some report consumers always refer to these as “users.” Therefore, you would create a synonym that tells Q&A that when someone asks about users, they are really talking about persons. This can also be done on a column level. A synonym for “postal code” could be “zip code,” while a synonym for an “item” could be “product” or “finished good.”Q&A itself may not use Copilot, but Power BI Desktop can leverage Copilot to generate synonyms. This can be done when creating a new Q&A visual by clicking on Add synonyms from the ribbon with the label Improve Q&A with synonyms from Copilot. They can also be generated from the Q&A settings menu by adding Copilot as a source from the Suggestion settings list.The more synonyms that can be used to describe your data, the more likely you are to produce quality Q&A results. It is important to double-check the synonyms generated by Copilot to ensure they line up with your specific business terminology.With these Copilot experiences for Power BI, you will be able to generate report ideas, report pages and visuals, summaries, and narratives, and improve Q&A.ConclusionIn conclusion, by mastering the creation of Fabric workspaces in Power BI, you've laid a solid foundation for efficient data management and reporting. With Fabric's capabilities at your fingertips, you're equipped to streamline workflows, generate insightful reports, and enhance collaboration within your organization. Keep exploring the diverse functionalities of Fabric to continuously refine your Power BI experience and stay ahead in the realm of data analytics.Author bioArshad Ali is a principal product manager at Microsoft, working on the Microsoft Fabric product team in Redmond, WA. He focuses on Spark Runtime, which empowers both data engineering and data science experiences. In his previous role, he helped strategic customers and partners adopt Azure Synapse and Microsoft Fabric.Arshad has more than 20 years of industry experience and has been with Microsoft for over 16 years. He is the co-author of the book Big Data Analytics with Azure HDInsight and the author of over 200 technical articles and blogs on data and analytics. Arshad holds an MBA from the Foster School of Business at the University of Washington and an MCA from India.Bradley Schacht is a principal program manager on the Microsoft Fabric product team based in Saint Augustine, Florida. Bradley is a former consultant and trainer and has co-authored five books on SQL Server and Power BI. As a member of the Microsoft Fabric product team, Bradley works directly with customers to solve some of their most complex data problems and helps shape the future of Microsoft Fabric. Bradley gives back to the community by speaking at events, such as the PASS Summit, SQL Saturday, Code Camp, and user groups across the country, including locally at the Jacksonville SQL Server User Group (JSSUG). He is a contributor on SQLServerCentral and blogs on his personal site, BradleySchacht.

Generating Text Effects with Adobe Firefly

Joseph Labrecque
02 Jul 2023
9 min read
Adobe Firefly Text EffectsAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.  Image 1: Adobe FireflyOne of the more unique aspects of Firefly that sets it apart from other generative AI tools is Adobe’s exploration of procedures that go beyond prompt-based image generation. A good example of this is what is called Text Effects in Firefly.Text effects are also prompt-based… but use a scaffold determined by font choice and character set to constrain a generated set of styles to these letterforms. The styles themselves are based on user prompts – although there are other variants to consider as well.In the remainder of this article, we will focus on the text-to-image basics available in Firefly.Using Text Effects within FireflyAs mentioned in the introduction, we will continue our explorations of Adobe Firefly with the ability to generate stylized text effects from a text prompt. This is a bit different from the procedures that users might already be familiar with when dealing with generative AI – yet retains many similarities with such processes.When you first enter the Firefly web experience, you will be presented with the various workflows available.Image 2: Firefly modules can be either active and ready to work with or in explorationThese appear as UI cards and present a sample image, the name of the procedure, a procedure description, and either a button to begin the process or a label stating that it is “in exploration”. Those which are in exploration are not yet available to general users.We want to locate the Text Effects module and click Generate to enter the experience.Image 3: The Text effects module in FireflyFrom there, you’ll be taken to a view that showcases text styles generated through this process. At the bottom of this view is a unified set of inputs that prompt you to enter the text string you want to stylize… along with the invitation to enter a prompt to “describe the text effects you want to generate”.Image 4: The text-to-image prompt requests your input to beginIn the first part that reads Enter Text, I have entered the text characters “Packt”. For the second part of the input requesting a prompt, enter the following: “futuristic circuitry and neon lighting violet”Click the Generate button when complete. You’ll then be taken into the Firefly text effects experience.  Image 5: The initial set of four text effect variants is generated from your prompt with the characters entered used as a scaffoldWhen you enter the text effects module properly, you are presented in the main area with a preview of your input text which has been given a stylistic overlay generated from the descriptive prompt. Below this are a set of four variants, and below that are the text inputs that contain your text characters and the prompt itself.To the right of this are your controls. These are presented in a user-friendly way and allow you to make certain alterations to your text effects. We’ll explore these properties next to see how they can impact our text effect style.Exploring the Text Effect PropertiesAlong the right-hand side of the interface are properties that can be adjusted. The first section here includes a set of Sample prompts to try out.Image 6: A set of sample prompts with thumbnail displaysClicking on any of these sample thumbnails will execute the prompt attributed to it, overriding your original prompt. 
This can be useful for those new to prompt-building within Firefly to generate ideas for their own prompts and to witness the capabilities of the generative AI. Choosing the View All option will display even more prompts.Below the sample prompts, we have a very important adjustment that can be made in the form of Text effects fit.Image 7: Text effects fit determines how tight or loose the visuals are bound to the scaffoldThis section provides three separate options for you to choose from… Tight, Medium, or Loose. The default setting is Medium and choosing either of the other options will have the effect of either tightening up all the little visual tendrils that expand beyond the characters – or will let them loose, generating even more beyond the bounds of the scaffold.Let’s look at some examples with our current scaffold and prompt:Image 8: Tight - will keep everything bound within the scaffold of the chosen charactersImage 9: Medium - is the default and includes some additional visuals extending from the scaffoldImage 10: Loose - creates many visuals beyond the bounds of the scaffoldOne of the nice things about this set is that you can easily switch between them to compare the resulting images and make an informed decision.Next, we have the ability to choose a Font for the scaffold. There are currently a very limited set of fonts to use in Firefly. Similar to the sample prompts, choosing the View All option will display even more fonts.Image 11: The font selection propertiesWhen you choose a new font, it will regenerate the imagery in the main area of the Firefly interface as the scaffold must be rebuilt.I’ve chosen Source Sans 3 as the new typeface. The visual is automatically regenerated based on the new scaffold created from the character structure.Image 12: A new font is applied to our text and the effect is regeneratedThe final section along the right-hand side of the interface is for Color choices. We have options for Background Color and for Text Color. Image 13: Color choices are the final properties sectionThere are a very limited set of color swatches to choose from. The most important is whether you want to have the background of the generated image be transparent or not.Making Additional ChoicesOkay – we’ll now look to making final adjustments to the generated image and downloading the text effect image to our local computer. The first thing we’ll choose is a variant – which can be found beneath the main image preview. A set of 4 thumbnail previews are available to choose from.Image 14: Selecting from the presented variantsClicking on each will change the preview above it to reveal the full variant – as applied to your text effect.For instance, if I choose option #3 from the image above, the following changes would result:Image 15: A variant is selected and the image preview changes to matchOf course, if you do not like any of the alternatives, you can always choose the initial thumbnail to revert back.Once you have made the choice of variant, you can download the text effect as an image file to your local file system for use elsewhere. Hover over the large preview image and an options overlay appears.Image 16: A number of options appear in the hover overlay, including the download optionWe will explore these additional options in greater detail in a future article. 
Click the download icon to begin the download process for that image.As Firefly begins preparing the image for download, a small overlay dialog appears.Image 17: Content credentials are applied to the image as it is downloadedFirefly applies metadata to any generated image in the form of content credentials and the image download process begins.What are content credentials? They are driven as part of the Content Authenticity Initiative to help promote transparency in AI. This is how Adobe describes content credentials in their Firefly FAQ:Content Credentials are sets of editing, history, and attribution details associated with content that can be included with that content at export or download. By providing extra context around how a piece of content was produced, they can help content producers get credit and help people viewing the content make more informed trust decisions about it. Content Credentials can be viewed by anyone when their respective content is published to a supporting website or inspected with dedicated tools. -- AdobeOnce the image is downloaded, it can be viewed and shared just like any other image file.Image 18: The text effect image is downloaded and ready for useAlong with content credentials, a small badge is placed upon the lower right of the image which visually identifies the image as having been produced with Adobe Firefly (beta).There is a lot more Firefly can do, and we will continue this series in the coming weeks. Keep an eye out for an Adobe Firefly deep dive… exploring additional options for your generative AI creations!Author BioJoseph is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 

Implementing a RAG-enhanced CookBot - Part 2

Bahaaldine Azarmi, Jeff Vestal
17 Jan 2024
7 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Vector Search for Practitioners with Elastic, by Bahaaldine Azarmi and Jeff Vestal. Optimize your search capabilities in Elastic by operationalizing and fine-tuning vector search and enhance your search relevance while improving overall search performanceIntroductionEmbark on a culinary journey where innovation meets tradition, as we delve into the intricacies of recipe retrieval with the dynamic trio of BM25, ELSER, and RRF. In the ever-evolving landscape of cooking queries, this article unveils the power of classic full-text search, semantic understanding through ELSER, and robust rank fusion with RRF. Witness the synergy of these techniques as they seamlessly integrate into the RAG system, combining the efficiency of an Elasticsearch retriever with the creativity of the GPT-4 generator. Join us to explore how this transformative blend goes beyond keywords, shaping a personalized, flavorful experience for culinary enthusiasts.You can find Implementing a RAG-enhanced CookBot - Part 1 here.Building the retriever—RRF with ELSERIn the context of our recipe retrieval task, our goal is to maximize the relevance of the returned recipes based on a user’s query. We will utilize a combination of classic full-text search (via BM25), semantic search (with ELSER), and a robust rank fusion method (RRF). This combination allows us to handle more complex queries and return results that align closely with the user’s intent.Let’s consider the following query:GET recipes/_search { "_source": { "includes": [ "name", "ingredient" ] }, "sub_searches": [ { "query": { "bool": { "must": { "match": { "ingredient": "carrot beef" } }, "must_not": { "match": { "ingredient": "onion" } } } } }, { "query": { "text_expansion": { "ml.tokens": { "model_id": ".elser_model_1", "model_text": "I want a recipe from the US west coast with beef" } } } } ], "rank": { "rrf": { "window_size": 50, "rank_constant": 20 } } }This query includes two types of search. The first uses a classic Elasticsearch Boolean search to find recipes that contain both carrot and beef as ingredients, excluding those with onion. This traditional approach ensures that the most basic constraints of the user are met.The second sub_search employs ELSER to semantically expand the query I want a recipe from the US west coast with beef. ELSER interprets this request based on its understanding of language, enabling the system to match documents that may not contain the exact phrase but are contextually related. This allows the system to factor in the more nuanced preferences of the user.The query then employs RRF to combine the results of the two sub_searches. The window_ size parameter is set to 50, meaning the top 50 results from each sub-search are considered. The rank_constant parameter, set to 20, guides the RRF algorithm to fuse the scores from the two sub_searches.Thus, this query exemplifies the effective combination of BM25, ELSER, and RRF. Exploiting the strengths of each allows CookBot to move beyond simple keyword matches and provide more contextually relevant recipes, improving the user experience and increasing the system’s overall utility.Leveraging the retriever and implementing the generatorNow that we have our Elasticsearch retriever set up and ready to go, let’s proceed with the final part of our RAG system: the generator. 
In the context of our application, we'll use the GPT-4 model as the generator. We'll implement the generator in our recipe_generator.py module and then integrate it into our Streamlit application.

Building the generator

We will start by creating a RecipeGenerator class. This class is initialized with an OpenAI API key (find out how to get an OpenAI key at https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key), which is used to authenticate our requests with the GPT-4 model:

import openai
import json
from config import OPENAI_API_KEY

class RecipeGenerator:
    def __init__(self, api_key):
        self.api_key = api_key
        openai.api_key = self.api_key

Next, we define the generate function in the RecipeGenerator class. This function takes a recipe as input and sends it as a prompt to the GPT-4 model, asking it to generate a detailed, step-by-step guide.

    def generate(self, recipe):
        prompts = [{"role": "user", "content": json.dumps(recipe)}]
        instruction = {
            "role": "system",
            "content": "Take the recipes information and generate a recipe with a mouthwatering intro and a step by step guide."
        }
        prompts.append(instruction)
        generated_content = openai.ChatCompletion.create(
            model="gpt-4",
            messages=prompts,
            max_tokens=1000
        )
        return generated_content.choices[0].message.content

The prompts are formatted as required by the OpenAI API, and the max_tokens parameter is set to 1000 to limit the length of the generated text. The generated recipe is then returned by the function.

Integrating the generator into the Streamlit application

With our RecipeGenerator class ready, we can now integrate it into our Streamlit application in main.py. After importing the necessary modules and initializing the RecipeGenerator class, we will set up the user interface with a text input field:

import streamlit as st  # required for the st.* calls used below

from recipe_generator import RecipeGenerator
from config import OPENAI_API_KEY

generator = RecipeGenerator(OPENAI_API_KEY)
input_text = st.text_input(" ", placeholder="Ask me anything about cooking")

When the user enters a query, we will use the Elasticsearch retriever to get a relevant recipe. We then pass this recipe to the generate function of RecipeGenerator, and the resulting text is displayed in the Streamlit application (see a video example at https://www.linkedin.com/posts/bahaaldine_genai-gpt4-elasticsearch-activity-7091802199315394560-TkPY):

if input_text:
    query = {
        "sub_searches": [
            {
                "query": {
                    "bool": {
                        "must_not": [
                            {"match": {"ingredient": "onion"}}
                        ]
                    }
                }
            },
            {
                "query": {
                    "text_expansion": {
                        "ml.tokens": {
                            "model_id": ".elser_model_1",
                            "model_text": input_text
                        }
                    }
                }
            }
        ],
        "rank": {
            "rrf": {
                "window_size": 50,
                "rank_constant": 20
            }
        }
    }
    recipe = elasticsearch_query(query)
    st.write(recipe)
    st.write(generator.generate(recipe))

The generator thus works in tandem with the retriever to provide a detailed, step-by-step recipe based on the user's query. This completes our implementation of the RAG system in a Streamlit application, bridging the gap between retrieving relevant information and generating coherent, meaningful responses.
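Note that main.py calls an elasticsearch_query helper that this excerpt does not define. A minimal sketch of such a helper, assuming the official elasticsearch Python client, a locally running cluster, and the recipes index queried earlier, might look like the following; the host, index name, and result handling here are illustrative assumptions rather than the book's actual implementation:

from elasticsearch import Elasticsearch

# Assumed connection details; adjust to your own cluster and credentials.
es = Elasticsearch("http://localhost:9200")

def elasticsearch_query(query: dict, index: str = "recipes") -> dict:
    """Run the hybrid lexical/ELSER query with RRF and return the top-ranked recipe's fields."""
    response = es.search(index=index, body=query)
    hits = response["hits"]["hits"]
    # Return the best-ranked recipe, or an empty dict if nothing matched.
    return hits[0]["_source"] if hits else {}

In practice, you might return several hits and let the generator summarize across them, but a single top-ranked recipe keeps the prompt sent to GPT-4 small and focused.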
As we bid farewell to this exploration, it's evident that the RAG system, with its Elasticsearch retriever and GPT-4 generator, successfully bridges the gap between information retrieval and creative recipe generation. This synergistic blend not only meets user expectations but surpasses them, offering a harmonious fusion of precision and creativity in the realm of culinary exploration.Author BioBahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.

Generative AI in Natural Language Processing

Jyoti Pathak
22 Nov 2023
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionGenerative AI is a pinnacle achievement, particularly in the intricate domain of Natural Language Processing (NLP). As businesses and researchers delve deeper into machine intelligence, Generative AI in NLP emerges as a revolutionary force, transforming mere data into coherent, human-like language. This exploration into Generative AI's role in NLP unveils the intricate algorithms and neural networks that power this innovation, shedding light on its profound impact and real-world applications.Let us dissect the complexities of Generative AI in NLP and its pivotal role in shaping the future of intelligent communication.                                                                                                                   SourceWhat is generative AI in NLP?Generative AI in Natural Language Processing (NLP) is the technology that enables machines to generate human-like text or speech. Unlike traditional AI models that analyze and process existing data, generative models can create new content based on the patterns they learn from vast datasets. These models utilize advanced algorithms and neural networks, often employing architectures like Recurrent Neural Networks (RNNs) or Transformers, to understand the intricate structures of language.Generative AI models can produce coherent and contextually relevant text by comprehending context, grammar, and semantics. They are invaluable tools in various applications, from chatbots and content creation to language translation and code generation.Technical Marvel Behind Generative AIAt the heart of Generative AI in NLP lie advanced neural networks, such as Transformer architectures and Recurrent Neural Networks (RNNs). These networks are trained on massive text corpora, learning intricate language structures, grammar rules, and contextual relationships. Through techniques like attention mechanisms, Generative AI models can capture dependencies within words and generate text that flows naturally, mirroring the nuances of human communication.Practical Examples and Code Snippets:1. Text Generation with GPT-3OpenAI's GPT-3 (Generative Pre-trained Transformer 3) is a state-of-the-art generative language model. Using the OpenAI API, you can generate text with just a few lines of code.import openai openai.api_key = 'YOUR_API_KEY' prompt = "Once upon a time," response = openai.Completion.create( engine="text-davinci-003", prompt=prompt, max_tokens=150 ) generated_text = response.choices[0].text.strip() print("Generated Text: ", generated_text)2. Chatbot Development with RasaRasa is an open-source framework used for building conversational AI applications. It leverages generative models to create intelligent chatbots capable of engaging in dynamic conversations.from rasa.core.agent import Agent agent = Agent.load("models/dialogue", interpreter="models/nlu") user_input = input("User: ") responses = agent.handle_text(user_input) for response in responses:    print("Bot: ", response["text"])3. Language Translation with MarianMTMarianMT is a multilingual translation model provided by the Hugging Face Transformers library.from transformers import MarianTokenizer, MarianMTModel model_name = "Helsinki-NLP/opus-mt-en-de" tokenizer = MarianTokenizer.from_pretrained(model_name) model = MarianMTModel.from_pretrained(model_name) input_text = "Hello, how are you?" 
translated_text = model.generate(**tokenizer(input_text, return_tensors="pt", padding=True)) translated_text = tokenizer.batch_decode(translated_text, skip_special_tokens=True) print("Translated Text: ", translated_text)Generative AI in NLP enables these practical applications. It is a cornerstone for numerous other use cases, from content creation and language tutoring to sentiment analysis and personalized recommendations, making it a transformative force in artificial intelligence.How do Generative AI models help in NLP?                                                                                                                           SourceGenerative AI models, especially those powered by deep learning techniques like Recurrent Neural Networks (RNNs) and Transformer architectures, play a pivotal role in enhancing various aspects of NLP:1. Language TranslationGenerative AI models, such as OpenAI's GPT-3, have significantly improved machine translation. Training on multilingual datasets allows these models to translate text with remarkable accuracy from one language to another, enabling seamless communication across linguistic boundaries.Example Code Snippet:from transformers import MarianMTModel, MarianTokenizer model_name = "Helsinki-NLP/opus-mt-en-de" tokenizer = MarianTokenizer.from_pretrained(model_name) model = MarianMTModel.from_pretrained(model_name) input_text = "Hello, how are you?" translated_text = model.generate(**tokenizer(input_text, return_tensors="pt", padding=True)) translated_text = tokenizer.batch_decode(translated_text, skip_special_tokens=True) print("Translated Text: ", translated_text) translated_text)2. Content Creation Generative AI models assist in content creation by generating engaging articles, product descriptions, and creative writing pieces. Businesses leverage these models to automate content generation, saving time and resources while ensuring high-quality output.Example Code Snippet:from transformers import GPT2LMHeadModel, GPT2Tokenizer model_name = "gpt2" tokenizer = GPT2Tokenizer.from_pretrained(model_name) model = GPT2LMHeadModel.from_pretrained(model_name) input_prompt = "In a galaxy far, far away" generated_text = model.generate(tokenizer.encode(input_prompt, return_tensors="pt", max_length=150)) generated_text = tokenizer.decode(generated_text[0], skip_special_tokens=True) print("Generated Text: ", generated_text)Usage of Generative AIGenerative AI, with its remarkable ability to generate human-like text, finds diverse applications in the technical landscape. Let's delve into the technical nuances of how Generative AI can be harnessed across various domains, backed by practical examples and code snippets.●  Content CreationGenerative AI plays a pivotal role in automating content creation. By training models on vast datasets, businesses can generate high-quality articles, product descriptions, and creative pieces tailored to specific audiences. This is particularly useful for marketing campaigns and online platforms where engaging content is crucial.Example Code Snippet (Using OpenAI's GPT-3 API):import openai openai.api_key = 'YOUR_API_KEY' prompt = "Write a compelling introduction for a new tech product." 
response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
    max_tokens=150
)
generated_content = response.choices[0].text.strip()
print("Generated Content: ", generated_content)

Practical Usage: A marketing company employs Generative AI to draft product descriptions for an e-commerce website, tailoring each description based on product features and customer preferences.

●  Chatbots and Virtual Assistants
Generative AI empowers intelligent chatbots and virtual assistants, enabling natural and dynamic user conversations. These systems understand user queries and generate contextually relevant responses, enhancing customer support experiences and user engagement.

Example Code Snippet (Using Rasa Framework):

from rasa.core.agent import Agent

agent = Agent.load("models/dialogue", interpreter="models/nlu")
user_input = input("User: ")
responses = agent.handle_text(user_input)
for response in responses:
    print("Bot: ", response["text"])

Practical Usage: An online customer service platform integrates Generative AI-powered chatbots to handle customer inquiries, providing instant and accurate responses, improving user satisfaction, and reducing response time.

●  Code Generation
Generative AI assists developers by generating code snippets and completing lines of code. This accelerates the software development process, aiding programmers in writing efficient and error-free code. Note that CodeBERT (microsoft/codebert-base) is an encoder-only model and does not ship with a conditional-generation head, so a decoder-style code model such as CodeGen is used for the generation example below.

Example Code Snippet (Using a Hugging Face code generation model):

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Salesforce/codegen-350M-mono"  # decoder-style model trained for code generation
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

code_prompt = "# Sort a list in Python\ndef sort_list(items):"
inputs = tokenizer(code_prompt, return_tensors="pt")
generated_ids = model.generate(**inputs, max_length=150)
generated_code = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print("Generated Code: ", generated_code)

Practical Usage: A software development team integrates Generative AI into their IDE, allowing developers to receive instant code suggestions and completions, leading to faster coding, reduced errors, and streamlined collaboration.

●  Creative Writing and Storytelling
Generative AI fuels creativity by generating imaginative stories, poetry, and scripts. Authors and artists use these models to brainstorm ideas or overcome creative blocks, producing unique and inspiring content.

Example Code Snippet (Using GPT-3 API for Creative Writing):

import openai

openai.api_key = 'YOUR_API_KEY'
prompt = "Write a sci-fi story about space exploration."
response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
    max_tokens=500
)
creative_writing = response.choices[0].text.strip()
print("Creative Writing: ", creative_writing)

Practical Usage: A novelist utilizes Generative AI to generate plot outlines and character dialogues, providing creative prompts that serve as a foundation for developing engaging and unique stories.

Also, Generative AI models excel in language translation tasks, enabling seamless communication across diverse languages. These models accurately translate text, breaking down language barriers in global interactions.

Practical Usage: A language translation service utilizes Generative AI to translate legal documents, business contracts, or creative content from one language to another, ensuring accuracy and preserving the nuances of the original text.

Generative AI's technical prowess is reshaping how we interact with technology.
Its applications are vast and transformative, from enhancing customer experiences to aiding creative endeavors and optimizing development workflows. Stay tuned as this technology evolves, promising even more sophisticated and innovative use cases.Future of Generative AI in NLPAs Generative AI continues to evolve, the future holds limitless possibilities. Enhanced models, coupled with ethical considerations, will pave the way for applications in sentiment analysis, content summarization, and personalized user experiences. Integrating Generative AI with other emerging technologies like augmented reality and voice assistants will redefine the boundaries of human-machine interaction.ConclusionGenerative AI is a testament to the remarkable strides made in artificial intelligence. Its sophisticated algorithms and neural networks have paved the way for unprecedented advancements in language generation, enabling machines to comprehend context, nuance, and intricacies akin to human cognition. As industries embrace the transformative power of Generative AI, the boundaries of what devices can achieve in language processing continue to expand. This relentless pursuit of excellence in Generative AI enriches our understanding of human-machine interactions. It propels us toward a future where language, creativity, and technology converge seamlessly, defining a new era of unparalleled innovation and intelligent communication. As the fascinating journey of Generative AI in NLP unfolds, it promises a future where the limitless capabilities of artificial intelligence redefine the boundaries of human ingenuity.Author BioJyoti Pathak is a distinguished data analytics leader with a 15-year track record of driving digital innovation and substantial business growth. Her expertise lies in modernizing data systems, launching data platforms, and enhancing digital commerce through analytics. Celebrated with the "Data and Analytics Professional of the Year" award and named a Snowflake Data Superhero, she excels in creating data-driven organizational cultures.Her leadership extends to developing strong, diverse teams and strategically managing vendor relationships to boost profitability and expansion. Jyoti's work is characterized by a commitment to inclusivity and the strategic use of data to inform business decisions and drive progress.

Generative AI with Complementary AI Tools

Jyoti Pathak
07 Nov 2023
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionGenerative AI tools have emerged as a groundbreaking technology, paving the way for innovation and creativity across various domains. Understanding the nuances of generative AI and its integration with adaptive AI tools is essential in unlocking its full potential. Generative AI, a revolutionary concept, stands tall among these innovations, enabling machines not just to replicate patterns from existing data but to generate entirely new and creative content. Combined with complementary AI tools, this technology reaches new heights, reshaping industries and fueling unprecedented creativity.                                                                                                                                SourceConcept of Generative AI ToolsGenerative AI tools encompass artificial intelligence systems designed to produce new, original content based on patterns learned from existing data. These tools employ advanced algorithms such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) to create diverse outputs, including text, images, videos, and more. Their ability to generate novel content makes them invaluable in creative fields and scientific research.Difference between Generative AI and Adaptive AIWhile generative AI focuses on creating new content, adaptive AI adjusts its behavior based on the input it receives. Generative AI is about generating something new, whereas adaptive AI learns from interactions and refines its responses over time.Generative AI in ActionGenerative AI's essence lies in creating new, original content. Consider a practical example of image synthesis using a Generative Adversarial Network (GAN). GANs comprise a generator and a discriminator, engaged in a competitive game where the generator creates realistic images to deceive the discriminator. Here's a Python code snippet showcasing a basic GAN implementation using TensorFlow:import tensorflow as tf from tensorflow.keras.models import Sequential from tensorflow.keras.layers import Dense, Reshape from tensorflow.keras.layers import LeakyReLU # Define the generator model generator = Sequential() generator.add(Dense(128, input_shape=(100,))) generator.add(LeakyReLU(0.2)) generator.add(Reshape((28, 28, 1))) # Define the discriminator model (not shown for brevity) # Compile the generator generator.compile(loss='binary_crossentropy', optimizer='adam')In this code, the generator creates synthetic images based on random noise (a common practice in GANs). Through iterations, the generator refines its ability to produce images resembling the training data, showcasing the creative power of Generative AI.Adaptive AI in PersonalizationAdaptive AI, conversely, adapts to user interactions, providing tailored experiences. Let's explore a practical example of building a simple recommendation system using collaborative filtering, an adaptive AI technique. 
Here's a Python code snippet using the Surprise library for collaborative filtering:from surprise import Dataset from surprise import Reader from surprise.model_selection import train_test_split from surprise import SVD from surprise import accuracy # Load your data into Surprise Dataset reader = Reader(line_format='user item rating', sep=',') data = Dataset.load_from_file('path/to/your/data.csv', reader=reader) # Split data into train and test sets trainset, testset = train_test_split(data, test_size=0.2) # Build the SVD model (Matrix Factorization) model = SVD() model.fit(trainset) # Make predictions on the test set predictions = model.test(testset) # Evaluate the model accuracy.rmse(predictions)In this example, the Adaptive AI model learns from user ratings and adapts to predict new ratings. By tailoring recommendations based on individual preferences, Adaptive AI enhances user engagement and satisfaction.Generative AI sparks creativity, generating new content such as images, music, or text, as demonstrated through the GAN example. Adaptive AI, exemplified by collaborative filtering, adapts to user behavior, personalizing experiences and recommendations.By understanding and harnessing both Generative AI and Adaptive AI, developers can create innovative applications that not only generate original content but also adapt to users' needs, paving the way for more intelligent and user-friendly AI-driven solutions.Harnessing Artificial IntelligenceHarnessing AI involves leveraging its capabilities to address specific challenges or achieve particular goals. It requires integrating AI algorithms and tools into existing systems or developing new applications that utilize AI's power to enhance efficiency, accuracy, and creativity.Harnessing the power of Generative AI involves several essential steps, from selecting the suitable model to training and generating creative outputs. Here's a breakdown of the steps, along with code snippets using Python and popular machine-learning libraries like TensorFlow and PyTorch:Step 1: Choose a Generative AI ModelSelect an appropriate Generative AI model based on your specific task. Standard models include Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Transformers like OpenAI's GPT (Generative Pre-trained Transformer).Step 2: Prepare Your DataPrepare a dataset suitable for your task. For example, if you're generating images, ensure your dataset contains a diverse range of high-quality images. If you're generating text, organize your textual data appropriately.Step 3: Preprocess the DataPreprocess your data to make it suitable for training. This might involve resizing images, tokenizing text, or normalizing pixel values. Here's a code snippet demonstrating image preprocessing using TensorFlow:from tensorflow.keras.preprocessing.image import ImageDataGenerator # Image preprocessing datagen = ImageDataGenerator(rescale=1./255) train_generator = datagen.flow_from_directory(    'path/to/your/dataset',    target_size=(64, 64),    batch_size=32,    class_mode='binary' )Step 4: Build and Compile the Generative ModelBuild your Generative AI model using the chosen architecture. Compile the model with an appropriate loss function and optimizer. 
For example, here's a code snippet to create a basic generator model using TensorFlow:import tensorflow as tf from tensorflow.keras.models import Sequential from tensorflow.keras.layers import Dense, Reshape, LeakyReLU # Generator model generator = Sequential() generator.add(Dense(128, input_shape=(100,))) generator.add(LeakyReLU(0.2)) generator.add(Reshape((64, 64, 3)))  # Adjust dimensions based on your taskStep 5: Train the Generative ModelTrain your Generative AI model using the prepared dataset. Adjust the number of epochs, batch size, and other hyperparameters based on your specific task and dataset. Here's a code snippet demonstrating model training using TensorFlow:# Compile the generator model generator.compile(loss='mean_squared_error', optimizer='adam') # Train the generator generator.fit(train_generator, epochs=100, batch_size=32)Step 6: Generate Creative OutputsOnce your Generative AI model is trained, you can generate creative outputs. For images, you can generate new samples. For text, you can generate paragraphs or even entire articles. Here's a code snippet to generate images using the trained generator model:# Generate new images import matplotlib.pyplot as plt # Generate random noise as input random_noise = tf.random.normal(shape=[1, 100]) # Generate image generated_image = generator(random_noise, training=False) # Display the generated image plt.imshow(generated_image[0, :, :, :]) plt.axis('off') plt.show()By following these steps and adjusting the model architecture and hyperparameters according to your specific task, you can delve into the power of Generative AI to create diverse and creative outputs tailored to your requirements.                                                                                                                                 SourceExample of Generative AIOne prominent example of generative AI is DeepArt, an online platform that transforms photographs into artworks inspired by famous artists’ styles. DeepArt utilizes neural networks to analyze the input image and recreate it in the chosen artistic style, demonstrating the creative potential of generative AI.Positive Uses and Effects of Generative AIGenerative AI has found positive applications in various fields. In healthcare, it aids in medical image synthesis, generating detailed and accurate images for diagnostic purposes. In the entertainment industry, generative AI is utilized to create realistic special effects and animations, enhancing the overall viewing experience. Moreover, it facilitates rapid prototyping in product design, allowing for the generation of diverse design concepts efficiently.Most Used and Highly Valued Generative AIAmong the widely used generative AI technologies, OpenAI's GPT (Generative Pre-trained Transformer) stands out. Its versatility in generating human-like text has made it a cornerstone in natural language processing tasks. 
Regarding high valuation, NVIDIA's StyleGAN, a GAN-based model for developing lifelike images, has garnered significant recognition for its exceptional output quality and flexibility.

Code Examples:

To harness the power of generative AI with complementary AI tools, consider the following Python code snippet using TensorFlow:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Reshape
from tensorflow.keras.layers import LeakyReLU

# Define the generative model
generator = Sequential()
# The Dense layer must output 28 * 28 * 1 = 784 units so it can be reshaped below
generator.add(Dense(28 * 28 * 1, input_shape=(100,)))
generator.add(LeakyReLU(0.2))
generator.add(Reshape((28, 28, 1)))

# Compile the model
generator.compile(loss='binary_crossentropy', optimizer='adam')

# Generate synthetic data
random_noise = tf.random.normal(shape=[1, 100])
generated_image = generator(random_noise, training=False)

Conclusion

Generative AI, with its ability to create novel content coupled with adaptive AI, opens doors to unparalleled possibilities. By harnessing the power of these technologies and integrating them effectively, we can usher in a new era of innovation, creativity, and problem-solving across diverse industries. As we continue to explore and refine these techniques, the future holds endless opportunities for transformative applications in our rapidly advancing world.

Author Bio

Jyoti Pathak is a distinguished data analytics leader with a 15-year track record of driving digital innovation and substantial business growth. Her expertise lies in modernizing data systems, launching data platforms, and enhancing digital commerce through analytics. Celebrated with the "Data and Analytics Professional of the Year" award and named a Snowflake Data Superhero, she excels in creating data-driven organizational cultures.

Her leadership extends to developing strong, diverse teams and strategically managing vendor relationships to boost profitability and expansion. Jyoti's work is characterized by a commitment to inclusivity and the strategic use of data to inform business decisions and drive progress.

Generative AI and Data Storytelling

Ryan Goodman
01 Nov 2023
11 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionAs data volumes grow exponentially, converting vast amounts of data into digestible, human-friendly "information assets" has become a blend of art and science. This process requires understanding the data's context, the intended objective, the target audience. The final step of curating and converting information assets into a cohesive narrative that results in human understanding and persuasion is known as “Data Storytelling.”The Essence of Data StorytellingCrafting a compelling data story means setting a clear objective, curating information, and narrating relevant information assets. Good stories communicate effectively, leaving a lasting impression, and in some cases, sway opinions or reinforce previously accepted knowledge. A common tactic to tell an influential data story typically contains quantitative facts and statistics.Modern enterprises utilize a myriad of techniques to convert accumulated data into valuable information assets.  The vast amounts of analytical and qualitative information assets get distributed in the form of reports and dashboards. These assets are informational to support process automation, decision support, and facilitate future predictions.  Many curated stories manifest as summarized text and charts, often taking the form of a document, spreadsheet, or presentation.The Potential of Generative AIWhile analytics and machine learning are adept at tackling large-scale computational challenges, the growing demand for curated explanations has elevated data storytelling to an important discipline. However, the number of skilled data storytellers remains limited due to the unique blend of technical and soft skills required. This has led to a high demand for quality data stories and a scarcity of adept professionals. Generative AI provides a promising solution by automating many of the data storytelling steps and simulating human interpretation and comprehension.Crafting Dynamic Narratives with AIConstructing a coherent narrative from data is intricate. However, the emergence of powerful generative AI models, such as BARD and ChatGPT, offers hope. These platforms, initially popularized by potent large language models, have recently incorporated computer vision to interpret and generate visual data. Given the right context and curated data, these AI models seem ready to address several challenges of data storytelling.Presently, publicly available generative AI models can produce code-based data analysis, interpret and present findings, experiment with story structures, extract insights, and customize messages based on user personas. Generative AI isn’t adapted to prepare complete visual stories with quantitative data. Merging and chaining multiple AI agents will open the door to translating and delivering visual stories.The essence of data storytelling lies in narration, and occasionally, in persuasion or creative representation of information. Unmonitored nuances in AI-generated data narratives can be misleading or flat-out incorrect. There is no scenario where data storytelling should go un-aided by a human reviewer.How Generative AI Can Aid Traditional Descriptive Analytics StoriesVarious data analysis techniques, including scatter plots, correlation mattresses, and histograms, help unveil the relationships and patterns within the data. 
Identifying trends and visualizing the data distribution process, the exploratory data analysis technique guides in selecting the appropriate analysis method. Large language models are not designed to perform computational and quantitative analysis. However, they are quite proficient at generating coding capable of and translating questions to code into quantitative analysis only when context and data are well defined.Understanding Data ContextBefore choosing any technique to generate narratives, it's essential to understand the context of a dataset. Delving deep into domain-specific nuances ensures narratives that are both focused and relevant.● Preprocessing and Data CleaningClean and well-processed data form the bedrock of impactful analyses. Both human-built data stories and AI-augmented narratives falter with poor data. Advancements in data management, governance, and descriptive analytics are pivotal in paving the way for superior AI-enhanced stories. Proper preprocessing and quality data are prerequisites for crafting reliable narratives.● Exploratory Data Analysis TechniquesData exploration without context and understanding of the data is challenging. However, with a very good dictionary, business glossary, and good data large language models can easily write code capable of streamlining basic data exploration with a human in the loop steering the process.● Machine Learning with Advanced AnalyticsTo create dynamic and impactful narratives, it's vital to employ a range of advanced analytical methods, such as regression analysis and clustering. These techniques help unearth detailed insights. Similarly, Generative AI can efficiently conduct basic statistical analyses. It can also deploy machine learning algorithms like decision trees and random forests to discern intricate patterns. Weaving these insights into narrative techniques imparts a layer of sophistication and depth to the storytelling.● Interactive Data VisualizationInteractive visualization tools enable users to dynamically explore data. Traditionally, this area has been predominantly the domain of human analysis. While machines can interpret data patterns without visual aids, rapid advancements in computer vision and generative AI are bridging the gap. Today's AI can interpret visual representations of data, deriving meaning from visualizations. It's becoming increasingly common for dashboards to be reviewed and summarized by AI, a trend likely to continue.ExampleHere is a simple example in action where we have a data insight that we want to translate to a simple infographic:Data Insight: Our customer demographics include 45% millennials, 30% Gen X, and 25% baby boomers.Prompt: Create a visual representation to tell the story via wolframResult: The result with Wolram enabled is a conversion of my prompt and data insights into code that can be translated to a visual infographic REQUEST TO WOLFRAM{ "input": "BarChart[{45, 30, 25}, ChartLabels -> {\"Millennials\", \"Gen X\", \"Baby Boomers\"}, ChartStyle -> {RGBColor[0, 0.5, 1], RGBColor[0, 0.8, 0], RGBColor[1, 0.5, 0]}, PlotLabel -> \"Customer Demographics\", AxesLabel -> {\"Demographic\", \"Percentage\"}]" }RESPONSE FROM WOLFRAM{ "output": "https://www.wolframcloud.com/obj/86f3ba22-6a46-412d-88e1-b90064001407" }How Generative AI Can Aid Narrative CreationNatural language processing One of the techniques to generate dynamic narratives is the implementation of natural language processing. The technique helps in extracting insights from contextual data. 
Sometimes, sentiment analysis and topic modeling help in processing textual information while changing the outlook of unstructured data into valuable narratives. At the same time, integrating the same analysis can increase understanding of the narrative.Data storytellingAs a user, if you want to draft a compelling data story, you need to structure the narrative coherently. Starting with a captivating introduction accompanied by data-driven insights and their implications enhances the method. Such compelling storytelling ensures that the focal message is conveyed with clarity.Data Insight: Our website traffic increased by 25% after the recent marketing campaign. Sales of Product A saw a significant boost, contributing to 40% of total revenue.Prompt: Create a visual representation to tell the story via wolfram and a data-driven story based on the following insightsResult:REQUEST TO WOLFRAM:{ "input": "GraphicsRow[{BarChart[{100, 125}, ChartLabels -> {\"Before Campaign\", \"After Campaign\"}, ChartStyle -> {Blue, Green}, AxesLabel -> {\"\", \"Traffic\"}, PlotLabel -> \"Website Traffic\"], PieChart[{40, 60}, ChartLabels -> {\"Product A\", \"Other Products\"}, ChartStyle -> {Red, Yellow}, PlotLabel -> \"Revenue Contribution\"]}]" }RESPONSE FROM WOLFRAM:{ "output": "https://www.wolframcloud.com/obj/7221f424-771c-4a02-997b-f9bf8fade843" }Narrative: "After launching our recent marketing campaign, we witnessed a remarkable surge in our website traffic. The numbers speak for themselves: a significant 25% increase! This not only brought more visitors to our platform but also translated into tangible business outcomes. A standout performer was Product A, which became a major revenue driver. It alone accounted for a whopping 40% of our total revenue, highlighting its immense popularity among our customers. The campaign not only amplified our reach but also reinforced the market's love for Product A. It's evident that our marketing efforts and Product A together created a winning combination for our business."How can AI-aided storytelling drive operations efficiently?AI-aided storytelling methods don’t always require creating information assets from raw data. Generative AI can assist in translating previously created information assets or summaries into the new explorative analyses.Data Insight: In the last quarter, our sales increased by 20%. Customer engagement on social media platforms grew by 35%.Prompt: How can we summarize the information to tell a good data story? What are some additional insights we should investigate?Result from Large Language Model:Additional Insights to Investigate: 1. Source of Traffic and Sales: Which social media platforms contributed the most to the increased engagement? Are there specific platforms that are driving more sales compared to others? 2. Content Analysis: What type of content (videos, articles, images, etc.) received the most engagement? Were there specific campaigns or promotions that were particularly successful? 3. Customer Demographics: Has there been a change in the demographics of the customers who are purchasing? Are younger or older age groups more engaged on social media?From the above prompt and resulting information it is clear how generative AI helps in crafting the story and further exploring complex information relationships in ways that previously required expertise. 
As a result, large language models in the exploration phase can help:

●  Improve exploration ideas and hypotheses.
●  Create quick infographics.
●  Automate the manual process of narrative creation and summarization.

Insights of leveraging AI for more effective data communication

There can be various ways of leveraging artificial intelligence to incorporate data communication effectively. These are the top areas where generative AI can help boost the operation.

Streamlined Unstructured Data Analysis
Large language models are very good at ingesting and interpreting bodies of text and extracting summarized facts. The latest generative AI tools, including OpenAI's, now support interpreting large documents, including PDFs and spreadsheets as large as 100 MB. For example, you can potentially import an annual statement and summarize it. For large data sets representing many documents or gigabytes of data, there are a number of ingest, indexing, and large language model integration techniques to effectively expand the body of knowledge and context your model has to formulate data stories.

Advancing analytics
Democratization of data science is further expanded thanks to large language models' ability to interpret and write code, notably Python. Using a code interpreter, we can easily feed AI tens or hundreds of thousands of data points and predict the most predictive variables that determine a target variable. For example, after importing 50,000 records and feeding the following prompt, a modern AI model can automatically perform steps traditionally carried out by a data analyst or data scientist. To use this method successfully, you still need to understand and interpret the results and correct any miscalculations. For example, the AI model will not help you understand variables that overfit.

The steps we'll follow are:
1. Data Preprocessing:
   - Handle missing values.
   - Encode categorical variables.
   - Split the dataset into training and testing sets.
2. Train a model (e.g., random forest) using the data.
3. Extract feature importance to determine the most predictive variables.

Resulting code example:

# The import and the train/test splits below are assumed to come from the
# preprocessing step described above (scikit-learn, X_train/y_train, and the feature frame X).
from sklearn.ensemble import RandomForestClassifier

# Redefining the RandomForestClassifier
rf = RandomForestClassifier(n_estimators=100, random_state=42)

# Training a RandomForest model
rf.fit(X_train, y_train)

# Extracting feature importances
feature_importances = rf.feature_importances_
sorted_idx = feature_importances.argsort()[-10:][::-1]  # We'll display the top 10 features
top_features = X.columns[sorted_idx]
top_importances = feature_importances[sorted_idx]
top_features, top_importances

Resulting output:

Based on the Random Forest model, the top 10 most predictive variables for "Bad" are…

Conclusion

There are numerous opportunities to utilize generative AI to aid data storytelling. While the techniques covered in this article help streamline the process, there is no magic easy button to automate data storytelling end to end. In the world of data storytelling, AI is a perfect augmentation tool to craft a narrative, perform analysis, and inspire new ideas. However, the competitive and disruptive world of AI is already merging computer vision, code interpretation, and generative AI, further opening the door for new techniques. As images and text converge with code interpretation, we should see new distributive storytelling tools emerge. The most important ingredient is always the data itself.
I am excited to continue to explore and experiment with these advancements to streamline my data storytelling abilities.

Author Bio

Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI. There, he implemented a modern data stack, utilized data sciences, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.