
How-To Tutorials - ChatGPT

108 Articles
Greg Beaumont
02 Jun 2023
4 min read

Summarizing Data with OpenAI ChatGPT

This article is an excerpt from the book, Machine Learning with Microsoft Power BI, by Greg Beaumont. This book is designed for data scientists and BI professionals seeking to improve their existing solutions and workloads using AI.

In the ever-expanding landscape of data analysis, the ability to summarize vast amounts of information concisely and accurately is invaluable. Enter ChatGPT, an advanced AI language model developed by OpenAI. In this article, we delve into data summarization with ChatGPT, exploring how this tool can help distill complex datasets into concise and informative summaries.

Numerous databases feature free text fields that comprise entries from a diverse array of sources, including survey results, physician notes, feedback forms, and comments regarding incident reports for the FAA Wildlife Strike database that we have used in this book. These text entry fields represent a wide range of content, from structured data to unstructured data, making it challenging to extract meaning from them without the assistance of sophisticated natural language processing tools.

The Remarks field of the FAA Wildlife Strike database contains text that was presumably entered by people involved in filling out the incident form about an aircraft striking wildlife. A few examples of the remarks for recent entries are shown in Power BI in the following screenshot:

Figure 1 – Examples of Remarks from the FAA Wildlife Strike Database

You will notice that the remarks vary a great deal in the format of the content, the length of the content, and the acronyms that were used.
Testing one of the entries by simply adding the instruction “Summarize the following:” at the beginning yields the following result:

Figure 2 – Summarizing the remarks for a single incident using ChatGPT

Summarizing data for a less detailed Remarks field yields the following result:

Figure 3 – Summarization of a sparsely populated results field

To obtain uniform summaries from the Remarks field of the FAA Wildlife Strike data, one must account for entries that vary in robustness, sparsity, completeness of sentences, and the presence of acronyms and quick notes. The workshop accompanying this book is your chance to experiment with various data fields and explore diverse outcomes. Both the book and the Packt GitHub site use a standardized format as input to a GPT model that can incorporate event data and produce a consistent summary for each row. An example of the format is as follows:

    Summarize the following in three sentences: A [Operator] [Aircraft] struck a [Species]. Remarks on the FAA report were: [Remarks].

A test of this approach in OpenAI ChatGPT, using data from an FAA Wildlife Strike Database event, is shown in the following screenshot:

Figure 4 – OpenAI ChatGPT testing a summarization of the remarks field

Next, you test another scenario that had more robust text in the Remarks field:

Figure 5 – Another scenario with robust remarks tested using OpenAI ChatGPT

Summary

This article explores how ChatGPT can help condense complex datasets into concise and informative summaries. By leveraging its language generation capabilities, ChatGPT enables researchers, analysts, and decision-makers to quickly extract key insights and make informed decisions. Dive into data summarization with ChatGPT and unlock new possibilities for efficient data analysis and knowledge extraction.
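The standardized prompt format described in this article can be filled in programmatically for each row. The following is a minimal sketch, not code from the book: the function name `build_summary_prompt` and the example row are hypothetical and invented for illustration, and only the template wording comes from the article.

```python
def build_summary_prompt(operator: str, aircraft: str, species: str, remarks: str) -> str:
    """Fill the article's summarization template with one incident's values."""
    # Drop a trailing period from the remarks so the template's own period
    # does not produce a double "..".
    remarks = remarks.strip().rstrip(".")
    return (
        "Summarize the following in three sentences: "
        f"A {operator} {aircraft} struck a {species}. "
        f"Remarks on the FAA report were: {remarks}."
    )


# Hypothetical row, invented for illustration (not taken from the FAA database).
example_row = {
    "operator": "Example Air",
    "aircraft": "B-737",
    "species": "gull",
    "remarks": "Struck on landing roll. No damage.",
}

prompt = build_summary_prompt(**example_row)
print(prompt)
```

The resulting string can be pasted into ChatGPT or sent to a GPT model by whatever means your environment provides. Keeping the template in one function means every row is summarized from the same instruction.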
Author Bio: Greg Beaumont is a Data Architect at Microsoft and an expert in solving complex problems and creating value for customers. With a focus on the healthcare industry, Greg works closely with customers to plan enterprise analytics strategies, evaluate new tools and products, conduct training sessions and hackathons, and architect solutions that improve the quality of care and reduce costs. With years of experience in data architecture and a passion for innovation, Greg has a unique ability to identify and solve complex challenges. He is a trusted advisor to his customers and is always seeking new ways to drive progress and help organizations thrive. For more than 15 years, Greg has worked with healthcare customers who strive to improve patient outcomes and find opportunities for efficiencies. He is a veteran of the Microsoft data speaker network and has worked with hundreds of customers on their data management and analytics strategies. You can follow Greg on LinkedIn.

Sagar Lad
02 Jun 2023
5 min read

Data Cleaning Made Easy with ChatGPT

Identifying inconsistencies and inaccuracies in the data is a vital part of the data analysis process. ChatGPT is an AI-powered natural language processing tool that lets users hold human-like conversations and helps them complete tasks quickly. In this article, we'll focus on how ChatGPT can make data cleansing and cleaning more efficient.

Data Cleansing/Cleaning with ChatGPT

Given the volume, velocity, and variety of data we deal with nowadays, carrying out data cleansing manually is very time-consuming. Data cleansing covers steps such as removing duplicate data and checking validity, uniqueness, consistency, and correctness, all taken to increase the quality of the data. Cleansed data yields better business insights and helps business users make sound decisions. Data cleansing goes through a series of steps, starting with gathering the data and ending with integrating, producing, and normalizing it, as shown in the image below:

Image 1: Data cleansing cycle

Most organizations carry out the following tasks as part of data cleansing during exploratory data analysis:

  • Identifying and cleaning up duplicate values
  • Filling null values with a default value
  • Rectifying and correcting inconsistent data
  • Standardizing date formats
  • Standardizing names and addresses
  • Taking area codes out of phone numbers
  • Flattening nested data structures
  • Erasing incomplete data
  • Detecting conflicts in the database

ChatGPT lets us perform time-consuming and tedious tasks like data cleansing with ease. To understand this better, let's use the example of employee details in the banking industry, with the columns Employee ID, Employee Name, Department Name, and Joining Date. While reviewing the data, we discovered a number of data quality concerns that must be resolved before we can use this data for analytics.
Example:

  • Employee Name is inconsistent: some entries use lowercase letters while others use uppercase.
  • The date format is not uniform in the Joining Date column.

Traditional Way of Working

To clean up this data in Excel, we must manually construct formulas and apply functions like TRIM, UPPER, or LOWER before using the data for analytics. This calls for development work and for maintaining the Excel logic without version control or history. Sounds extremely tedious, doesn't it?

Working with ChatGPT

We can use ChatGPT to automate this data cleansing operation by having it generate Python code. In this example, we'll use Python code from ChatGPT to standardize the employee names and the date format of the joining date.

ChatGPT prompt:

Here is the prompt in text format, in case you plan to copy and paste:

    Employee ID | Employee Name | Department Name | Joining Date
    214         | john Root     | HR              | 1-06-2003
    435         | STEVE Smith   | Retail          | 21-Feb-05
    654         | Sachin WALA   | OPSI            | 25-July-1999

    Above is the employee data source which should be cleaned. Employee names are not consistent, and the joining date is not in a uniform date format. Generate a Python code to create accurate data.

Image 2: Input to ChatGPT

As seen in the image above, we pass a dataset and a description of how, and for which columns, we want to clean the data.

Output from ChatGPT

ChatGPT automatically creates Python code with a variety of generic functions to clean the specified columns in accordance with our specifications. The Python code output by ChatGPT is shown below.

Image 3: Output Python code from ChatGPT

Alongside the generated Python code, ChatGPT also displays a sample result of running it on the stated data.
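Since the generated code appears only as an image, here is a minimal sketch of what cleaning code for this prompt could look like. It is not ChatGPT's actual output: the helper names `clean_name` and `clean_date`, the ISO output format, and the assumption that ambiguous numeric dates such as 1-06-2003 are day-first are all choices made for illustration. It uses only the Python standard library.

```python
from datetime import datetime

# Date formats seen in the sample rows. Day-first ordering is an assumption.
# Month names are parsed with the default (English) locale.
DATE_FORMATS = ("%d-%m-%Y", "%d-%b-%y", "%d-%b-%Y", "%d-%B-%y", "%d-%B-%Y")


def clean_name(name: str) -> str:
    """Capitalize each part of a name and collapse extra whitespace."""
    return " ".join(part.capitalize() for part in name.split())


def clean_date(raw: str) -> str:
    """Try each known format and return the date as YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"Unrecognized date format: {raw!r}")


# The three rows from the prompt above.
employees = [
    (214, "john Root", "HR", "1-06-2003"),
    (435, "STEVE Smith", "Retail", "21-Feb-05"),
    (654, "Sachin WALA", "OPSI", "25-July-1999"),
]

cleaned = [
    (emp_id, clean_name(name), dept, clean_date(joined))
    for emp_id, name, dept, joined in employees
]

for row in cleaned:
    print(row)
```

Raising an error on an unrecognized date, rather than guessing, keeps bad values visible so a new format can be added to `DATE_FORMATS` deliberately.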
It is clear that the employee names are now uniform and that the joining date is shown in a common date format.

Image 4: Sample output from ChatGPT

This Python code can be reused to clean other data sources in the future, not just the employee dataset. Using ChatGPT's capabilities, we can therefore develop a data cleaning process that is precise, effective, and fully automated.

There are also tools on the market, such as RATH, which integrates with ChatGPT, that simplify the data analysis workflow and increase your productivity without a lot of manual work. They can help if you are dealing with a large volume of data and would otherwise spend a lot of time on data cleaning.

Conclusion

This article gave you a fundamental grasp of the data cleaning/cleansing procedure, which will enable you to use the data to make more trustworthy decisions. It also showed how ChatGPT can be used to clean your data simply and effectively, whatever the data volume.

Author Bio: Sagar Lad is a Cloud Data Solution Architect with a leading organisation and has deep expertise in designing and building Enterprise-grade Intelligent Azure Data and Analytics Solutions. He is a published author, content writer, Microsoft Certified Trainer, and C# Corner MVP. You can follow Sagar on Medium, Amazon, and LinkedIn.

Valentina Alto
02 Jun 2023
2 min read

ChatGPT for Information Retrieval and Competitive Intelligence

This article is an excerpt from the book Modern Generative AI with ChatGPT and OpenAI Models, by Valentina Alto. This book will provide you with insights into the inner workings of LLMs and guide you through creating your own language models.

Information retrieval and competitive intelligence are fields where ChatGPT is a game-changer. It can retrieve information from its knowledge base and reframe it in an original way.

One example is using ChatGPT as a search engine to provide summaries, reviews, and recommendations for books. Alternatively, we could ask for suggestions for a new book to read based on our preferences.

If we design the prompt with specific information, ChatGPT can serve as a tool for pointing us towards the right references for research or studies, for example by asking it to list relevant references for feedforward neural networks.

ChatGPT can also be useful for competitive intelligence, for example by generating a list of existing books with similar content or by providing advice on how to be competitive in the market. It can also suggest improvements to a book's content to make it stand out.

Overall, ChatGPT can be a valuable assistant for information retrieval and competitive intelligence. However, it's important to remember that the knowledge base cutoff is 2021, so real-time information may not be available.

About the Author

Valentina Alto graduated in Data Science in 2021. She has worked at Microsoft as an Azure Solution Specialist since 2020 and, since 2022, has focused on Data & AI workloads within the Manufacturing and Pharmaceutical industry. She works closely with system integrators on customers' projects to deploy cloud architecture with a focus on data lakehouse and DWH, data integration and engineering, IoT and real-time analytics, Azure Machine Learning, Azure cognitive services (including Azure OpenAI Service), and Power BI for dashboarding.

She holds a BSc in Finance and an MSc in Data Science from Bocconi University, Milan, Italy. Since her academic journey, she has been writing tech articles about Statistics, Machine Learning, Deep Learning, and AI for various publications. She has also written a book about the fundamentals of Machine Learning with Python. You can connect with Valentina on LinkedIn and Medium.