You're reading from Python Real-World Projects

Product type: Book

Published in: Sep 2023

Publisher: Packt

ISBN-13: 9781803246765

Edition: 1st Edition

Concepts: Programming Language

Author (1)

Steven F. Lott

Chapter 17
Next Steps

The journey from raw data to useful information has only begun. There are often many more steps to getting insights that can be used to support enterprise decision-making. From here, the reader needs to take the initiative to extend these projects, or consider other projects. Some readers will want to demonstrate their grasp of Python while others will go more deeply into the area of exploratory data analysis.

Python is used for so many different things that it seems difficult to even suggest a direction for deeper understanding of the language, the libraries, and the various ways Python is used.

In this chapter, we’ll touch on a few more topics related to exploratory data analysis. The projects in this book are only a tiny fraction of the kinds of problems that need to be solved on a daily basis.

Every analyst needs to balance the time between understanding the enterprise data being processed, searching for better ways to model the data, and effective...

17.1 Overall data wrangling

The applications and notebooks are designed around the following multi-stage architecture:

Data acquisition
Inspection of data
Cleaning data; this includes validating, converting, standardizing, and saving intermediate results
Summarizing, and the start of modeling data
Creating deeper analysis and more sophisticated statistical models

The stages fit together as shown in Figure 17.1.
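As a rough illustration of how the stages can be chained, the following minimal sketch treats each stage as a plain function. The function names, the sample rows, and the use of an arithmetic mean as the summary are all assumptions made for this example; they are not taken from the book's projects.

```python
from collections.abc import Iterable, Iterator
from statistics import mean

RawRow = dict[str, str]
CleanRow = dict[str, float]


def acquire() -> Iterator[RawRow]:
    """Acquisition: a stand-in for reading a file, database, or web service."""
    yield from [
        {"height": "61.5", "weight": "120"},
        {"height": "n/a", "weight": "135"},  # hypothetical bad value
        {"height": "70.0", "weight": "180"},
    ]


def inspect(rows: Iterable[RawRow]) -> dict[str, int]:
    """Inspection: count the values in each column that are not numeric."""
    problems: dict[str, int] = {}
    for row in rows:
        for name, value in row.items():
            try:
                float(value)
            except ValueError:
                problems[name] = problems.get(name, 0) + 1
    return problems


def clean(rows: Iterable[RawRow]) -> Iterator[CleanRow]:
    """Cleaning: convert text to numbers, dropping rows that fail to convert."""
    for row in rows:
        try:
            yield {name: float(value) for name, value in row.items()}
        except ValueError:
            continue


def summarize(rows: Iterable[CleanRow]) -> dict[str, float]:
    """Summarizing: compute the mean of each column of the cleaned rows."""
    materialized = list(rows)
    return {name: mean(r[name] for r in materialized) for name in materialized[0]}


raw = list(acquire())
inspection = inspect(raw)
summary = summarize(clean(raw))
```

Keeping each stage as a separate function means intermediate results can be saved or examined between stages, which is the point of the multi-stage design.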

The last step in this pipeline is not, of course, final. In many cases, the project evolves from exploration to monitoring and maintenance. There will be a long tail where the model continues to be confirmed. Some enterprise management oversight is an essential part of this ongoing confirmation.

In some cases, the long tail is interrupted by a change. This may be reflected by a model’s inaccuracy. There may be a failure to pass basic statistical tests. Uncovering the change and the reasons for change is...
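One way to notice a model's growing inaccuracy is to compare its error on recent data with its error on the data used to confirm it originally. The sketch below is a crude threshold check, not a formal statistical test; the function names and the `tolerance` factor of 2.0 are assumptions chosen only for illustration.

```python
from collections.abc import Callable
from statistics import mean

Pair = tuple[float, float]


def mean_abs_error(model: Callable[[float], float], pairs: list[Pair]) -> float:
    """Average absolute difference between observed y and the model's prediction."""
    return mean(abs(y - model(x)) for x, y in pairs)


def has_drifted(
    model: Callable[[float], float],
    baseline: list[Pair],
    recent: list[Pair],
    tolerance: float = 2.0,  # hypothetical threshold; tune for the real problem
) -> bool:
    """Flag the model when recent error exceeds tolerance times the baseline error."""
    return mean_abs_error(model, recent) > tolerance * mean_abs_error(model, baseline)


def model(x: float) -> float:
    """A previously fitted model, assumed here to be y = 2x + 1."""
    return 2 * x + 1


baseline = [(0, 1.1), (1, 2.9), (2, 5.1)]
recent_ok = [(3, 7.1), (4, 8.9)]
recent_bad = [(3, 8.0), (4, 10.5)]
```

A flag raised by a check like this only says that something changed; finding the reason still requires investigation.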

17.2 The concept of “decision support”

The core concept behind all data processing, including analytics and modeling, is to help some person make a decision. Ideally, a good decision will be based on sound data.

In many cases, decisions are made by software. Sometimes the decisions are simple rules that identify bad data, incomplete processes, or invalid actions. In other cases, the decisions are more nuanced, and we apply the term “artificial intelligence” to the software making the decision.
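The "simple rules" kind of automated decision can be sketched as a list of named predicates applied to each row. The rule names and the `height` and `weight` fields below are hypothetical; a real application would draw its rules from the enterprise's own definitions of bad data.

```python
from typing import Callable

Row = dict[str, float]
Rule = tuple[str, Callable[[Row], bool]]

# Each rule pairs a label with a predicate that is True when the row is bad.
RULES: list[Rule] = [
    ("missing height", lambda row: "height" not in row),
    ("missing or non-positive weight", lambda row: row.get("weight", 0) <= 0),
]


def decide(row: Row) -> list[str]:
    """Return the label of every rule the row violates; an empty list means accept."""
    return [label for label, is_violated in RULES if is_violated(row)]


accepted = decide({"height": 61.5, "weight": 120.0})
rejected = decide({"weight": -1.0})
```

Returning the labels, rather than a bare True or False, records why each decision was made.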

While many kinds of software applications make many automated decisions, a person is still ultimately responsible for those decisions being correct and consistent. This responsibility may be implemented as a person reviewing a periodic summary of decisions made.
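A periodic summary of this kind can be as simple as a count of each outcome. The sketch below assumes the application logs one outcome label per decision; the labels themselves are invented for the example.

```python
from collections import Counter


def summarize_decisions(decisions: list[str]) -> Counter[str]:
    """Count how many times each outcome label appears in the decision log."""
    return Counter(decisions)


# Hypothetical log of automated decisions for one reporting period.
decisions = [
    "accept",
    "reject: bad data",
    "accept",
    "reject: incomplete",
    "accept",
]

summary = summarize_decisions(decisions)
for outcome, count in summary.most_common():
    # Show each outcome with its count and its share of all decisions.
    print(f"{outcome:20s} {count:3d} {count / len(decisions):6.1%}")
```

A reviewer can scan a report like this for unexpected shifts in the proportion of each kind of decision.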

This responsible stakeholder needs to understand the number and types of decisions being made by application software. They need to confirm the automated decisions reflect sound data as well...

17.3 Concept of metadata and provenance

The description of a dataset includes three important aspects:

The syntax or physical format and logical layout of the data
The semantics, or meaning, of the data
The provenance, or the origin and transformations applied to the data

The physical format of a dataset is often summarized using the name of a well-known file format. For example, the data may be in CSV format. The order of columns in a CSV file may change, leading to a need to have headings or some metadata describing the logical layout of the columns within a CSV file.

Much of this information can be enumerated in JSON schema definitions.
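As an illustration, a JSON schema for one row of data might look like the following, written here as a Python dictionary. The `height`, `weight`, and `price` properties reuse the column names from this section's sample metadata; the title and the `check_row()` helper are assumptions. The helper checks only the `required` and `type: number` keywords by hand, so it is a small subset of what a full JSON Schema validator does.

```python
import json

ROW_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Measurement",  # hypothetical title
    "type": "object",
    "properties": {
        "height": {"type": "number", "description": "height in inches"},
        "weight": {"type": "number", "description": "weight in pounds"},
        "price": {"type": "number", "description": "price in dollars"},
    },
    "required": ["height", "weight", "price"],
}


def check_row(row: dict, schema: dict) -> list[str]:
    """Hand-rolled check of required names and numeric types only."""
    problems = [f"missing {name}" for name in schema["required"] if name not in row]
    for name, spec in schema["properties"].items():
        if name in row and spec["type"] == "number":
            value = row[name]
            # bool is a subclass of int in Python, but is not a JSON number.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name} is not a number")
    return problems


schema_text = json.dumps(ROW_SCHEMA, indent=2)  # the form that would be saved
good = check_row({"height": 61.5, "weight": 120, "price": 9.99}, ROW_SCHEMA)
bad = check_row({"height": "tall", "weight": 120}, ROW_SCHEMA)
```

The schema captures both syntax (the types) and a little of the semantics (the descriptions) in a single document that can be stored alongside the data.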

In some cases, the metadata might be yet another CSV file that has column numbers, preferred data types, and column names. We might have a secondary CSV file that looks like the following example:

1,height,height in inches
2,weight,weight in pounds
3,price,price in dollars

This metadata information describes the contents of a separate CSV file with...
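A minimal sketch of using such a metadata file follows. It assumes the first field is a 1-based column position, the second is the column name, and the third is a description; it also assumes the data file has no heading row. The data values are invented for the example.

```python
import csv
import io

METADATA = """\
1,height,height in inches
2,weight,weight in pounds
3,price,price in dollars
"""

# Hypothetical data file with no heading row.
DATA = """\
61.5,120,9.99
70.0,180,12.50
"""


def load_layout(text: str) -> dict[int, str]:
    """Map each 1-based column number to its column name."""
    reader = csv.reader(io.StringIO(text))
    return {int(number): name for number, name, _description in reader}


def apply_layout(text: str, layout: dict[int, str]) -> list[dict[str, str]]:
    """Label each value in each data row using the column layout."""
    reader = csv.reader(io.StringIO(text))
    return [
        {layout[position]: value for position, value in enumerate(row, start=1)}
        for row in reader
    ]


layout = load_layout(METADATA)
rows = apply_layout(DATA, layout)
```

In practice, the two strings would be replaced by open files; `io.StringIO` is used here only to keep the example self-contained.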

17.4 Next steps toward machine learning

We can draw a rough boundary between statistical modeling and machine learning. This is a hot topic of debate because, viewed from a suitable distance, all statistical modeling can be described as machine learning.

In this book, we’ve drawn a boundary to distinguish methods based on algorithms that are finite, definite, and effective. For example, the process of using the linear least squares technique to find a function that matches data is generally reproducible with an exact closed-form answer that doesn’t require tuning hyperparameters.
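To make the closed-form point concrete, here is a minimal sketch of simple linear least squares for a single independent variable. The function name and sample points are assumptions; the formulas are the standard ones, slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x).

```python
def least_squares(pairs: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (slope, intercept) minimizing the squared vertical distances."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of products of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares([(0, 1), (1, 3), (2, 5)])
```

There is no iteration, no starting guess, and no hyperparameter: the same data always produces the same answer in a fixed number of steps.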

Even within our narrow domain of “statistical modeling,” we can encounter data sets for which linear least squares doesn’t behave well. One notable assumption of the least squares estimates, for example, is that the independent variables are all known exactly. If the x values are subject to observational error, a more sophisticated approach is required.
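One commonly used alternative when both x and y contain error is orthogonal regression, which minimizes perpendicular rather than vertical distances to the line. The sketch below is offered only as an example of such an approach and is not necessarily the one this chapter goes on to describe. It assumes the errors in x and y have equal variance (Deming regression with a variance ratio of 1) and that the x and y deviations are correlated, so the denominator is not zero.

```python
from math import sqrt


def orthogonal_regression(pairs: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (slope, intercept) minimizing squared perpendicular distances."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    s_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    # Closed form for equal error variances; requires s_xy != 0.
    slope = (s_yy - s_xx + sqrt((s_yy - s_xx) ** 2 + 4 * s_xy**2)) / (2 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = orthogonal_regression([(0, 1), (1, 3), (2, 5)])
```

For points that lie exactly on a line, this gives the same answer as ordinary least squares; the two methods diverge as the noise in x grows.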

The...


Author (1)

Steven F. Lott

Steven Lott has been programming since computers were large, expensive, and rare. Working for decades in high tech has given him exposure to a lot of ideas and techniques, some bad, but most are helpful to others. Since the 1990s, Steven has been engaged with Python, crafting an array of indispensable tools and applications. His profound expertise has led him to contribute significantly to Packt Publishing, penning notable titles like "Mastering Object-Oriented," "The Modern Python Cookbook," and "Functional Python Programming." A self-proclaimed technomad, Steven's unconventional lifestyle sees him residing on a boat, often anchored along the vibrant east coast of the US. He tries to live by the words “Don't come home until you have a story.”