You're reading from Smarter Decisions - The Intersection of Internet of Things and Decision Science

Product type: Book

Published in: Jul 2016

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781785884191

Edition: 1st Edition

Tools: RStudio

Concepts: Data Science

Author (1)

Jojo Moolayil

Chapter 3. The What and Why - Using Exploratory Decision Science for IoT

Problems keep evolving, and so do their solutions. The hypotheses that we define while solving a problem are refined by new findings, which can change the approach partially or completely. Hence, our problem-solving approach needs to be agile. Problems are also often interconnected: a big problem is usually a network of multiple smaller problems. These smaller problems can originate in completely disparate domains, so our approach needs to accommodate diversity. The solution can also take different approaches depending on the problem's scenario, whether top-down, bottom-up, or hybrid; therefore, our solutions need to be flexible. Lastly, a problem can grow to a mammoth size, so our solutions need to be scalable.

In this chapter, we will solve the business problem that we defined in Chapter 2, Studying the IoT Problem...

Identifying gold mines in data for decision making

As a first step, before we dig deeper into the data exploration and analysis phase, we need to identify the gold mines in the data. In the previous chapter, we designed the heuristic-driven hypotheses (HDH) while defining the problem. We now need to revisit that list and check whether the data puts us in a position to solve the problem. We do this by examining and validating the data sources for the identified hypotheses. If we do not have the data to prove or disprove the majority of our important hypotheses, proceeding any further with the current approach would not add any value. Once we know the data is available, we can get our hands dirty with code for the solution.

Examining data sources for the hypotheses

If we take a look at the Prioritize and structure hypotheses based on the availability of data section in the previous chapter, we can see that we have listed a couple of hypotheses that could be potential...

Exploring each dimension of the IoT Ecosystem through data (Univariates)

Let's dig deeper into each dimension in the IoT use case to understand more realistically what the data showcases. We will perform extensive univariate analysis to study and visualize the entire data landscape.

What does the data say?

We visited the data dimensions while exploring the gold mines in the data (in the previous section) and saw that the Product_Qty_Unit, Product_ID, Material_ID, and Product_Name columns each contain a single value. Therefore, we conclude that the data in the use case is for one specific product whose output is measured in kilograms. Let's start exploring Order Quantity and Produced Quantity in depth. We initially studied the data dimensions using summary commands that gave us the percentile distribution. Let's take this one step further.

Order Quantity and Produced Quantity are both continuous variables, that is, variables that can take an infinite number of possible values (say...
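
As a rough illustration of a percentile summary for a continuous variable such as Order Quantity, here is a minimal Python sketch using only the standard library. The book performs this step in R; the values below are invented, and the choice of the "inclusive" quantile method is an assumption.

```python
import statistics

# Made-up order quantities in kg, for illustration only.
order_qty = [500, 750, 1000, 1000, 1250, 1500, 2000, 2500, 3000, 5000]


def summarize(values):
    """Return min, quartiles, mean, and max, similar in spirit to a summary command."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(values),
        "q3": q3,
        "max": max(values),
    }


summary = summarize(order_qty)

# Deciles give a finer-grained percentile view than quartiles.
deciles = statistics.quantiles(order_qty, n=10, method="inclusive")
```

In this invented sample the mean (1850) sits well above the median (1375), the kind of gap that hints at a right-skewed distribution and is worth following up with a histogram.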

Studying relationships

Each manufacturing exercise at the plant ends with the product either being accepted as good quality or discarded as bad quality. The data records this status in the 'Detergent_Quality' dimension, which is calculated by a weighted algorithm from the four output quality parameters of the detergent produced. Our end goal is to find out why the final product was not accepted, which means we need to study why the output quality was bad. The reasons could be many, but how do we identify them? This is where studying relationships comes in for the decision scientist. We have plenty of independent variables, either continuous or categorical. Understanding how these independent dimensions contribute to the end output means studying the relationships between them. The entire exercise can be simply defined as bivariate...
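
One simple way to look at the relationship between a categorical dimension and Detergent_Quality is to compare the share of bad-quality outcomes across groups. The following is a minimal Python sketch (the book uses R); the records, the "Good"/"Bad" labels, and the use of the assembly line as the grouping dimension are assumptions for illustration.

```python
from collections import Counter

# Illustrative records: (assembly line, Detergent_Quality). Values are made up.
records = [
    ("Line 1", "Good"), ("Line 1", "Bad"), ("Line 1", "Good"), ("Line 1", "Good"),
    ("Line 2", "Good"), ("Line 2", "Bad"), ("Line 2", "Good"), ("Line 2", "Good"),
]


def bad_rate_by_group(records):
    """Return the share of 'Bad' outcomes for each group."""
    totals = Counter(group for group, _ in records)
    bad = Counter(group for group, quality in records if quality == "Bad")
    return {group: bad[group] / totals[group] for group in totals}


rates = bad_rate_by_group(records)
```

Similar rates across groups suggest the dimension has little bearing on quality, while a large gap flags a relationship worth confirming with a statistical test.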

Exploratory data analysis

This part of the problem solving stack is also called "Confirmatory data analysis". Problem-solving material on the Internet and in other learning resources generally describes a stack called "ECR", which expands to Exploratory Data Analysis + Confirmatory Data Analysis + Root Cause Analysis. We follow the same approach: in Exploratory Data Analysis (EDA) we understand "What" happened; in Confirmatory Data Analysis (CDA) we cement the results from our exercises using statistical tests; finally, we answer the "Why" question using Root Cause Analysis. Our approach is the same, with a slightly different naming convention. We have broken down the steps into more granular ones:

We have now reached the EDA phase, that is, we will now validate the insights and patterns that we observed in the data. Let's start with understanding how we are going to approach this. If we look back at the journey...
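
To make the idea of validating an observed pattern with a statistical test concrete, here is a minimal Python sketch of a Pearson chi-square test of independence on a 2x2 table. This excerpt does not show which tests the book applies (and the book works in R), so the choice of test, the function name, and the counts are all assumptions.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. Returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: Line 1, Line 2; columns: good, bad. Counts are invented for illustration.
observed = [[90, 10], [88, 12]]
stat, p_value = chi_square_2x2(observed)
```

A large p-value, as these invented counts produce, means the data give no evidence that the two dimensions are related. A small one would support treating the observed pattern as real.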

Root Cause Analysis

We now begin answering the "why" question using all the insights we have gathered so far. Let's assimilate all the results that we validated in our EDA exercise. Once we have all the results, let's simplify them into a simple story that helps us answer the questions more lucidly.

The following figure is an extended version of the DDH matrix we designed in the previous section along with the results we found during our exercise:

Summary

In this chapter, we moved one step ahead in solving a real-life IoT business use case. Using the blueprint of the problem that we defined in the previous chapter, we attempted to solve the problem in a structured way, guided by the problem solving framework. With the business problem well defined, we got our hands dirty by solving it using R. We started by identifying gold mines in the data for decision making, where we examined the data sources to understand which hypotheses we could prove to solve our problem. We then validated that we have a good amount of data to solve the problem and studied the data further to understand how it can be used in our use case. After gathering a fair amount of data and domain context, we explored each dimension in the IoT ecosystem and studied what the data has to say. We performed univariate analysis and also transformed the dimensions to create more powerful and valuable dimensions. We then...

Author (1)

Jojo Moolayil

Jojo Moolayil is a data scientist living in Bengaluru, the Silicon Valley of India. With over 4 years of industrial experience in Decision Science and IoT, he has worked with industry leaders on high-impact and critical projects across multiple verticals. He is currently associated with GE, the pioneer and leader in data science for Industrial IoT. Jojo was born and raised in Pune, India, and graduated from the University of Pune with a major in information technology engineering. With a vision to solve problems at scale, Jojo found solace in decision science and learned to solve a variety of problems across multiple industry verticals early in his career. He started his career with Mu Sigma Inc., the world's largest pure-play analytics provider, where he worked with the leaders of many Fortune 50 clients. With a passion for solving increasingly complex problems, Jojo came into contact with the Internet of Things and found a deep interest in the very promising area of consumer and industrial IoT. One of the early enthusiasts to venture into IoT analytics, Jojo brought the problem solving frameworks and his learnings from data and decision science to IoT. To cement his foundations in industrial IoT and scale the impact of the problem solving experiments, he joined Flutura, a fast-growing IoT analytics startup based in Bangalore and headquartered in the valley. Flutura focuses exclusively on Industrial IoT and specializes in analytics for M2M data. It was at Flutura that Jojo reinforced his problem solving skills for M2M and Industrial IoT while working for the world's leading manufacturing giant and lighting solutions providers. His quest for solving problems at scale naturally brought out the 'product' dimension in him, and he soon ventured into developing data science products and platforms. After a short stint with Flutura, Jojo moved on to work with the leader of Industrial IoT, GE, in Bangalore, where he focused on solving decision science problems for Industrial IoT use cases. As a part of his role at GE, Jojo also focuses on developing data science and decision science products and platforms for Industrial IoT.

Hypothesis	Result	Insight
Line 1 has an overall higher chance of manufacturing a larger number of bad-quality detergent products	FALSE	Assembly line has no impact on the end quality of the detergent
Line 1 has an overall higher chance of deteriorating the Output Quality Parameters in the detergent	TRUE	Assembly line has an impact on Output Quality Parameters 2, 3, and 4
As the deviation between Order Quantity and actual Produced Quantity increases, the chance of the bad quality detergent being...