You're reading from Developing Kaggle Notebooks

Product type: Book
Published in: Dec 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781805128519
Edition: 1st Edition
Author (1)
Gabriel Preda

Dr. Gabriel Preda is a Principal Data Scientist at Endava, a major software services company. He has worked on projects in various industries, including financial services, banking, portfolio management, telecom, and healthcare, developing machine learning solutions for business problems such as risk prediction, churn analysis, anomaly detection, task recommendations, and document information extraction. He is also very active in competitive machine learning, currently holds the title of three-time Kaggle Grandmaster, and is well known for his Kaggle Notebooks.

Starbucks in the World

We start the analysis of the Starbucks Locations Worldwide dataset with a detailed exploratory data analysis (EDA) in the notebook Starbucks Location Worldwide - Data Exploration. The tools used in this analysis are imported from the data_quality_stats and plot_style_utils utility scripts. Before starting, it is important to note that the dataset comes from Kaggle and was collected 6 years ago. Since then, Starbucks has expanded considerably, so the number of shops and their geographical distribution shown here are no longer up to date.

Preliminary data analysis

The dataset has 25,600 rows. Very few values are missing for location fields: only 1 latitude and longitude value, 2 Street Addresses, and 15 Cities. The fields with the most missing data are Postcode (5.9%) and Phone Number (26.8%). Figure 4.16 shows a sample of the data.

Figure 4.16. First rows of Starbucks dataset
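Missing-value counts and percentages like those quoted above can be computed with a few lines of pandas. The following is only a minimal sketch: the function name missing_data_report is hypothetical and is not claimed to match what the data_quality_stats utility script provides, and the small data frame is made up for illustration rather than taken from the Starbucks dataset.

```python
import pandas as pd


def missing_data_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column.

    Hypothetical helper, written for illustration only.
    """
    total = df.isna().sum()
    percent = (total / len(df) * 100).round(1)
    report = pd.DataFrame({"Total": total, "Percent": percent})
    # Columns with the most missing values come first.
    return report.sort_values("Total", ascending=False)


# Tiny made-up frame standing in for the real dataset (illustrative values only).
sample_df = pd.DataFrame(
    {
        "City": ["Seattle", None, "London", "Paris"],
        "Postcode": ["98101", None, None, "75001"],
        "Latitude": [47.61, 51.51, 51.50, 48.86],
    }
)

report = missing_data_report(sample_df)
print(report)
```

Applied to the full dataset, a report of this kind would list the fields with the most missing values first, which is how fields such as Phone Number and Postcode stand out.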

Looking at the most frequent values report, we can learn...
