Thoughtful Data Science

More Information
  • Bridge the gap between developer and data scientist with a Python-based toolset
  • Get the most out of Jupyter Notebooks with new productivity-enhancing tools
  • Explore and visualize data using Jupyter Notebooks and PixieDust
  • Work with and assess the impact of artificial intelligence in data science
  • Work with TensorFlow, graphs, natural language processing, and time series
  • Deep dive into multiple industry data science use cases
  • Look into the future of data analysis and where to develop your skills

Thoughtful Data Science brings new strategies and a carefully crafted programmer's toolset to modern data analysis. The approach is designed specifically to give developers more efficiency and power when creating cutting-edge data analysis and artificial intelligence insights.

Industry expert David Taieb bridges the gap between developers and data scientists by creating a modern, open-source, Python-based toolset that works with Jupyter Notebook and PixieDust. You'll find the right balance of strategic thinking and practical projects throughout this book, with extensive code files and Jupyter projects that you can integrate with your own data analysis.

David Taieb introduces four projects designed to connect developers to important industry use cases in data science. The first is an image recognition application built with TensorFlow, reflecting the growing importance of AI in data analysis. The second analyses social media trends to explore big data issues and natural language processing. The third is a financial portfolio analysis application that uses time series analysis, a technique pivotal to many data science applications today. The fourth applies graph algorithms to solve data problems. Taieb wraps up with a deep look into the future of data science for developers and his views on AI for data science.

  • Think deeply as a developer about your strategy and toolset in data science
  • Discover the best tools that will suit you as a developer in your data analysis
  • Accelerate the road to data insight as a programmer using Jupyter Notebook
  • Deep dive into multiple industry data science use cases
Page Count: 490
Course Length: 14 hours 42 minutes
Date of Publication: 30 Jul 2018
Anatomy of a PixieApp
Use @captureOutput decorator to integrate the output of third-party Python libraries
Increase modularity and code reuse
Run Node.js inside a Python Notebook
Getting started with Apache Spark
Twitter sentiment analysis application
Part 1 – Acquiring the data with Spark Structured Streaming
Part 2 – Enriching the data with sentiment and most relevant extracted entity
Part 3 – Creating a real-time dashboard PixieApp
Part 4 – Adding scalability with Apache Kafka and IBM Streams Designer
Getting started with NumPy
Statistical exploration of time series
Putting it all together with the StockExplorer PixieApp
Time series forecasting using the ARIMA model
Introduction to graphs
Getting started with the networkx graph library
Part 1 – Loading the US domestic flight data into a graph
Part 2 – Creating the USFlightsAnalysis PixieApp
Part 3 – Adding data exploration to the USFlightsAnalysis PixieApp
Part 4 – Creating an ARIMA model for predicting flight delays
Forward thinking – what to expect for AI and data science


David Taieb

David Taieb is the Distinguished Engineer for the Watson and Cloud Platform Developer Advocacy team at IBM, leading a team of avid technologists on a mission to educate developers on the art of the possible with data science, AI and cloud technologies. He's passionate about building open source tools, such as the PixieDust Python Library for Jupyter Notebooks, which help improve developer productivity and democratize data science. David enjoys sharing his experience by speaking at conferences and meetups, where he likes to meet as many people as possible.