You're reading from Hands-On Data Science with Anaconda

Product type: Book
Published in: May 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788831192
Edition: 1st Edition
Languages: Python
Tools: Jupyter, Anaconda
Concepts: Data Science
Authors (2): Yuxing Yan, James Yan


What this book covers

Chapter 1, Ecosystem of Anaconda, introduces some basic concepts, such as why we use Anaconda and the advantages of using the full Anaconda distribution or its lightweight version, Miniconda. Then, it covers how to use Anaconda online, without installation. We also test a few simple programs written in R, Python, Julia, and Octave.

Chapter 2, Anaconda Installation, shows how to install Anaconda, how to test whether the installation is successful, how to launch Jupyter and use it to launch Python, how to launch Spyder and R, and how to find help. Most of these concepts and procedures are quite basic, so readers who are already comfortable with them can skip this chapter and go directly to the next one.

Chapter 3, Data Basics, discusses sources of open data, which include the Bureau of Labor Statistics, the Census Bureau, Professor French’s Data Library, the Federal Reserve’s Data Library, and the UCI (University of California, Irvine) Machine Learning Repository. After that, it explains how to input data; how to deal with missing data; how to sort, slice, and dice datasets; how to merge different datasets; and how to output data. For different languages, such as Python, R, Julia, and Octave, several relevant packages for data manipulation are introduced and discussed.

Chapter 4, Data Visualization, discusses various types of visual presentations, which include simple graphs, bar charts, pie charts, and histograms, written in different languages such as R, Python, and Julia. Visual presentations can help our audience understand our data better. For many complex concepts or theories, we could use visual presentations to help explain their logic and complexity. A typical example is the so-called bisection method or bisection search.

Chapter 5, Statistical Modeling in Anaconda, explains many important topics in statistics, such as the t-distribution, the F-distribution, the t-test, and the F-test. We also discuss linear regression, how to deal with missing data, how to treat outliers, collinearity and its treatments, and how to run a multi-variable linear regression.

Chapter 6, Managing Packages, explains the importance of managing packages, how to find all the packages available for R, Python, and Julia, and how to find the manual for each package. In addition, we discuss the issue of package dependency and how to make our programming a little easier when dealing with packages.

Chapter 7, Optimization in Anaconda, discusses several optimization topics, including general optimization problems, expressing various kinds of optimization problems as linear programming problems (LPPs), and quadratic optimization. Several examples are offered to make our discussion more practice-oriented, such as how to choose an optimal stock portfolio, how to optimize wealth and resources to promote sustainable development, and how much the government should really tax people. In addition, we introduce several packages for optimization in R, Python, Julia, and Octave.

Chapter 8, Unsupervised Learning in Anaconda, covers unsupervised learning, in particular hierarchical clustering and k-means clustering. For R and Python, several related packages are looked at in detail. For R: rattle, Rmixmod, and randomUniformForest; for Python: scipy.cluster, Contrastive, and sklearn.

Chapter 9, Supervised Learning in Anaconda, discusses supervised learning, including classification, the k-nearest neighbors algorithm, Bayes' classifiers, reinforcement learning, and specific R and Python modules, such as RTextTools and sklearn. In addition, you will see their implementation in R, Python, Julia, and Octave.

Chapter 10, Predictive Data Analytics – Modelling and Validation, covers predictive data analytics, modeling and validation, some useful datasets, time series analytics, how to predict future events, seasonality, and how to visualize our data. We mention prsklearn and catwalk for Python; datarobot, LiblineaR, and eclust for R; QuantEcon for Julia; and ltfat for Octave.

Chapter 11, Anaconda Cloud, discusses Anaconda Cloud. Topics include Jupyter Notebook in depth, different formats of Jupyter notebooks, how to share notebooks with your partners, how to share different projects over different platforms, how to share your working environments, and how to replicate others' environments locally.

Chapter 12, Distributed Computing, Parallel Computing, and HPCC, covers distributed computing and Anaconda Accelerate. When our data or tasks become more complex, we need a good system or a set of tools to process data and run complex algorithms. For this purpose, distributed computing is one solution. In particular, we will explain compute nodes, project add-ons, parallel processing, and advanced Python for data parallelism.
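
Chapter 4 names the bisection method as a typical example of a concept that is easier to follow with an illustration. As a rough sketch only (this is not code from the book), the idea can be written in a few lines of Python: repeatedly halve an interval on which a continuous function changes sign, keeping the half that still contains the sign change. The function name bisect_root, its parameters, and the test function x*x - 2 are assumptions chosen for this example.

```python
from typing import Callable


def bisect_root(f: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-10, max_iter: int = 200) -> float:
    """Approximate a root of f in [lo, hi] by bisection.

    Assumes f is continuous and that f(lo) and f(hi) have opposite signs.
    (Hypothetical helper for illustration; not taken from the book.)
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")

    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        # Stop on an exact root or once the half-width is below tol.
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        # Keep the half-interval in which the sign change occurs.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


# Example: the positive root of x^2 - 2 on [0, 2], i.e. the square root of 2.
root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
print(root)
```

Plotting the successive midpoints against the curve is one way such a search could be turned into the kind of visual explanation the chapter summary describes.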


Authors (2)

Yuxing Yan

Yuxing Yan graduated from McGill University with a PhD in finance. Over the years, he has been teaching various finance courses at eight universities: McGill University and Wilfrid Laurier University (in Canada), Nanyang Technological University (in Singapore), Loyola University of Maryland, UMUC, Hofstra University, University at Buffalo, and Canisius College (in the US). His research and teaching areas include: market microstructure, open-source finance and financial data analytics. He has 22 publications including papers published in the Journal of Accounting and Finance, Journal of Banking and Finance, Journal of Empirical Finance, Real Estate Review, Pacific Basin Finance Journal, Applied Financial Economics, and Annals of Operations Research. He is good at several computer languages, such as SAS, R, Python, Matlab, and C. His four books are related to applying two pieces of open-source software to finance: Python for Finance (2014), Python for Finance (2nd ed., expected 2017), Python for Finance (Chinese version, expected 2017), and Financial Modeling Using R (2016). In addition, he is an expert on data, especially on financial databases. From 2003 to 2010, he worked at Wharton School as a consultant, helping researchers with their programs and data issues. In 2007, he published a book titled Financial Databases (with S.W. Zhu). This book is written in Chinese. Currently, he is writing a new book called Financial Modeling Using Excel — in an R-Assisted Learning Environment. The phrase "R-Assisted" distinguishes it from other similar books related to Excel and financial modeling. 
New features include using a huge amount of public data related to economics, finance, and accounting; an efficient way to retrieve data: 3 seconds for each time series; a free financial calculator, showing 50 financial formulas instantly, 300 websites, 100 YouTube videos, 80 references, paperless for homework, midterms, and final exams; easy to extend for instructors; and especially, no need to learn R.

James Yan

James Yan is an undergraduate student at the University of Toronto (UofT), currently double-majoring in computer science and statistics. He has hands-on knowledge of Python, R, Java, MATLAB, and SQL. During his study at UofT, he has taken many related courses, such as Methods of Data Analysis I and II, Methods of Applied Statistics, Introduction to Databases, Introduction to Artificial Intelligence, and Numerical Methods, including a capstone course on AI in clinical medicine.
