Hands-On Data Science with Anaconda

Learn
  • Perform cleaning, sorting, classification, clustering, regression, and dataset modeling using Anaconda
  • Use the conda package manager to discover, install, and use efficient, scalable packages
  • Get comfortable with heterogeneous data exploration using multiple languages within a project
  • Perform distributed computing and use Anaconda Accelerate to optimize computational performance
  • Discover and share packages, notebooks, and environments, and use shared project drives on Anaconda Cloud
  • Tackle advanced data prediction problems
About

Anaconda is an open-source platform that brings together the best tools for data science professionals, with more than 100 popular packages supporting the Python, Scala, and R languages. Hands-On Data Science with Anaconda gets you started with Anaconda and demonstrates how you can use it to perform data science operations in the real world.

The book begins by setting up the Anaconda environment so that it works with tools and frameworks such as Jupyter, pandas, matplotlib, Python, R, and Julia. You’ll walk through the conda package manager, which automatically manages packages, including cross-language dependencies, and works across Linux, macOS, and Windows. You’ll explore the essentials of data science and linear algebra, and perform data science tasks using packages such as SciPy, contrastive, scikit-learn, Rattle, and Rmixmod.

Once you’re comfortable with these basics, you’ll start with data science operations such as cleaning, sorting, and classifying data. You’ll then move on to clustering, regression, and prediction, and to building and optimizing machine learning models. You’ll also learn how to visualize data using the packages available for Julia, Python, and R.

Features
  • Use Anaconda to find solutions for clustering, classification, and linear regression
  • Analyze your data efficiently with the most powerful data science stack
  • Use Anaconda Cloud to store, share, and discover projects and libraries
Page Count 364
Course Length 10 hours 55 minutes
ISBN 9781788831192
Date Of Publication 30 May 2018
Introduction to unsupervised learning
Hierarchical clustering
k-means clustering
Introduction to Python packages – scipy
Introduction to Python packages – contrastive
Introduction to Python packages – sklearn (scikit-learn)
Introduction to R packages – rattle
Introduction to R packages – randomUniformForest
Introduction to R packages – Rmixmod
Implementation using Julia
Task view for Cluster Analysis
Summary
Review questions and exercises

Authors

James Yan

James Yan is an undergraduate student at the University of Toronto (UofT), currently double-majoring in computer science and statistics. He has hands-on knowledge of Python, R, Java, MATLAB, and SQL. During his studies at UofT, he has taken many related courses, such as Methods of Data Analysis I and II, Methods of Applied Statistics, Introduction to Databases, Introduction to Artificial Intelligence, and Numerical Methods, as well as a capstone course on AI in clinical medicine.

Dr. Yuxing Yan

Dr. Yuxing Yan graduated from McGill University with a PhD in Finance. He has taught various finance courses at eight universities in Canada, Singapore, and the U.S. He has published 23 research and teaching-related papers and is the author of six books. Two of his recent publications are Python for Finance and Financial Modeling Using R. He is well versed in R, Python, SAS, MATLAB, Octave, and C, and is an expert in financial data analytics.