
You're reading from  Building Data Science Solutions with Anaconda

Product type: Book
Published in: May 2022
Publisher: Packt
ISBN-13: 9781800568785
Edition: 1st Edition
Author (1)
Dan Meador

Dan Meador is an Engineering Manager at Anaconda and is the creator of Conda as well as a champion of open source at Anaconda. With a history of engineering and client facing roles, he has the ability to jump into any position. He has a track record of delivering as a leader and a follower in companies from the Fortune 10 to startups.

Chapter 3: Using the Anaconda Distribution to Manage Packages

If software packages are the tools, then a package manager is what lets you quickly and effectively find the ones you need and ensures that they all work seamlessly with one another. It is key to being able to clean data, build models, and create functioning software.

In this chapter, we will take a deeper look at what these packages are, where they live, and how you can use conda and Anaconda tools to incorporate them into your projects. Knowing how to pull in the packages that you need will be the first step in any data science project, so it's vital that you can do this easily.

We'll also create two conda environments and see how you can make use of these to make repeatable projects, share them with your colleagues, or maybe just keep them for yourself.

Finally, you'll discover some more advanced conda features, such as setting up your .condarc configuration file, which...

Technical requirements

To successfully execute the instructions in this chapter, make sure Anaconda Individual Edition is installed; it includes conda and Navigator. It can be downloaded here: https://www.anaconda.com/products/individual.

The conda.yml file, which contains the config information for the conda environment discussed in this chapter, can be found in the GitHub repository: https://github.com/PacktPublishing/Building-Data-Science-Solutions-with-Anaconda/tree/main/Chapter03.

Learning how dependency resolution works

Dependencies are a fundamental part of software development and data science. Back in Chapter 1, Understanding the AI/ML Landscape, we gave an overview of dependencies using a cooking example: the things you need to make a certain dish, and how those requirements can conflict with one another, resulting in a tricky situation. We could provide another food analogy, but let's look at something more concrete instead: why the alternative of not using anyone else's packages and libraries can be a challenge.
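To make the idea of conflicting requirements concrete, here is a minimal Python sketch. It is not how conda resolves dependencies; every package name and version in it is made up for illustration, and it assumes the dependency graph has no cycles. It simply walks the direct and transitive dependencies of one project and reports any package that is requested at more than one version.

```python
# Toy dependency data: every package name and version here is hypothetical.
REQUIREMENTS = {
    "movie_app": {"web_framework": "2.0", "recommender": "1.4"},
    "web_framework": {"json_tools": "1.0"},
    "recommender": {"json_tools": "2.0"},
    "json_tools": {},
}


def collect_pins(package, requirements, pins=None, path=()):
    """Walk direct and transitive dependencies, recording who asks for which version.

    Assumes the graph is acyclic; a real resolver has to handle far more cases.
    """
    if pins is None:
        pins = {}
    for dep, version in requirements.get(package, {}).items():
        chain = " -> ".join(path + (package, dep))
        pins.setdefault(dep, {}).setdefault(version, []).append(chain)
        collect_pins(dep, requirements, pins, path + (package,))
    return pins


def find_conflicts(pins):
    """Keep only the dependencies that are requested at more than one version."""
    return {dep: versions for dep, versions in pins.items() if len(versions) > 1}


pins = collect_pins("movie_app", REQUIREMENTS)
conflicts = find_conflicts(pins)
for dep, versions in conflicts.items():
    print(f"Conflict on {dep}: {sorted(versions)}")
```

With this toy data, the only conflict is on json_tools, which two different packages pull in at two different versions. Neither request is visible from the top-level project, which is what makes transitive dependencies tricky to manage by hand.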

Let's take a look and see whether we can get by without any dependencies for building a simple web application where users can create accounts and pick their favorite movies so you can recommend ones they might like:

  1. First, you will need to get some form of authentication for your project. You will need to brush up on your security skills, such as seeding, hashing functions...
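As a rough idea of what handling this first step yourself involves, the sketch below stores and checks a password using only the Python standard library. It assumes that "seeding" refers to adding a random salt before hashing; the function names and the iteration count are illustrative choices, not vetted security recommendations.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a vetted recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using salted PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate password with the stored salt and compare safely."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Even this small piece leaves out account storage, rate limiting, and session handling, which hints at how much work piles up when nothing is borrowed from existing packages.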

Discovering what conda environments are and how to use them

In this section, you'll learn how to use a key feature of Anaconda that allows you to create separate spaces for your code: environments. You'll gain an understanding of what they are and how to create them in conda and in Navigator, as well as how to share those environments with others.

Let's start by making sure you have a clear understanding of what environments are and how they fit in with the other parts of conda.

There are three core pillars of conda: environments, packages, and channels. Each is vital to know in order to master conda. We have already learned about packages in Chapter 2, Analyzing Open Source Software. Here, we will go through environments and channels using the analogy of getting food at the grocery store.

The first of the three pillars is environments. With our food analogy, let's say you are making spaghetti, with brownies for dessert. You need the milk and...
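Because each environment is a separate space, it can help to check which one your code is actually running in. The following is a small sketch, assuming the interpreter was started from an activated conda environment; conda sets the CONDA_DEFAULT_ENV and CONDA_PREFIX variables on activation, and both will be missing otherwise. The function name is a hypothetical helper, not part of conda.

```python
import os
import sys


def describe_active_environment():
    """Report which environment the current Python interpreter belongs to."""
    return {
        # Set by `conda activate`; None when no conda environment is active.
        "name": os.environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        # Location of the interpreter that is actually running this code.
        "interpreter_prefix": sys.prefix,
    }


for key, value in describe_active_environment().items():
    print(f"{key}: {value}")
```

If conda_prefix and interpreter_prefix disagree, the script is probably being run by an interpreter from a different environment than the one that is activated.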

Managing channels with Anaconda Navigator and conda

It's time to look at the last pillar of the Anaconda landscape, which is channels. By the end of this section, you'll know what channels are and how to specify them, what the .condarc file is and how to set it up, and how to get the exact package version that you need.

A quick note before we begin: some of this section might not apply to you if you only need more basic packages. Settings such as channel priority aren't something you have to deal with at that stage, so don't worry about knowing them in detail. They can become incredibly useful, however, as your software grows more complex.

Let's start with making sure we have a clear idea of what a channel is.

Understanding what a channel is

A channel is simply a repo that contains a specific and intentional group of packages created by an individual or company. It allows you to have a set group of...
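One way to picture channels and their ordering is as a list of package collections that is searched from top to bottom. The sketch below is a deliberate simplification: the channel contents are invented, the helper names are hypothetical, and conda's real solver weighs many more factors, so this does not reproduce its behavior exactly.

```python
# Hypothetical channel contents, listed from highest to lowest priority.
CHANNELS = [
    ("conda-forge", {"numpy": ["1.21.0", "1.22.3"], "pandas": ["1.4.2"]}),
    ("defaults", {"numpy": ["1.21.5"], "scipy": ["1.7.3"]}),
]


def version_key(version):
    """Turn '1.22.3' into (1, 22, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def find_package(name, channels, version=None):
    """Return (channel, version) from the first channel with a matching package."""
    for channel_name, packages in channels:
        candidates = packages.get(name, [])
        if version is not None:
            candidates = [v for v in candidates if v == version]
        if candidates:
            return channel_name, max(candidates, key=version_key)
    return None


print(find_package("numpy", CHANNELS))            # highest-priority channel that has numpy
print(find_package("numpy", CHANNELS, "1.21.5"))  # exact version found further down the list
print(find_package("scipy", CHANNELS))            # only available in the second channel
```

In this toy model, asking for numpy without a version returns 1.22.3 from conda-forge, while asking for exactly 1.21.5 falls through to defaults because the first channel does not carry that version.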

Using advanced conda info and settings

There are many layers to conda, and we've just covered the ones you will use first. There are many other operations that you'll find extremely useful and that will, in turn, enable you to work much more easily.

Let's cover how you can find out where conda is looking for its operations and set up a settings file to keep your preferred way of using conda intact through multiple sessions.

Using conda info to see configuration information

You will find yourself troubleshooting and needing to reference where conda looks for certain settings, and for this, conda info has you covered.

Let's run conda info in our ch_3_env environment and see what we get. Some of the output here has been omitted to keep the code a bit more concise:

conda info 
    active environment : base
    active env location : C:\Users\Dan\anaconda3
       user config file...
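If you want the same information inside a script, conda info also accepts a --json flag. The following is a minimal sketch that reads it from Python and falls back to None when conda is not on the PATH. The key names it prints (active_prefix, rc_path, channels) are assumptions about the JSON output, so they are read with .get in case a given conda version names them differently.

```python
import json
import shutil
import subprocess


def read_conda_info():
    """Return the output of `conda info --json` as a dict, or None if unavailable."""
    conda = shutil.which("conda")
    if conda is None:
        return None
    try:
        result = subprocess.run(
            [conda, "info", "--json"],
            capture_output=True,
            text=True,
            check=True,
            timeout=60,
        )
        return json.loads(result.stdout)
    except (OSError, subprocess.SubprocessError, json.JSONDecodeError):
        return None


info = read_conda_info()
if info is None:
    print("conda was not found or did not return JSON")
else:
    # Key names are assumed; .get avoids a crash if one is absent.
    for key in ("active_prefix", "rc_path", "channels"):
        print(f"{key}: {info.get(key)}")
```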

Conda cheat sheet

Here are the most common conda commands, along with a few extra ones that will prove handy, collected as a quick reference.

Conda general commands

The general commands are as follows:

  • conda install <package>: Searches for and installs the specified package from your channels
  • conda info: Shows the basic information about conda
  • conda config --add channels conda-forge: Adds a channel for searching for packages
  • conda update --all: Updates all packages that it can
  • conda search <package>: Searches your configured channels for the specified package
  • conda create --name <environment_name> python=<python version>: Creates a new conda environment and installs the specified Python version
  • conda activate <environment_name>: Activates the specified conda environment that allows you to use it

Conda environment commands

The environment-specific commands...

Summary

By now you've seen how you can create environments with the command line with conda, as well as employing a more visual approach with Navigator. You have an understanding of what dependency management looks like and why it's important to take into account different transitive dependencies and their versions.

You've seen how easy and simple it can be to upload and enable the sharing of your environments so that others can benefit from them.

In addition, we've wrapped up the three pillars of conda by looking at channels and how conda makes use of many different sources for the packages it needs. We've touched on conda-forge as one of those locations.

Much of what you have learned in this chapter will prove to be invaluable in your everyday work and I encourage you to reference the conda cheat sheet here until you no longer need it. You'll be using some commands so often that it won't take long.

Now that we know the Anaconda tools...

