Search icon
Subscription
0
Cart icon
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Julia for Data Science

You're reading from  Julia for Data Science

Product type Book
Published in Sep 2016
Publisher Packt
ISBN-13 9781785289699
Pages 346 pages
Edition 1st Edition
Languages
Author (1):
Anshul Joshi Anshul Joshi
Profile icon Anshul Joshi

Table of Contents (17) Chapters

Julia for Data Science
Credits
About the Author
About the Reviewer
www.PacktPub.com
Preface
1. The Groundwork – Julia's Environment 2. Data Munging 3. Data Exploration 4. Deep Dive into Inferential Statistics 5. Making Sense of Data Using Visualization 6. Supervised Machine Learning 7. Unsupervised Machine Learning 8. Creating Ensemble Models 9. Time Series 10. Collaborative Filtering and Recommendation System 11. Introduction to Deep Learning

Package management


Julia provides a built-in package manager. Using Pkg we can install libraries written in Julia. For external libraries, we can also compile them from their source or use the standard package manager of the operating system. A list of registered packages is maintained at http://pkg.julialang.org.

Pkg is provided in the base installation. The Pkg module contains all the package manager commands.

Pkg.status() – package status

The Pkg.status() is a function that prints out a list of currently installed packages with a summary. This is handy when you need to know if the package you want to use is installed or not.

When the Pkg command is run for the first time, the package directory is automatically created. It is required by the command that the Pkg.status() returns a valid list of the packages installed. The list of packages given by the Pkg.status() are of registered versions which are managed by Pkg.

Pkg.installed() can also be used to return a list of all the installed packages with their versions.

Pkg.add() – adding packages

Julia's package manager is declarative and intelligent. You only have to tell it what you want and it will figure out what version to install and will resolve dependencies if there are any. Therefore, we only need to add the list of requirements that we want and it resolves which packages and their versions to install.

The ~/.julia/v0.4/REQUIRE file contains the package requirements. We can open it using a text editor such as vi or atom, or use Pkg.edit() in Julia's shell to edit this file. After editing the file, run Pkg.resolve() to install or remove the packages.

We can also use Pkg.add(package_name) to add packages and Pkg.rm(package_name) to remove packages. Earlier, we used Pkg.add("IJulia")  to install the IJulia package.

When we don't want to have a package installed on our system anymore, Pkg.rm() is used for removing the requirement from the REQUIRE file. Similar to Pkg.add(), Pkg.rm() first removes the requirement of the package from the REQUIRE file and then updates the list of installed packages by running Pkg.resolve() to match.

Working with unregistered packages

Frequently, we would like to be able to use packages created by our team members or someone who has published on Git but they are not in the registered packages of Pkg. Julia allows us to do that by using a clone. Julia packages are hosted on Git repositories and can be cloned using mechanisms supported by Git. The index of registered packages is maintained at METADATA.jl. For unofficial packages, we can use the following:

Pkg.clone("git://example.com/path/unofficialPackage/Package.jl.git") 

Sometimes unregistered packages have dependencies that require fulfilling before use. If that is the scenario, a REQUIRE file is needed at the top of the source tree of the unregistered package. The dependencies of the unregistered packages on the registered packages are determined by this REQUIRE file. When we run Pkg.clone(url), these dependencies are automatically installed.

Pkg.update() – package update

It's good to have updated packages.  Julia, which is under active development, has its packages frequently updated and new functionalities are added.

To update all of the packages, type the following:

Pkg.update() 

Under the hood, new changes are pulled into the METADATA file in the directory located at ~/.julia/v0.4/ and it checks for any new registered package versions which may have been published since the last update. If there are new registered package versions, Pkg.update() attempts to update the packages which are not dirty and are checked out on a branch. This update process satisfies the top-level requirements by computing the optimal set of package versions to be installed. The packages with specific versions that must be installed are defined in the REQUIRE file in Julia's directory (~/.julia/v0.4/).

METADATA repository

Registered packages are downloaded and installed using the official METADATA.jl repository. A different METADATA repository location can also be provided if required:

julia> Pkg.init("https://julia.customrepo.com/METADATA.jl.git", "branch") 

Developing packages

Julia allows us to view the source code and as it is tracked by Git, the full development history of all the installed packages is available. We can also make our desired changes and commit to our own repository, or do bug fixes and contribute enhancements upstream.

You may also want to create your own packages and publish them at some point in time. Julia's package manager allows you to do that too.

It is a requirement that Git is installed on the system and the developer needs an account at their hosting provider of choice (GitHub, Bitbucket, and so on). Having the ability to communicate over SSH is preferred—to enable that, upload your public ssh-key to your hosting provider.

Creating a new package

It is preferable to have the REQUIRE file in the package repository. This should have the bare minimum of a description of the Julia version.

For example, if we would like to create a new Julia package called HelloWorld we would have the following:

Pkg.generate("HelloWorld", "MIT") 

Here, HelloWorld is the package that we want to create and MIT is the license that our package will have. The license should be known to the package generator.

This will create a directory as follows: ~/.julia/v0.4/HelloWorld. The directory that is created is initialized as a Git repository. Also, all the files required by the package are kept in this directory. This directory is then committed to the repository.

This can now be pushed to the remote repository for the world to use.

You have been reading a chapter from
Julia for Data Science
Published in: Sep 2016 Publisher: Packt ISBN-13: 9781785289699
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}