Search icon CANCEL
Cart icon
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Learning Hub
Free Learning
Arrow right icon
Save more on purchases! Buy 2 and save 10%, Buy 3 and save 15%, Buy 5 and save 20%
Julia for Data Science
Julia for Data Science

Julia for Data Science: high-performance computing simplified

By Anshul Joshi
$43.99 $9.99
Book Sep 2016 346 pages 1st Edition
$43.99 $9.99
$15.99 Monthly
$43.99 $9.99
$15.99 Monthly

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Buy Now
Table of content icon View table of contents Preview book icon Preview Book

Julia for Data Science

Chapter 1. The Groundwork – Julia's Environment

Julia is a fairly young programming language. In 2009, three developers (Stefan Karpinski, Jeff Bezanson, and Viral Shah) at MIT in the Applied Computing group under the supervision of Prof. Alan Edelman started working on a project that lead to Julia. In February 2012, Julia was presented publicly and became open source. The source code is available on GitHub ( The source of the registered packages can also be found on GitHub. Currently, all four of the initial creators, along with developers from around the world, actively contribute to Julia.


The current release is 0.4 and is still away from its 1.0 release candidate.

Based on solid principles, its popularity is steadily increasing in the field of scientific computing, data science, and high-performance computing.

This chapter will guide you through the download and installation of all the necessary components of Julia. This chapter covers the following topics:

  • How is Julia different?

  • Setting up Julia's environment.

  • Using Julia's shell and REPL.

  • Using Jupyter notebooks

  • Package management

  • Parallel computation

  • Multiple dispatch

  • Language interoperability

Traditionally, the scientific community has used slower dynamic languages to build their applications, although they have required the highest computing performance. Domain experts who had experience with programming, but were not generally seasoned developers, always preferred dynamic languages over statically typed languages.

Julia is different

Over the years, with the advancement in compiler techniques and language design, it is possible to eliminate the trade-off between performance and dynamic prototyping. So, the scientific computing required was a good dynamic language like Python together with performance like C. And then came Julia, a general purpose programming language designed according to the requirements of scientific and technical computing, providing performance comparable to C/C++, and with an environment productive enough for prototyping like the high-level dynamic language of Python. The key to Julia's performance is its design and Low Level Virtual Machine (LLVM) based Just-in-Time compiler which enables it to approach the performance of C and Fortran.

The key features offered by Julia are:

  • A general purpose high-level dynamic programming language designed to be effective for numerical and scientific computing

  • A Low-Level Virtual Machine (LLVM) based Just-in-Time (JIT) compiler that enables Julia to approach the performance of statically-compiled languages like C/C++

The following quote is from the development team of Julia—Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman:


We are greedy: we want more.

We want a language that's open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that's homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)

It is quite often compared with Python, R, MATLAB, and Octave. These have been around for quite some time and Julia is highly influenced by them, especially when it comes to numerical and scientific computing. Although Julia is really good at it, it is not restricted to just scientific computing as it can also be used for web and general purpose programming.

The development team of Julia aims to create a remarkable and never done before combination of power and efficiency without compromising the ease of use in one single language. Most of Julia's core is implemented in C/C++. Julia's parser is written in Scheme. Julia's efficient and cross-platform I/O is provided by the Node.js's libuv.

Features and advantages of Julia can be summarized as follows:

  • It's designed for distributed and parallel computation.

  • Julia provides an extensive library of mathematical functions with great numerical accuracy.

  • Julia gives the functionality of multiple dispatch. Multiple dispatch refers to using many combinations of argument types to define function behaviors.

  • The Pycall package enables Julia to call Python functions in its code and Matlab packages using Matlab.jl. Functions and libraries written in C can also be called directly without any need for APIs or wrappers.

  • Julia provides powerful shell-like capabilities for managing other processes in the system.

  • Unlike other languages, user-defined types in Julia are compact and quite fast as built-ins.

  • Data analysis makes great use of vectorized code to gain performance benefits. Julia eliminates the need to vectorize code to gain performance. De-vectorized code written in Julia can be as fast as vectorized code.

  • It uses lightweight "green" threading also known as tasks or coroutines, cooperative multitasking, or one-shot continuations.

  • Julia has a powerful type system. The conversions provided are elegant and extensible.

  • It has efficient support for Unicode.

  • It has facilities for metaprogramming and Lisp-like macros.

  • It has a built-in package manager. (Pkg)

  • Julia provides efficient, specialized and automatic generation of code for different argument types.

  • It's free and open source with an MIT license.

Setting up the environment

Julia is available free. It can be downloaded from its website at the following address: The website also has exhaustive documentation, examples, and links to tutorials and community. The documentation can be downloaded in popular formats.

Installing Julia (Linux)

Ubuntu/Linux Mint is one of the most famous Linux distros, and their deb packages of Julia are also provided. These are available for both 32-bit and 64-bit distributions.

To install Julia, add the PPA (personal package archive). Ubuntu users are privileged enough to have PPA. It is treated as an apt repository to build and publish Ubuntu source packages. In the terminal, type the following:

sudo apt-get add-repository ppa:staticfloat/juliareleases 
sudo apt-get update 

This adds the PPA and updates the package index in the repository.

Now install Julia:

sudo apt-get install Julia  

The installation is complete. To check if the installation is successful in the Terminal type in the following:

julia --version 

This gives the installed Julia's version.

To open the Julia's interactive shell, type julia into the Terminal. To uninstall Julia, simply use apt to remove it:

sudo apt-get remove julia 

For Fedora/RHEL/CentOS or distributions based on them, enable the EPEL repository for your distribution version. Then, click on the link provided. Enable Julia's repository using the following:

dnf copr enable nalimilan/julia

Or copy the relevant .repo file available as follows:


Finally, in the Terminal type the following:

yum install julia

Installing Julia (Mac)

Users with Mac OS X need to click on the downloaded .dmg file to run the disk image. After that, drag the app icon into the Applications folder. It may prompt you to ask if you want to continue as the source has been downloaded from the Internet and so is not considered secure. Click on continue if it is downloaded for the Julia language official website.

Julia can also be installed using homebrew on the Mac as follows:

brew update 
brew tap staticfloat/julia 
brew install julia 

The installation is complete. To check if the installation is successful in the Terminal, type the following:

julia --version 

This gives you the installed Julia version.

Installing Julia (Windows)

Download the .exe file provided on the download page according to your system's configuration (32-bit/64-bit). Julia is installed on Windows by running the downloaded .exe file, which will extract Julia into a folder. Inside this folder is a batch file called julia.bat, which can be used to start the Julia console.

To uninstall, delete the Julia folder.

Exploring the source code

For enthusiasts, Julia's source code is available and users are encouraged to contribute by adding features or by bug fixing. This is the directory structure of the tree:


Source code for Julia's standard library


Editor support for Julia source, miscellaneous scripts


External dependencies


Source for the user manual


Source for standard library function help text


Example Julia programs


Source for Julia language core


Test suites


Benchmark suites


Source for various frontends


Binaries and shared libraries loaded by Julia's standard libraries

Using REPL

Read-Eval-Print-Loop is an interactive shell or the language shell that provides the functionality to test out pieces of code. Julia provides an interactive shell with a Just-in-Time compiler at the backend. We can give inputs in a line, it is compiled and evaluated, and the result is given in the next line.

The benefit of using the REPL is that we can test out our code for possible errors. Also, it is a good environment for beginners. We can type in the expressions and press Enter to evaluate.

A Julia library, or custom-written Julia program, can be included in the REPL using include. For example, I have a file called hello.jl, which I will include in the REPL by doing the following:

julia> include ("hello.jl") 

Julia also stores all the commands written in the REPL in the .julia_history. This file is located at /home/$USER on Ubuntu, C:\Users\username on Windows, or ~/.julia_history on OS X.

As with a Linux Terminal, we can reverse-search using Ctrl + R in Julia's shell. This is a really nice feature as we can go back in the history of typed commands.

Typing ? in the language shell will change the prompt to:


To clear the screen, press Ctrl + L. To come out of the REPL press Ctrl + D or type the following:

julia> exit().

Using Jupyter Notebook

Data science and scientific computing are privileged to have an amazing interactive tool called Jupyter Notebook. With Jupyter Notebook you can to write and run code in an interactive web environment, which also has the capability to have visualizations, images, and videos. It makes testing of equations and prototyping a lot easier. It has the support of over 40 programming languages and is completely open source.

GitHub supports Jupyter notebooks. The notebook with the record of computation can be shared via the Jupyter notebook viewer or other cloud storage. Jupyter notebooks are extensively used for coding machine-learning algorithms, statistical modeling and numerical simulation, and data munging.

Jupyter Notebook is implemented in Python but you can run the code in any of the 40 languages provided you have their kernel. You can check if Python is installed on your system or not by typing the following into the Terminal:

python -version 

This will give the version of Python if it is there on the system. It is best to have Python 2.7.x or 3.5.x or a later version.

If Python is not installed then you can install it by downloading it from the official website for Windows. For Linux, typing the following should work:

sudo apt-get install python 

It is highly recommended to install Anaconda if you are new to Python and data science. Commonly used packages for data science, numerical, and scientific computing including Jupyter notebook come bundled with Anaconda making it the preferred way to set up the environment. Instructions can be found at

Jupyter is present in the Anaconda package, but you can check if the Jupyter package is up to date by typing in the following:

conda install jupyter 

Another way to install Jupyter is by using pip:

pip install jupyter 

To check if Jupyter is installed properly, type the following in the Terminal:

jupyter -version 

It should give the version of the Jupyter if it is installed.

Now, to use Julia with Jupyter we need the IJulia package. This can be installed using Julia's package manager.

After installing IJulia, we can create a new notebook by selecting Julia under the Notebooks section in Jupyter.

To get the latest version of all your packages, in Julia's shell type the following:

julia> Pkg.update() 

After that add the IJulia package by typing the following:

julia> Pkg.add("IJulia") 

In Linux, you may face some warnings, so it's better to build the package:


After IJulia is installed, come back to the Terminal and start the Jupyter notebook:

jupyter notebook 

A browser window will open. Under New, you will find options to create new notebooks with the kernels already installed. As we want to start a Julia notebook we will select Julia 0.4.2. This will start a new Julia notebook. You can try out a simple example.

In this example, we are creating a histogram of random numbers. This is just an example we will be studying the components used in detail in coming chapters.

Popular editors such as Atom and Sublime have a plugin for Julia. Atom has language—julia and Sublime has Sublime—IJulia, both of which can be downloaded from their package managers.

Package management

Julia provides a built-in package manager. Using Pkg we can install libraries written in Julia. For external libraries, we can also compile them from their source or use the standard package manager of the operating system. A list of registered packages is maintained at

Pkg is provided in the base installation. The Pkg module contains all the package manager commands.

Pkg.status() – package status

The Pkg.status() is a function that prints out a list of currently installed packages with a summary. This is handy when you need to know if the package you want to use is installed or not.

When the Pkg command is run for the first time, the package directory is automatically created. It is required by the command that the Pkg.status() returns a valid list of the packages installed. The list of packages given by the Pkg.status() are of registered versions which are managed by Pkg.

Pkg.installed() can also be used to return a list of all the installed packages with their versions.

Pkg.add() – adding packages

Julia's package manager is declarative and intelligent. You only have to tell it what you want and it will figure out what version to install and will resolve dependencies if there are any. Therefore, we only need to add the list of requirements that we want and it resolves which packages and their versions to install.

The ~/.julia/v0.4/REQUIRE file contains the package requirements. We can open it using a text editor such as vi or atom, or use Pkg.edit() in Julia's shell to edit this file. After editing the file, run Pkg.resolve() to install or remove the packages.

We can also use Pkg.add(package_name) to add packages and Pkg.rm(package_name) to remove packages. Earlier, we used Pkg.add("IJulia")  to install the IJulia package.

When we don't want to have a package installed on our system anymore, Pkg.rm() is used for removing the requirement from the REQUIRE file. Similar to Pkg.add(), Pkg.rm() first removes the requirement of the package from the REQUIRE file and then updates the list of installed packages by running Pkg.resolve() to match.

Working with unregistered packages

Frequently, we would like to be able to use packages created by our team members or someone who has published on Git but they are not in the registered packages of Pkg. Julia allows us to do that by using a clone. Julia packages are hosted on Git repositories and can be cloned using mechanisms supported by Git. The index of registered packages is maintained at METADATA.jl. For unofficial packages, we can use the following:


Sometimes unregistered packages have dependencies that require fulfilling before use. If that is the scenario, a REQUIRE file is needed at the top of the source tree of the unregistered package. The dependencies of the unregistered packages on the registered packages are determined by this REQUIRE file. When we run Pkg.clone(url), these dependencies are automatically installed.

Pkg.update() – package update

It's good to have updated packages.  Julia, which is under active development, has its packages frequently updated and new functionalities are added.

To update all of the packages, type the following:


Under the hood, new changes are pulled into the METADATA file in the directory located at ~/.julia/v0.4/ and it checks for any new registered package versions which may have been published since the last update. If there are new registered package versions, Pkg.update() attempts to update the packages which are not dirty and are checked out on a branch. This update process satisfies the top-level requirements by computing the optimal set of package versions to be installed. The packages with specific versions that must be installed are defined in the REQUIRE file in Julia's directory (~/.julia/v0.4/).

METADATA repository

Registered packages are downloaded and installed using the official METADATA.jl repository. A different METADATA repository location can also be provided if required:

julia> Pkg.init("", "branch") 

Developing packages

Julia allows us to view the source code and as it is tracked by Git, the full development history of all the installed packages is available. We can also make our desired changes and commit to our own repository, or do bug fixes and contribute enhancements upstream.

You may also want to create your own packages and publish them at some point in time. Julia's package manager allows you to do that too.

It is a requirement that Git is installed on the system and the developer needs an account at their hosting provider of choice (GitHub, Bitbucket, and so on). Having the ability to communicate over SSH is preferred—to enable that, upload your public ssh-key to your hosting provider.

Creating a new package

It is preferable to have the REQUIRE file in the package repository. This should have the bare minimum of a description of the Julia version.

For example, if we would like to create a new Julia package called HelloWorld we would have the following:

Pkg.generate("HelloWorld", "MIT") 

Here, HelloWorld is the package that we want to create and MIT is the license that our package will have. The license should be known to the package generator.

This will create a directory as follows: ~/.julia/v0.4/HelloWorld. The directory that is created is initialized as a Git repository. Also, all the files required by the package are kept in this directory. This directory is then committed to the repository.

This can now be pushed to the remote repository for the world to use.

Parallel computation using Julia

Advancement in modern computing has led to multi-core CPUs in systems and sometimes these systems are combined together in a cluster capable of performing a task which a single system might not be able to perform alone, or if it did it would take an undesirable amount of time. Julia's environment of parallel processing is based on message passing. Multiple processes are allowed for programs in separate memory domains.

Message passing is implemented differently in Julia from other popular environments such as MPI. Julia provides one-sided communication, therefore the programmer explicitly manages only one process in the two-process operation.

Julia's parallel programming paradigm is built on the following:

  • Remote references

  • Remote calls

A request to run a function on another process is called a remote call. The reference to an object by another object on a particular process is called a remote reference. A remote reference is a construct used in most distributed object systems. Therefore, a call which is made with some specific arguments to the objects generally on a different process by the objects of the different process is called the remote call and this will return a reference to the remote object which is called the remote reference.

The remote call returns a remote reference to its result. Remote calls return immediately. The process that made the call proceeds to its next operation. Meanwhile, the remote call happens somewhere else. A call to wait() on its remote reference waits for the remote call to finish. The full value of the result can be obtained using fetch(), and put!() is used to store the result to a remote reference.

Julia uses a single process default. To start Julia with multiple processors use the following:

julia -p n

where n is the number of worker processes. Alternatively, it is possible to create extra processors from a running system by using addproc(n). It is advisable to put n equal to the number of the CPU cores in the system.

pmap and @parallel are the two most frequently used and useful functions.

Julia provides a parallel for loop, used to run a number of processes in parallel. This is used as follows.

Parallel for loop works by having multiple processes assigned iterations and then reducing the result (in this case (+)). It is somewhat similar to the map-reduce concept. Iterations will run independently over different processes and the results obtained by these processes will be combined at the end (like map-reduce). The resultant of one loop can also become the feeder for the other loop. The answer is the resultant of this whole parallel loop.

It is very different than a normal iterative loop because the iterations do not take place in a specified sequence. As the iterations run on different processes, any writes that happens on variables or arrays are not globally visible. The variables used are copied and broadcasted to each process of the parallel for loop.

For example:

arr = zeros(500000) 
@parallel for i=1:500000 
  arr[i] = i 

This will not give the desired result as each process gets their own separate copy of arr. The vector will not be filled in with i as expected. We must avoid such parallel for loops.

pmap refers to parallel map. For example:

This code solves the problem if we have a number of large random matrices and we are required to obtain the singular values, in parallel.

Julia's pmap() is designed differently. It is well suited for cases where a large amount of work is done by each function call, whereas @parallel is suited for handling situations which involve numerous small iterations. Both pmap() and @parallel for utilize worker nodes for parallel computation. However, the node from which the calling process originated does the final reduction in @parallel for.

Julia's key feature – multiple dispatch

A function is an object, mapping a tuple of arguments using some expression to a return value. When this function object is unable to return a value, it throws an exception. For different types of arguments the same conceptual function can have different implementations. For example, we can have a function to add two floating point numbers and another function to add two integers. But conceptually, we are only adding two numbers. Julia provides a functionality by which different implementations of the same concept can be implemented easily. The functions don't need to be defined all at once. They are defined in small abstracts. These small abstracts are different argument type combinations and have different behaviors associated with them. The definition of one of these behaviors is called a method.

The types and the number of arguments that a method definition accepts is indicated by the annotation of its signatures. Therefore, the most suitable method is applied whenever a function is called with a certain set of arguments. To apply a method when a function is invoked is known as dispatch. Traditionally, object-oriented languages consider only the first argument in dispatch. Julia is different as all of the function's arguments are considered (not just only the first) and then it choses which method should be invoked. This is known as multiple dispatch.

Multiple dispatch is particularly useful for mathematical and scientific code. We shouldn't consider that the operations belong to one argument more than any of the others. All of the argument types are considered when implementing a mathematical operator. Multiple dispatch is not limited to mathematical expressions as it can be used in numerous real-world scenarios and is a powerful paradigm for structuring the programs.

Methods in multiple dispatch

+ is a function in Julia using multiple dispatch. Multiple dispatch is used by all of Julia's standard functions and operators. For various possible combinations of argument types and count, all of them have many methods defining their behavior. A method is restricted to take certain types of arguments using the :: type-assertion operator:

julia> f(x::Float64, y::Float64) = x + y 

The function definition will only be applied for calls where x and y are both values of type Float64:

julia> f(10.0, 14.0) 

If we try to apply this definition to other types of arguments, it will give a method error.

The arguments must be of precisely the same type as defined in the function definition.

The function object is created in the first method definition. New method definitions add new behaviors to the existing function object. When a function is invoked, the number and types of the arguments are matched, and the most specific method definition matching will be executed.

The following example creates a function with two methods. One method definition takes two arguments of the type Float64 and adds them. The second method definition takes two arguments of the type Number, multiplies them by two and adds them. When we invoke the function with Float64 arguments, then the first method definition is applied, and when we invoke the function with Integer arguments, the second method definition is applied as the number can take any numeric values. In the following example, we are playing with floating point numbers and integers using multiple dispatch.

In Julia, all values are instances of the abstract type "Any". When the type declaration is not given with ::, that means it is not specifically defined as the type of the argument, therefore Any is the default type of method parameter and it doesn't have the restriction of taking any type of value. Generally, one method definition is written in such a way that it will be applied to the certain arguments to which no other method definition applies. It is one of the Julia language's most powerful features.

It is efficient with a great ease of expressiveness to generate specialized code and implement complex algorithms without caring much about the low-level implementation using Julia's multiple dispatch and flexible parametric type system.

Ambiguities – method definitions

Sometimes function behaviors are defined in such a way that there isn't a unique method to apply for a certain set of arguments. Julia throws a warning in such cases about this ambiguity, but proceeds by arbitrarily picking a method. To avoid this ambiguity we should define a method to handle such cases.

In the following example, we define a method definition with one argument of the type Any and another argument of the type Float64. In the second method definition, we just changed the order, but this doesn't differentiate it from the first definition. In this case, Julia will give a warning of ambiguous method definition but will allow us to proceed.

Facilitating language interoperability

Although Julia can be used to write most kinds of code, there are mature libraries for numerical and scientific computing which we would like to exploit. These libraries can be in C, Fortran or Python. Julia allows the ease of using the existing code written in Python, C, or Fortran. This is done by making Julia perform simple and efficient-to-call C, Fortran, or Python functions.

The C/Fortran libraries should be available to Julia. An ordinary but valid call with ccall is made to this code. This is possible when the code is available as a shared library. Julia's JIT generates the same machine instructions as the native C call. Therefore, it is generally no different from calling through a C code with a minimal overhead.

Importing Python code can be beneficial and sometimes needed, especially for data science, because it already has an exhaustive library of implementations of machine learning and statistical functions. For example, it contains scikit-learn and pandas. To use Python in Julia, we require PyCall.jl. To add PyCall.jl do the following:


PyCall contains a macro @pyimport that facilitates importing Python packages and provides Julia wrappers for all of the functions and constants therein, including automatic conversion of types between Julia and Python.

PyCall also provides functionalities for lower-level manipulation of Python objects, including a PyObject type for opaque Python objects. It also has a pycall function (similar to Julia's ccall function), which can be used in Julia to call Python functions with type conversions. PyCall does not use the Python program but links directly to the libpython library. During the, it finds the location of the libpython by Punning python.

Calling Python code in Julia

The @pyimport macro automatically makes the appropriate type conversions to Julia types in most of the scenarios based on a runtime inspection of the Python objects. It achieves better control over these type conversions by using lower-level functions. Using PyCall in scenarios where the return type is known can help in improving the performance, both by eliminating the overhead of runtime type inference, and also by providing more type information to the Julia compiler:

  • pycall(function::PyObject, returntype::Type, args...): This calls the given Python function (typically looked up from a module) with the given args... (of standard Julia types which are converted automatically to the corresponding Python types if possible), converting the return value to returntype (use a returntype of PyObject to return the unconverted Python object reference, or PyAny to request an automated conversion).

  • pyimport(s): This imports the Python modules (a string or symbol) and returns a pointer to it (a PyObject). Functions or other symbols in the module may then be looked up by s[name] where the name is a string (for the raw PyObject) or a symbol (for automatic type conversion). Unlike the @pyimport macro, this does not define a Julia module and members cannot be accessed with an


In this chapter, we learned how Julia is different and how an LLVM-based JIT compiler enables Julia to approach the performance of C/C++. We introduced you to how to download Julia, install it, and build it from source. The notable features that we found were that the language is elegant, concise, and powerful and it has amazing capabilities for numeric and scientific computing.

We worked on some examples of working with Julia via the command line (REPL) and saw how full of features the language shell is. The features found were tab-completion, reverse-search, and help functions. We also discussed why should we use Jupyter Notebook and went on to set up Jupyter with the IJulia package. We worked on a simple example to use the Jupyter Notebook and Julia's visualization package, Gadfly.

In addition, we learned about Julia's powerful built-in package management and how to add, update, and remove modules. Also, we went through the process of creating our own package and publishing it to the community. We also introduced you to one of the most powerful features of Julia—multiple dispatch—and worked on some basic examples of how to create method definitions to implement multiple dispatch.

In addition, we introduced you to the parallel computation, explaining how it is different from conventional message passing and how to make use of all the compute resources available. We also learned Julia's feature of language interoperability and how we can call a Python module or a library from the Julia program.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • An in-depth exploration of Julia's growing ecosystem of packages
  • Work with the most powerful open-source libraries for deep learning, data wrangling, and data visualization
  • Learn about deep learning using Mocha.jl and give speed and high performance to data analysis on large data sets


Julia is a fast and high performing language that's perfectly suited to data science with a mature package ecosystem and is now feature complete. It is a good tool for a data science practitioner. There was a famous post at Harvard Business Review that Data Scientist is the sexiest job of the 21st century. ( This book will help you get familiarised with Julia's rich ecosystem, which is continuously evolving, allowing you to stay on top of your game. This book contains the essentials of data science and gives a high-level overview of advanced statistics and techniques. You will dive in and will work on generating insights by performing inferential statistics, and will reveal hidden patterns and trends using data mining. This has the practical coverage of statistics and machine learning. You will develop knowledge to build statistical models and machine learning systems in Julia with attractive visualizations. You will then delve into the world of Deep learning in Julia and will understand the framework, Mocha.jl with which you can create artificial neural networks and implement deep learning. This book addresses the challenges of real-world data science problems, including data cleaning, data preparation, inferential statistics, statistical modeling, building high-performance machine learning systems and creating effective visualizations using Julia.

What you will learn

[*]Apply statistical models in Julia for data-driven decisions [*]Understanding the process of data munging and data preparation using Julia [*]Explore techniques to visualize data using Julia and D3 based packages [*]Using Julia to create self-learning systems using cutting edge machine learning algorithms [*]Create supervised and unsupervised machine learning systems using Julia. Also, explore ensemble models [*]Build a recommendation engine in Julia [*]Dive into Julia’s deep learning framework and build a system using Mocha.jl

Product Details

Country selected

Publication date : Sep 30, 2016
Length 346 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781785289699
Category :
Languages :
Concepts :

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Buy Now

Product Details

Publication date : Sep 30, 2016
Length 346 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781785289699
Category :
Languages :
Concepts :

Table of Contents

17 Chapters
Julia for Data Science Chevron down icon Chevron up icon
Credits Chevron down icon Chevron up icon
About the Author Chevron down icon Chevron up icon
About the Reviewer Chevron down icon Chevron up icon Chevron down icon Chevron up icon
Preface Chevron down icon Chevron up icon
1. The Groundwork – Julia's Environment Chevron down icon Chevron up icon
2. Data Munging Chevron down icon Chevron up icon
3. Data Exploration Chevron down icon Chevron up icon
4. Deep Dive into Inferential Statistics Chevron down icon Chevron up icon
5. Making Sense of Data Using Visualization Chevron down icon Chevron up icon
6. Supervised Machine Learning Chevron down icon Chevron up icon
7. Unsupervised Machine Learning Chevron down icon Chevron up icon
8. Creating Ensemble Models Chevron down icon Chevron up icon
9. Time Series Chevron down icon Chevron up icon
10. Collaborative Filtering and Recommendation System Chevron down icon Chevron up icon
11. Introduction to Deep Learning Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Empty star icon Empty star icon Empty star icon Empty star icon Empty star icon 0
(0 Ratings)
5 star 0%
4 star 0%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by

No reviews found
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial


How do I buy and download an eBook? Chevron down icon Chevron up icon

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website? Chevron down icon Chevron up icon

If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Cart, or PayPal)
Where can I access support around an eBook? Chevron down icon Chevron up icon
  • If you experience a problem with using or installing Adobe Reader, the contact Adobe directly.
  • To view the errata for the book, see and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to
  • To contact us directly if a problem is not resolved, use
What eBook formats do Packt support? Chevron down icon Chevron up icon

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks? Chevron down icon Chevron up icon
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower price than print
  • They save resources and space
What is an eBook? Chevron down icon Chevron up icon

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.