You're reading from Julia Cookbook (1st Edition, Packt Publishing, September 2016, ISBN-13 9781785882012).

Author: Jalem Raj Rohit

Jalem Raj Rohit is an IIT Jodhpur graduate with a keen interest in recommender systems, machine learning, and serverless and distributed systems. Raj currently works as a senior consultant in data science and NLP at Episource, before which he worked at Zomato and Kayako. He contributes to open source projects in Python, Go, and Julia. He also speaks at tech conferences about serverless engineering and machine learning.

Chapter 6. Parallel Computing

In this chapter, we will cover the following recipes:

  • Basic concepts of parallel computing

  • Data movement

  • Parallel map and loop operations

  • Channels

Introduction


In this chapter, you will learn how to perform parallel computing in Julia and how to use it to handle big data. Concepts such as data movement, sharded arrays, and the Map-Reduce framework are essential for processing large amounts of data across multiple CPUs running in parallel. Together, the recipes in this chapter will give you a solid foundation in parallel computing and multiprocessing, including efficient data handling and code optimization.

Basic concepts of parallel computing


Parallel computing is a style of computation in which many calculations are carried out simultaneously. This can be done by connecting multiple computers into a cluster and using their CPUs to carry out the computations.

This style of computation is used when handling large amounts of data, and also when running complex algorithms over significantly large datasets. The computations execute faster because multiple CPUs run them in parallel, each with direct access to its own RAM.

Getting ready

Julia has built-in support for parallel computing and multiprocessing, so these computations rarely require any external libraries.

How to do it...

  1. Julia can be started on your local computer using multiple cores of your CPU, which gives you multiple worker processes. This is how you fire up Julia in multiprocessing mode from your terminal. It creates two worker processes on the machine, which means it uses two CPU cores for the purpose...
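The startup command itself is cut off in this excerpt; the standard way to launch Julia with two worker processes (a documented Julia command-line flag, shown here as a sketch) is:

    $ julia -p 2

Once inside the REPL, nprocs() reports the total number of processes and workers() lists the worker IDs; you can also add workers to an already-running session with addprocs(2).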

Data movement


In parallel computing, data movement is quite common and should be minimized because of the time and network overhead it incurs. In this recipe, we will see how to optimize it to reduce latency as much as we can.

Getting ready

To get ready for this recipe, you need to have the Julia REPL started in multiprocessing mode. This is explained in the Getting ready section in the preceding recipe.

How to do it...

  1. Firstly, we will see how to perform a matrix computation using the @spawn macro, which lets us control where the work happens and, therefore, how much data moves. We construct a matrix of shape 200 x 200 and then square it using the @spawn macro. This can be done as follows:

    mat = rand(200, 200)      # construct a 200 x 200 matrix locally
    exec_mat = @spawn mat^2   # square it on a worker; returns a remote reference
    fetch(exec_mat)           # retrieve the result from the worker
    

    Fetching the remote reference returns the squared 200 x 200 matrix.

  2. Now, we will look at another way to achieve the same result. This time, we will use the @spawn macro directly instead of the separate initialization step. We will discuss the advantages and drawbacks of each method in the...
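The excerpt is truncated at this point; a minimal sketch of the direct form, assuming the same 200 x 200 example as above, would be:

    exec_mat = @spawn rand(200, 200)^2   # construct and square on the worker
    fetch(exec_mat)                      # only the result travels back

Because the matrix is both constructed and squared on the remote worker, the input matrix never has to be shipped over the network, which is exactly the data-movement saving this recipe is about.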

Parallel map and loop operations


In this recipe, you will learn a bit about the famous Map-Reduce framework and why it is one of the most important ideas in the domains of big data and parallel computing. You will learn how to parallelize loops and apply reducing functions to them across several CPUs and machines, building further on the concepts of parallel computing introduced in the previous recipes.

Getting ready

Just like in the previous sections, Julia simply needs to be running in multiprocessing mode to work through the following examples. This can be done by following the instructions given in the first recipe.

How to do it...

  1. Firstly, we will write a function that adds n random bits. Writing this function has nothing to do with multiprocessing; it uses simple Julia functions and loops. The function can be written as follows:
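The function body is not included in this excerpt; a minimal sketch matching the description (summing n random bits, in the style of the coin-toss example from the Julia manual) would be:

    function count_heads(n)
        c = 0
        for i = 1:n
            c += rand(Bool)   # rand(Bool) yields a random bit: false (0) or true (1)
        end
        c
    end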

  2. Now, we will use the @spawn macro, which we learned about previously, to run the count_heads() function as separate processes...
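The excerpt cuts off here; a hedged sketch of spawning two halves of the computation and reducing the partial results, following the @spawn/fetch pattern from the earlier recipes, is:

    a = @spawn count_heads(100000000)   # first half on one worker
    b = @spawn count_heads(100000000)   # second half on another worker
    fetch(a) + fetch(b)                 # combine the two partial counts

In the Julia version contemporary with this book, the same reduction can also be written as a parallel loop with @parallel (+) for ... end (the macro was later renamed @distributed).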

Channels


Channels are like background plumbing for parallel computing in Julia. They are the reservoirs from which the individual processes access their data.

Getting ready

The requirements are similar to those of the previous sections. This is mostly a theoretical section, so you just need to run the experiments on your own. For that, run your Julia REPL in multiprocessing mode.

How to do it...

Channels are shared queues with a fixed length. They serve as common data reservoirs for the running processes.

Channels act as shared data resources that multiple readers or workers can access. Workers can read data from a channel through the fetch() function, which we already discussed in the previous sections.

Workers can also write to a channel through the put!() function. This means that workers can add more data to the resource, which can then be accessed by all the workers running a particular computation.
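As an illustration, here is a minimal sketch of the channel API described above (the capacity of 10 is an arbitrary choice for this example):

    c = Channel{Int}(10)   # a shared queue that buffers up to 10 Ints
    put!(c, 1)             # a worker writes values into the channel
    put!(c, 2)
    take!(c)               # removes and returns the first value, 1
    close(c)               # close the channel once it is no longer needed

Note that fetch(c) would return the first value without removing it from the queue, whereas take!(c) removes it.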

Closing a channel after use is a good practice to avoid data corruption and unnecessary...
