Computing prime numbers using parallel operations
A classic method for finding prime numbers is the Sieve of Eratosthenes, which works by marking off the multiples of each prime. Here we work in the spirit of the sieve: for each candidate number, we check whether it fits the bill for a prime, and a number that meets all the criteria "filters through" the sieve.
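For reference, a minimal sketch of the classic sequential sieve looks like this (the function name `sieve` is illustrative, not from the recipe):

```python
def sieve(limit):
    # classic Sieve of Eratosthenes: start by assuming every number
    # from 2 upward is prime, then mark multiples as composite
    is_prime = [False, False] + [True] * (limit - 1)
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # any multiple of p below p*p was already marked
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Note that this form is inherently sequential (each pass depends on the primes found so far), which is why the per-number test below is the better fit for parallel execution.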
The same series of tests is run on every number we check, which makes this a great fit for parallel operations. Spark has the built-in ability to split a task among the available threads or machines; the parallelism is configured through the SparkContext (as we see in every example).
In our case, we split the workload among the available threads, each taking a set of numbers to check, and collect the results afterward.
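As a plain-Python sketch of that split-and-collect idea (no Spark required; the trial-division `isprime` here is a stand-in for the test developed below, and the pool size is an arbitrary choice):

```python
from concurrent.futures import ThreadPoolExecutor

def isprime(n):
    # trial division: test odd divisors up to sqrt(n)
    n = abs(int(n))
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

# split the candidates among worker threads and collect the results;
# pool.map preserves the input order, so the results line up with range(50)
with ThreadPoolExecutor(max_workers=4) as pool:
    flags = pool.map(isprime, range(50))
    primes = [n for n, flag in zip(range(50), flags) if flag]

print(primes)  # primes below 50
```

Spark's `parallelize`/`filter` pattern does the same thing, but the executor may be a thread on another machine rather than a local one.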
How to do it...
We can use a script like this:
import pyspark
if 'sc' not in globals():
    sc = pyspark.SparkContext()
# check if a number is prime
def isprime(n):
    # must be positive
    n = abs(int(n))
    # primes are 2 or greater
    if n < 2:
        return False
    # 2 is the only even prime number
    if...