You're reading from Practical Big Data Analytics

Product type: Book
Published in: Jan 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781783554393
Edition: 1st Edition
Author (1)
Nataraj Dasgupta

Nataraj Dasgupta is the vice president of advanced analytics at RxDataScience Inc. Nataraj has been in the IT industry for more than 19 years, and has worked in the technical and analytics divisions of Philip Morris, IBM, UBS Investment Bank, and Purdue Pharma. At Purdue Pharma, Nataraj led the data science division, where he developed the company's award-winning big data and machine learning platform. Prior to Purdue, at UBS, he held the role of Associate Director, working with high-frequency and algorithmic trading technologies in the foreign exchange trading division of the bank.

Spark exercise - hands-on with Spark (Databricks)


This notebook is based on tutorials conducted by Databricks (https://databricks.com/). The tutorial uses the Databricks Community Edition of Spark, which you can sign up for at https://databricks.com/try-databricks. Databricks is a leading provider of a commercial, enterprise-supported version of Spark.

In this tutorial, we introduce a few basic Spark commands. For more detailed examples, readers are encouraged to try the more extensive Spark tutorials and notebooks available on the web.

Documentation for Spark's Python API can be found at https://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.sql.

The data for this book was imported into the Databricks Spark platform. For more information on importing data, see Importing Data - Databricks (https://docs.databricks.com/user-guide/importing-data.html).
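Once a file has been imported, it is typically read into a Spark DataFrame. The following is a minimal sketch of that step, not the book's own code: the file contents, the column names `name` and `score`, and the helper names `write_sample_csv` and `load_csv` are all made up for illustration. It assumes that `pyspark` is installed and that the path is readable by Spark; on Databricks the path would instead point at wherever the imported file was stored.

```python
import csv


def write_sample_csv(path):
    """Write a tiny, made-up CSV file so the sketch has something to read."""
    rows = [("name", "score"), ("a", 1), ("b", 2), ("c", 3)]
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return len(rows) - 1  # number of data rows, excluding the header


def load_csv(path):
    """Read a CSV file into a Spark DataFrame; return (column names, row count)."""
    # Imported inside the function so that defining it does not require Spark.
    from pyspark.sql import SparkSession

    # Hypothetical local session; an already running session is reused.
    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("import-sketch")
        .getOrCreate()
    )
    # header=True takes column names from the first line;
    # inferSchema=True asks Spark to guess column types from the data.
    df = spark.read.csv(path, header=True, inferSchema=True)
    return df.columns, df.count()
```

With the sample file above, `load_csv` should report the columns `name` and `score` and three rows. Setting `inferSchema=True` costs an extra pass over the data, so for large files an explicit schema is often preferred.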

# COMMAND ----------

# The SparkContext/SparkSession is the entry point for all Spark...
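As a minimal sketch of what using that entry point can look like (not the notebook's own cell, which is cut off above), the function below obtains a SparkSession, reaches the underlying SparkContext through it, and runs one small query. The function name `sum_of_squares`, the application name, and the `local[1]` master are assumptions made for illustration. On Databricks a session named `spark` is already provided, and `getOrCreate()` should return that existing session instead of starting a new one.

```python
def sum_of_squares(n):
    """Return 1^2 + 2^2 + ... + n^2, computed through a SparkSession."""
    # Imported inside the function so that defining it does not require Spark.
    from pyspark.sql import SparkSession

    # The builder either reuses a running session or starts a local one.
    spark = (
        SparkSession.builder
        .master("local[1]")             # assumption: single local thread
        .appName("entry-point-sketch")  # hypothetical application name
        .getOrCreate()
    )

    # The lower-level SparkContext is available from the session.
    sc = spark.sparkContext
    print("Spark version:", sc.version)

    # spark.range(1, n + 1) yields a one-column DataFrame with ids 1..n.
    row = spark.range(1, n + 1).selectExpr("sum(id * id) AS total").first()
    return row["total"]
```

Calling `sum_of_squares(5)` should return 55. The query is written with DataFrame expressions so that the work stays inside Spark's SQL engine; the same result could be computed through `sc.parallelize` and RDD operations.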