
You're reading from Big Data Analytics with R

Product type: Book
Published in: Jul 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781786466457
Edition: 1st Edition
Author: Simon Walkowiak

Simon Walkowiak is a cognitive neuroscientist and a managing director of Mind Project Ltd, a Big Data and Predictive Analytics consultancy based in London, United Kingdom. As a former data curator at the UK Data Service (UKDS, University of Essex), Europe's largest socio-economic data repository, Simon has extensive experience in processing and managing large-scale datasets such as censuses, sensor and smart meter data, telecommunication data, and well-known governmental and social surveys such as the British Social Attitudes survey, Labour Force surveys, Understanding Society, National Travel survey, and many other socio-economic datasets collected and deposited by Eurostat, World Bank, Office for National Statistics, Department of Transport, NatCen, and International Energy Agency, to mention just a few. Simon has delivered numerous data science and R training courses at public institutions and international companies. He has also taught a course in Big Data Methods in R at major UK universities and at the prestigious Big Data and Analytics Summer School organized by the Institute of Analytics and Data Science (IADS).

HDInsight - a multi-node Hadoop cluster on Azure


In the online chapter, Pushing R Further (https://www.packtpub.com/sites/default/files/downloads/5396_6457OS_PushingRFurther.pdf), we briefly introduced you to HDInsight, a fully managed Apache Hadoop service that is part of the Microsoft Azure platform and is specifically designed for heavy data crunching. In this section, we will deploy a multi-node HDInsight cluster with R and RStudio Server installed and run a number of MapReduce jobs on smart electricity meter readings (~414,000,000 cases, four variables, ~12 GB in size) from the Energy Demand Research Project, available to download from the UK Data Service's online Discover catalog at https://discover.ukdataservice.ac.uk/catalogue/?sn=7591. But before we can start the actual data crunching, we need to set up and prepare an HDInsight cluster to process our data. The configuration of HDInsight is not the simplest task to accomplish, especially if we need to install additional...

