You're reading from Practical Big Data Analytics

Product type: Book
Published in: Jan 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781783554393
Edition: 1st Edition
Author (1)
Nataraj Dasgupta

Nataraj Dasgupta is the vice president of advanced analytics at RxDataScience Inc. Nataraj has been in the IT industry for more than 19 years, and has worked in the technical and analytics divisions of Philip Morris, IBM, UBS Investment Bank, and Purdue Pharma. At Purdue Pharma, Nataraj led the data science division, where he developed the company's award-winning big data and machine learning platform. Prior to Purdue, at UBS, he held the role of Associate Director, working with high-frequency and algorithmic trading technologies in the foreign exchange trading division of the bank.

Tutorial - association rule mining with CMS data


This tutorial implements an interface for accessing association rules generated with the Apriori algorithm in R.

We'll download data from the CMS OpenPayments website, which hosts data on payments made by companies to physicians and hospitals.

The site provides several ways of downloading data. Users can select the dataset of interest and download it manually. In our case, we will download the data using one of the web-based APIs that are available to all users.

Downloading the data

The dataset can be downloaded either from the Unix terminal (in the virtual machine) or by accessing the site directly from a browser. If you are downloading the dataset in the virtual machine, run the following command in the terminal window:

time wget -O cms2016_2.csv 'https://openpaymentsdata.cms.gov/resource/vq63-hu5i.csv?$query=select Physician_First_Name as firstName,Physician_Last_Name as lastName,Recipient_City as city,Recipient_State as state,Submitting_Applicable_Manufacturer_or_Applicable_GPO_Name...
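
Before running the full download, it can help to try a small query first. The following is a minimal sketch, not the book's command: it selects only the four columns visible in the command above and builds a percent-encoded URL. The `limit 10` clause, the variable names, the `DOWNLOAD` switch, and the `cms_sample.csv` filename are assumptions added for illustration, and the endpoint and dataset identifier are copied from the command above and may have changed since publication.

```shell
#!/bin/sh
# Sketch: build a small SoQL query URL for the endpoint used above.
# Only the four columns visible in the original command are selected.
# The "limit 10" clause and the output filename are illustrative assumptions.

BASE_URL='https://openpaymentsdata.cms.gov/resource/vq63-hu5i.csv'
QUERY='select Physician_First_Name as firstName,Physician_Last_Name as lastName,Recipient_City as city,Recipient_State as state limit 10'

# Percent-encode spaces and commas so the query survives any HTTP client.
ENCODED_QUERY=$(printf '%s' "$QUERY" | sed -e 's/ /%20/g' -e 's/,/%2C/g')

# The backslash keeps the shell from expanding $query as a variable.
URL="${BASE_URL}?\$query=${ENCODED_QUERY}"

# Print the URL for inspection; set DOWNLOAD=1 to actually fetch the sample.
echo "$URL"
if [ "${DOWNLOAD:-0}" = "1" ]; then
    wget -O cms_sample.csv "$URL"
fi
```

By default the script only prints the URL, so the query can be inspected (or pasted into a browser) before any data is transferred. Running it with `DOWNLOAD=1` attempts the download with `wget`.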