Getting Started with Haskell Data Analysis

SQLite3

In this chapter, we are going to learn about SQLite3. SQLite3 is a file format for storing data in the same spirit as a relational database such as Oracle, MySQL, MariaDB, or Postgres. You could, I suppose, use SQLite3 as a traditional database engine, but it isn't really made for that. SQLite3 allows us to open a file, work with the data in that file using SQL statements, and then close that file. One file contains one database, and a database can store multiple tables of information. Contrast this with the CSV file format that we used in the last chapter, which, at best, stores a single table of information. The advantages of SQLite3 are that it doesn't require any server-side programs running in the background, there are no configurations to discuss, and the executable for working with SQLite3 is a single file. So, in my opinion, working with...

SQLite3 command line

This section is a primer on SQLite3, and it won't have any Haskell code. We're going to take a moment and translate a CSV file into SQLite3: we will introduce the sqlite3 command-line tool, create a table in an SQLite3 database, and import a CSV file into that table.

So, let's go to our Haskell environment and open our browser. Using Google, search for usgs earthquake feed csv. USGS is the United States Geological Survey, and they keep a database of every single earthquake that takes place on planet Earth, and they offer this data as a CSV file. So, we're going to click that very first link, https://earthquake.usgs.gov/earthquakes/feed/v1.0/csv.php. You should see Spreadsheet Format at the top; scroll down to the heading where it says Past 7 Days...
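Once the weekly CSV has been downloaded, the whole conversion can happen inside the sqlite3 shell. The following is only a sketch of such a session; the file name all_week.csv and the table name earthquakes are assumptions, and this shortcut lets .import build the table from the CSV's header row rather than creating the table by hand:

    $ sqlite3 earthquakes.db
    sqlite> .mode csv
    sqlite> .import all_week.csv earthquakes
    sqlite> SELECT count(*) FROM earthquakes;
    sqlite> .schema earthquakes
    sqlite> .quit

Because the earthquakes table does not exist yet, .import creates it and takes the column names from the first row of the CSV file. One caveat of this shortcut is that every column created this way gets TEXT affinity, so numeric columns such as mag may need CAST(mag AS REAL) in later queries; alternatively, the table can be created explicitly with REAL and INTEGER columns before importing.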

Working with SQLite3 and Haskell

In this section, we will talk about getting data from SQLite3 into Haskell. We're going to understand the basic types within SQLite3 and their Haskell counterparts. We're also going to be installing the necessary software in order to get Haskell and SQLite3 to communicate with each other, and we're going to be writing a few SELECT queries within Haskell.

There are a few different data types in SQLite3, each with a Haskell counterpart:

The four primary types are INTEGER, REAL, TEXT, and BLOB. TEXT and BLOB are almost the same type: BLOB is for raw data and TEXT is for string data, but we can interpret both of them in Haskell as String. INTEGER corresponds to Integer, and REAL corresponds to Decimal. There is a fifth type in SQLite3 called NUMERIC, which is adaptive and is treated as an integer with INTEGER data and a real with...
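As a concrete sketch of what querying looks like from Haskell, the following assumes the HDBC and HDBC-sqlite3 packages (installable with, for example, cabal install HDBC-sqlite3, though your setup may differ) and the earthquakes table built earlier; the chapter's actual library choice and column names may not match, and REAL values are read here as Double:

    -- A sketch using HDBC: connectSqlite3 opens the database file,
    -- quickQuery' runs a SELECT, and fromSql converts the SqlValue
    -- results into ordinary Haskell types.
    import Database.HDBC            (quickQuery', fromSql, disconnect)
    import Database.HDBC.Sqlite3    (connectSqlite3)

    main :: IO ()
    main = do
      conn <- connectSqlite3 "earthquakes.db"   -- database file name is an assumption
      rows <- quickQuery' conn "SELECT mag, place FROM earthquakes LIMIT 5" []
      -- Each row comes back as a [SqlValue]; convert the two columns we asked for.
      let readable = [ (fromSql m :: Double, fromSql p :: String) | [m, p] <- rows ]
      mapM_ print readable
      disconnect conn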

Slices of data

This section will be an overview of the versatility of the SELECT query in SQLite3. Most of the content of this section will pertain to the inner workings of the SELECT query, and not the Haskell language itself. We're going to get an understanding of the following SELECT clauses: WHERE, ORDER BY, and LIMIT. Each of these has its own utility, but when these clauses work together, you can quickly see how data can be sliced into workable chunks that can be studied later. This is my preferred way of working with data: we let SQL do the slicing and Haskell do the dicing. Once we have a data slice that we're happy with, we'll spend some time at the end looking at how to parse that chunk of data into something usable by Haskell.
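To make the three clauses concrete, here is the general shape of query this section builds toward, written against the earthquakes table from earlier. The column names time, mag, and place are assumptions based on the USGS feed, and mag is assumed to have numeric affinity (if it was imported straight from CSV as TEXT, wrap it in CAST(mag AS REAL)):

    SELECT time, mag, place
    FROM earthquakes
    WHERE mag >= 4.0      -- keep only earthquakes of magnitude 4.0 or greater
    ORDER BY mag DESC     -- strongest quakes first
    LIMIT 10;             -- slice off just the top ten rows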

Let's go back to our Jupyter Notebook; we will flip over to our primary Earthquakes notebook. We will begin with...

Working with SQLite3 and descriptive statistics

So far, we've seen how to generate an SQLite3 database using the sqlite3 command-line utility, and how to interact with that database from Haskell. This section combines the knowledge of descriptive statistics from Chapter 1, Descriptive Statistics, with our database work in Chapter 2, SQLite3. First, we will create our descriptive statistics module from functions found in our Baseball notebook. Second, we will slice up some data using SELECT queries. Third, we will pass that data to our descriptive statistics functions and discuss the results. We'll be looking at earthquakes, specifically in the region of Oklahoma. So, let's glide over to our virtual machine running our IHaskell Notebook, and what I would like to do is discuss how...
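As a rough sketch of where this section ends up, the following again assumes HDBC and HDBC-sqlite3; the mean function here merely stands in for the descriptive statistics module built from the Baseball notebook, whose actual names and contents may differ:

    -- A sketch: pull Oklahoma magnitudes out of SQLite3 and summarize them.
    import Database.HDBC            (quickQuery', fromSql, disconnect)
    import Database.HDBC.Sqlite3    (connectSqlite3)

    -- Stand-in for the descriptive statistics module built in Chapter 1.
    mean :: [Double] -> Double
    mean xs = sum xs / fromIntegral (length xs)

    main :: IO ()
    main = do
      conn <- connectSqlite3 "earthquakes.db"
      rows <- quickQuery' conn
                "SELECT mag FROM earthquakes WHERE place LIKE '%Oklahoma%'" []
      let mags = [ fromSql m :: Double | [m] <- rows ]
      putStrLn $ "Oklahoma earthquakes: " ++ show (length mags)
      putStrLn $ "Mean magnitude:       " ++ show (mean mags)
      disconnect conn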

Summary

In this chapter, we installed the sqlite3 command-line utility and installed the necessary SQLite3 libraries for working with data in the IHaskell environment. Most data doesn't come in SQLite3 format, but in CSV format. So, we covered how to convert a CSV file into an SQLite3 table. We explored the versatility of SELECT queries in SQLite3 by means of the WHERE clause, the ORDER BY clause, and the LIMIT clause. We also explored how to create our own custom module of descriptive statistics, and then we used that module in order to study earthquake data in the IHaskell environment. In our next chapter, we're going to take a look at regular expressions in Haskell.
