You're reading from The Applied SQL Data Analytics Workshop - Second Edition

Product typeBook

Published inFeb 2020

Reading LevelIntermediate

PublisherPackt

ISBN-139781800203679

Edition2nd Edition

Languages

SQL

Tools

Anaconda PostgreSQL

Concepts

Data Analysis

Authors (3):

Matt Goldwasser

Upom Malik

Benjamin Johnston

View More author details

Relational Databases and SQL

A relational database is a database that utilizes the relational model of data. The relational model, invented by Edgar F. Codd in 1970, organizes data as relations, or sets of tuples. Each tuple consists of a series of attributes that generally describe the tuple. For example, we could imagine a customer relationship where each tuple represents a customer. Each tuple would then have attributes describing a single customer, giving information such as the last name, first name, and age, perhaps in the format (Smith, John, 27). One or more of the attributes is used to uniquely identify a tuple in a relation and is called the relational key. The relational model then allows logical operations to be performed between relations.

In a relational database, relations are usually implemented as tables, as in an Excel spreadsheet. Each row of the table is a tuple, and the attributes are represented as columns of the table. While not technically required, most tables in a relational database have a column referred to as the primary key, which uniquely identifies a row of the database. Every column also has a data type, which describes the type of data in the column. Tables are then usually collected together in common collections in databases called schemas. These tables usually are loaded with processes known as Extract, Transform, Load jobs (or ETL for short).

Tables are usually referred to in queries in the format [schema].[table]. For example, a products table in the analytics schema would be generally referred to as analytics.product. However, there is also a special schema called the public schema. This is a default schema where, if you do not explicitly mention a schema, then the database uses the public schema; for example, the public.products table and the products table are the same.

The software used to manage relational databases on a computer is referred to as a relational database management system (RDBMS). SQL is the language utilized by users of an RDBMS to access and interact with a relational database.

Note

Virtually all relational databases that use SQL deviate from the relational model in some basic way. For example, not every table has a specified relational key. Additionally, a relational model does not technically allow for duplicate rows, but you can have duplicate rows in a relational database. These differences are minor and will not matter for the vast majority of readers of this book.

Advantages and Disadvantages of SQL Databases

Since the release of Oracle Database in 1979, SQL has become an industry standard for data in nearly all computer applications—and for good reason. SQL databases provide a ton of advantages that make it the de facto choice for many applications:

Intuitive: Relations represented as tables are a common data structure that almost everyone understands. As such, working with and reasoning about relational databases is much easier than doing so with other models.
Efficient: Using a technique known as normalization, relational databases allow the representation of data without unnecessarily repeating it. As such, relational databases can represent large amounts of information while utilizing less space. This reduced storage footprint also allows the database to reduce operation costs, making well-designed relational databases quick to process.
Declarative: SQL is a declarative language, meaning that when you write code, you only need to tell the computer what data you want, and the RDBMS takes care of determining how to execute the SQL code. You never have to worry about telling the computer how to access and pull data in the table.
Robust: Most popular SQL databases have a property known as atomicity, consistency, isolation, and durability (ACID) compliance, which guarantees the validity of the data, even if the hardware fails.

That said, there are still some downsides to SQL databases, which are as follows:

Relatively Lower Specificity: While SQL is declarative, its functionality can often be limited to what has already been programmed into it. Although most popular RDBMS software is updated constantly with new functionality being built all the time, it can be difficult to process and work with data structures and algorithms that are not programmed into an RBDMS.
Limited Scalability: SQL databases are incredibly robust, but this robustness comes at a cost. As the amount of information, you have doubles, the cost of resources more than doubles. When very large volumes of information are involved, other data stores such as NoSQL databases may actually be better.
Object-Relation Mismatch Impedance: While tables are a very intuitive data structure, they are not necessarily the best format for representing objects in a computer. This primarily occurs because objects often have attributes that have many-to-many relationships. For instance, a customer for a company may own multiple products, but each product may have multiple customers. For an object in a computer, we could easily represent this as a list attribute under the customer object. However, in a normalized database, a customer's products would potentially have to be represented using three different tables, each of which must be updated for every new purchase, recall, and return.

The rest of the page is locked

You have been reading a chapter from

The Applied SQL Data Analytics Workshop - Second Edition

Published in: Feb 2020Publisher: PacktISBN-13: 9781800203679

Authors (3)

Matt Goldwasser

Matt Goldwasser is the Head of Applied Data Science at the T. Rowe Price NYC Technology Development Center. Prior to his current role, Matt was a data science manager at OnDeck, and prior to that, he was an analyst at Millennium Management. Matt holds a bachelor of science in mechanical and aerospace engineering from Cornell University.
Read more about Matt Goldwasser

Upom Malik

Upom Malik is a data science and analytics leader who has worked in the technology industry for over 8 years. He has a master's degree in chemical engineering from Cornell University and a bachelor's degree in biochemistry from Duke University. As a data scientist, Upom has overseen efforts across machine learning, experimentation, and analytics at various companies across the United States. He uses SQL and other tools to solve interesting challenges in finance, energy, and consumer technology. Outside of work, he likes to read, hike the trails of the Northeastern United States, and savor ramen bowls from around the world.
Read more about Upom Malik

Benjamin Johnston

Benjamin Johnston is a senior data scientist for one of the world's leading data-driven MedTech companies and is involved in the development of innovative digital solutions throughout the entire product development pathway, from problem definition to solution research and development, through to final deployment. He is currently completing his Ph.D. in machine learning, specializing in image processing and deep convolutional neural networks. He has more than 10 years of experience in medical device design and development, working in a variety of technical roles, and holds first-class honors bachelor's degrees in both engineering and medical science from the University of Sydney, Australia.
Read more about Benjamin Johnston

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages

You're reading from The Applied SQL Data Analytics Workshop - Second Edition

Relational Databases and SQL

Advantages and Disadvantages of SQL Databases

Unlock this book and the full library FREE for 7 days

Authors (3)

Et al.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Mastering Tableau 2023

Building AI Applications with ChatGPT APIs

Building AI Applications with ChatGPT APIs

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

Modern Data Architecture on AWS

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

TinyML Cookbook