Building a Web Application with PHP and MariaDB: A Reference Guide

By Sai Srinivas Sriparasa
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Free Chapter
    CRUD Operations, Sorting, Filtering, and Joins
About this book

Starting with a quick refresher of the PHP language and MariaDB database, readers will explore concepts such as unit testing, session authentication and management, permissions engine, caching, security, and performance optimization. Building a Web Application with PHP and MariaDB: A Reference Guide begins with basic and advanced programming techniques in both PHP and MariaDB, followed by specialized operations such as working with files and directories. Next, you will be introduced to the concept of REST, and how principles of REST are applied to host XML and JSON feeds for others to consume. This book will show readers how to build a web application that will be an online book store. This would leverage a logging system that keeps a track of the activity that is going on in the application.

This book will help a beginner to learn the basics of object-oriented programming with PHP and help a seasoned expert to understand the intricacies of securing and speeding up their web applications.

Publication date:
June 2014
Publisher
Packt
Pages
200
ISBN
9781783981625

 

Chapter 1. CRUD Operations, Sorting, Filtering, and Joins

Data storage and management have been a very powerful trait for a long time, and as a server-side web developer, it is of paramount importance to have a thorough understanding of the available data storage options. The data that we might be dealing with could be user information, company data, order data, product data, or personal data, and so on. Data in its raw form needs to be processed, cleared, and organized to generate information. Text files and spreadsheets can be used by web applications for storing data but, as the amount of data grows in size, it becomes very hard to store all the data in a single file, as the burgeoning size takes its toll on the speed of retrieval, insertion, and constant updates to the file. Numerous websites store the users' access information in daily or weekly logfiles in the text format, which ends up with a large number of logfiles. The common problem with data storage in this fashion is conserving the data integrity, an example being the process of weeding out duplicate records when data spanned across multiple files becomes cumbersome. A few other problems with data storage in files is the process of managing updates to the file, logging the information about what the updates were or who made them, and applying the necessary file locks when multiple users access and update files at the same time. These are a few reasons why there has always been a need to look for other data storage and management solutions.

An alternate data storage solution, the method that we will rely upon for the most part of this book, is to store the data in a database. A database is an integrated collection of data, and the rules pertaining to that data. A database relies upon a database management system to store the data in an organized manner, to implement the rules that guard the data, and to make the operations such as data retrieval, data modification, and data management simple.

A Database Management System (DBMS) is a software or a collection of programs that manage a single database or multiple databases, and provide critical functionality for data administration, data access, and efficient data security. An example of a database management system is a bookshelf, which is an enclosed space that can be used for storing books in an organized manner. There are multiple vendors who provide different database management systems and we will focus on MariaDB.

Continuing with the bookshelf example, the content of a book is divided into chapters; similarly, the data in a database is stored in tables. A table can be described as the fundamental building block of the database. Data can only be stored inside a table, if there are no tables in the database; the database is devoid of data. Every table is identified by a unique name, meaning that the same database cannot have two tables with the same name. The data in a table is stored and is represented in a two-dimensional format as rows and columns. MariaDB is a RDBMS and follows the theory of relational-models proposed by Edgar F Codd. The term relational is applied in two ways, the first is the relation between one or more tables in the same database and the second is the relationship between the columns within a table.

Tables carry certain characteristics and are built based on a specific structure (or a layout) that defines how the data will be stored. These characteristics are a unique name for the column and the type of data that will be stored in the column. A row would store the smallest unit of information that can be stored in a table and each column in the table will store a piece of relevant data for a single record. We can have a table with all our users' data, a table with all our orders information, and a table with all our product information. Here, each row in the users table would represent a user record, each row in the orders table would represent an order record, and each row in the products table would represent a product record. In the users table, the columns could be username, address, city, state, and zip code; all these columns provide certain data about the user. Each column is associated with a datatype that defines the type of data that can be stored in the column. Datatypes restrict the type of data that can be stored in a column, which allows for a more efficient storage of data. Based on the type of data that is expected to be stored, datatypes can be broadly categorized into numeric, string, and date-time datatypes.

 

String datatypes


Let us look at the following main datatypes:

Datatype

Explanation

Comments

CHAR(L)

This stores a fixed-length string between 0 and 255 bytes.

Trailing spaces are removed.

VARCHAR(L)

This stores a variable-length string between 0 and 65,535 characters.

65,535 is the effective maximum row size for table.

TEXT

This stores character data and the maximum length of a text column is 65,535 characters.

Length need not be specified.

TINYTEXT

This stores the text column with a maximum length of 255 characters.

 

MEDIUMTEXT

This stores the text column with a maximum length of 16,777,215 characters.

 

LONGTEXT

This stores the text column with a maximum length of 4,294,967,295 characters.

 

BLOB

This stores binary data and the maximum length of a text column is 65,535 bytes.

Binary Large Objects are used to store binary data such as images.

TINYBLOB

The BLOB datatype column with a maximum length of 255 bytes.

 

MEDIUMBLOB

This stores the text column with a maximum length of 16,777,215 bytes.

 

LONGBLOB

This stores the text column with a maximum length of 4,294,967,295 bytes.

 

ENUM

This provides a list of strings from which a value can be chosen.

A list of 65,535 values can be inserted into the ENUM datatype.

SET

This is similar to the ENUM datatype. It provides a list of strings from which zero or more values can be chosen.

Can have a maximum of 64 distinct values.

 

Number datatypes


Let us now look at the following main number datatypes:

Datatype

Explanation

Comments

tinyint

This stores integer values.

-128 to 127, Signed

0 to 255, Unsigned

Smallint

This stores integer values.

-32768 to 32767, Signed

0 to 65535, Unsigned

Mediumint

This stores integer values.

-8388608 to 8388607, Signed

0 – 16777215, Unsigned

int(l)

This stores integer values and takes the size of the number.

-2147483648 to 2147483647, Signed

0 – 4294967295, Unsigned

Bigint

This stores integer values.

-9223372036854775808 to 9223372036854775807, Signed

0 to 18446744073709551615, Unsigned

Float(l,d)

This stores floating point numbers and allows us to define the display length (l) and the number of digits after the decimal point (d). The default values for l, d are 10 and 2, respectively.

This uses 4-byte single precision and can display from 0 to 23 digits after the decimal.

Double(l,d)

This is similar to FLOAT, and uses 8-byte double precision. The default values for l, d are 16 and 4, respectively.

The DOUBLE datatype can display from 24 to 53 results. Both the FLOAT and DOUBLE datatypes are commonly used for storing the results from scientific calculations.

decimal(l,d)

This stores the exact numeric data values and allows us to define the display length (l) and the number of digits after decimal point (d).

This is used for precision mathematics that deals with extremely accurate results. The DECIMAL datatype is commonly used to store monetary data.

 

Date datatypes


Let us now look at the following main date datatypes:

Datatype

Explanation

Comments

Date

This stores the date in YYYY-MM-DD format.

The supported range is from 1000-01-01 to 9999-12-31.

Time

This stores the time in HHH:MM:SS format.

The supported range is from -838:59:59 to 838:59:59.

datetime

This stores both the date and time in YYYY-MM-DD HH:MM:SS format.

The supported range is from 1000-01-01 00:00:00 to 9999-12-31 23:59:59.

Timestamp

This stores both the date and time.

The supported range is from 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC.

year (L)

This stores the year in either a 2-digit or a 4-digit format. The length of the year can be specified during declaration. The default is a 4-digit year.

The supported range for a 4-digit year is from 1901 to 2155.

Now that we have discussed the available datatypes for building columns, we will use SQL to build our first table. Structured Query Language (SQL) is a multipurpose programming language that allows us to communicate with the database management system to manage and perform operations on the data. SQL operations can be divided into three groups: Data Definition Language (DDL), Data Manipulation Language (DML), and Data Control Language (DCL). These three groups are explained in the following table:

Groups

Explanation

Operations

DDL

Data Definition Language can be used to create a table or alter the structure of a table once it is built, drop the table if it is deemed to be unnecessary, and to perform operations such as truncating the data in a table and creating and dropping indexes on columns.

  • CREATE

  • ALTER

  • DROP

  • TRUNCATE

  • RENAME

DML

Data Manipulation Language is used to perform insert, update, delete, and select operations on the data.

  • SELECT

  • INSERT

  • UPDATE

  • DELETE

  • CALL

  • REPLACE

  • LOAD DATA INFILE

DCL

Data Control Language is used for managing the access to the data. DCL can be used to work with MariaDB's complex security model.

  • GRANT

  • REVOKE

Other administration and utility statements

Other SQL commands that are often used but do not come under DDL, DML, or DCL.

  • EXPLAIN

  • SHOW

  • DESCRIBE

  • HELP

  • USE

Now that we have discussed the basics of Database Management System and SQL, let us connect to our MariaDB server. MariaDB is shipped with a few pre-existing databases that are used by MariaDB itself to store metadata such as information about databases, tables, columns, users, privileges, logs, and so on (yes, MariaDB stores its own data in MariaDB tables).

Tip

For more information about the installation procedures for PHP, MariaDB, and Apache, please refer to the Bonus chapter 1, Installation of PHP, MariaDB, and Apache present online on the Packt Publishing website.

As we have installed MariaDB and have root access to the server, we will be able to view all this metadata information. To retrieve the metadata information that is currently on MariaDB, we will use the SHOW utility command and, as we are interested in retrieving the list of existing databases, we will append DATABASES to our SHOW command:

Note

SQL commands are case-insensitive, so the case of the SQL command does not matter.

Upon executing the show databases; command, the list of existing databases will be outputted to the screen. These databases are reserved to store configurations and necessary metadata (yes, MariaDB stores its data on MariaDB itself), so it is advised to avoid using these databases for storing other data. For storing other data, we will have to create our own database. We will use the SQL commands that are part of DDL to create new databases. For creating a new database, the CREATE DDL command is appended with DATABASE and then the name of the database to be created is added. Let us create a simple course registry database that keeps a track of student records, the available courses, and the courses for which the students have registered.

Note

MariaDB is very particular about statement terminators, a semicolon ; is the default statement terminator and, unless the statement terminator is given, the SQL command is not executed.

We have successfully created our first database. To verify that we have created this database, let us run the show databases; command one more time to see if our new database is reflected in the list of existing databases:

Now that we have verified that our new database is available in the list of existing databases, let us access the database and build tables in the course_registry database. For accessing a database, we will utilize the USE utility command. The USE command has to be followed with the name of an existing database to avoid an error, once this command has been executed.

Now that the database has been successfully changed, note that the database name reflects in between the brackets next to MariaDB, which denotes the current database.

Tip

Another way of finding the current database is to use the select database(); statement and print it out to the console; if the output is null, this means that no database has been selected.

Now that we have chosen the course_registry database, let us take a brief look at the data that has to be housed in this database. The course_registry database keeps a track of student records, the available courses, and the courses for which the students have registered. We could do this by putting the students and the courses that they have registered for in a single table. However, the problems with this approach, similar to a spreadsheet, are twofold. The first problem is that the student information would keep repeating when a student registers for multiple courses, thereby causing unnecessary redundancy.

The second problem will be about data inconsistency, assuming that the student information was wrong. Either we will be using this erroneous information another time, or we might be employing another data entry process that allows the user to enter different data as user information, which causes data inconsistency. To avoid this, we are splitting our data into three tables; they are students, courses, and students_courses.

The student records will be stored in the students table, the data about the available courses will be stored in the courses table, and the data about the courses that the students have registered for will be stored in the students_courses table. The students_courses table will be an association table that contains common fields from the students and the courses tables. This table can also be referred to as a bridge table, paired table, or cross reference table. By using the students_courses table, we can accommodate a common case where one student can register for more than one course.

Before we begin building our tables, it is always important to understand the type data that will be housed in this table and based on the data that will be housed in that table, we will have to decide on the column names and the datatypes for those columns. Column names have to be intuitive in order to help others such as system administrators, auditors, and fellow developers to easily understand the kind of data that can be or is currently being stored in those columns, and the respective datatypes of those columns will explain the type that can be housed in a column. Let us begin with our students table.

 

The students table


Let us take a look at the following fields in the table and what work they perform:

Column name

Datatype

Comments

student_id

Int

This stores the unique identifier for a student

first_name

Varchar(60)

This stores the first name of the student

last_name

Varchar(60)

This stores the last name of the student

address

Varchar(255)

This stores the address of the student

city

Varchar(45)

This stores the name of the city

state

Char(2)

This stores the two letter abbreviation for states in the United States

zip_code

Char(5)

This stores the five digit zip code for an address in the United States

Note

It is advised to use a character datatype for fields such as zip codes or SSNs. Though the data is going to be a number, integer datatypes are notorious for removing preceding zeroes, so if there is a zip code that starts with a zero, such as 06909, of an integer datatype, the zip code would end up in the column as 6909.

Now let us convert this table structure into executable SQL, to create our table, we will be using the CREATE DDL command, followed by TABLE and then append it with the table structure. In SQL, the column description is done by mentioning the column name first and then adding the datatype of the column. The STUDENTS table has multiple columns, and the column information has to be separated by a comma (,).

Now that the query has been executed, the students table has been created. To verify if the students table has been successfully built, and to view a list of existing tables that are in the current database, we can use the SHOW utility command and append that with TABLES:

We have successfully used the show tables; command SQL statement to retrieve a list of existing tables, and have verified that our students table exists in our course_registry database. Now, let us verify if our students table has the same table structure as we originally intended it to have. We will use the DESCRIBE utility command followed by the table name to understand the table structure:

Tip

The DESCRIBE and DESC commands can be used interchangeably, both the commands would need the table name to return their structure.

Now let us move on to the courses table, this table will house all the available courses for which a student can register. The courses table will contain a unique identifier for the course (course_id), the name of the course (course_name), and a brief description of the course (course_description).

The courses table

Let us now look at the fields and the type of values they are storing:

Column name

Datatype

Comments

course_id

int

This stores the unique identifier for a course.

name

varchar(60)

This stores the title of the course.

description

varchar(255)

This stores the description of a course.

Now let us convert this table structure into executable SQL to create our courses table:

Now that the query has been executed, let us run the SHOW TABLES command to verify if the courses table has been created:

The output from the SHOW TABLES command returns the list of current tables, and the courses table is one of them. Now that we have built the students table and the courses table, let us build the bridge table that would hold the association between the two tables. This table would contain the data about the students who were enrolled to a particular course.

The students_courses table

Let us now look at the fields in this table and their respective values:

Column name

Datatype

Comments

course_id

int

This stores the unique identifier for a course

student_id

int

This stores the unique identifier for a student

Now, let us convert this table structure into executable SQL, to create our courses table using the following command:

Now that the query has been executed, let us run the SHOW TABLES command to verify if the courses table has been created:

The output from the SHOW TABLES command returns the list of current tables, and the students_courses table is one of them.

 

Inserting data


Now that we have built our tables, it is time to insert records into the tables. Let us look at a few different methods for inserting a single row of data and inserting multiple rows of data. For insertion of data into a table, we will use the INSERT DML command, and supply the table name and the values for the available columns in the table. Let us begin by inserting student records into the students table:

In this example, we insert a new student record into the students table; we are supplying the data for that student record in the VALUES clause. This syntax, though it appears to be very simple, it is not a very safe method of inserting data. This INSERT statement is depending upon the order in which the columns were defined in the table structure, so the data in the VALUES clause will be mapped by position, 1 would go into the first column in the table, though it is intended to go into the student_id column. If the students table is rebuilt locally or on a different machine, there is no guarantee that the column order would remain the same as the order on the current MariaDB database server. The other approach that is considered safer when compared to this one is the INSERT statement, where the column names are explicitly mentioned in the SQL:

Though this might be a bit longer, this would guarantee that data that is being passed in via the VALUES clause is going into the right column. By using this INSERT syntax, the order in which the columns are mentioned is no longer important. When this query is executed, MariaDB matches each item in the columns list with its respective value in the VALUES list by position. This syntax can also be used for the case where the data is only available for a few columns. Let us come up with an INSERT statement that has data for a few columns and uses NULL for a column that does not have any data:

Note

In SQL, the term NULL is used to denote that a value does not exist.

In this example, we are inserting a student record whose address is not known, so we are using NULL to populate the column.

Note

Columns by default allow NULL values to be populated, unless it is explicitly mentioned not to allow NULL values.

Now that we have seen the different insertion syntaxes for inserting a single record row, let us take a step forward and look at how multiple records can be inserted. There are two ways of inserting multiple records into a table, the first method is where INSERT statements are created for each row, and are separated by the statement terminator (;):

The other way of inserting multiple records is by using a single VALUES clause while passing in multiple records, separating each record with a comma (,), and adding a statement terminator at the end of the last record:

Tip

We are currently not using any constraints to maintain any referential integrity among tables, so any integers can be inserted into the students_courses table. To allow only existing student IDs and course IDs to be inserted, we will have to use the primary key and foreign key constraints. We will be covering constraints in the next chapter.

In this example, we are inserting multiple records into the students_courses table. On execution of this SQL query, the first statement inserts an associative record into the students_courses table and the value for the column student_id is 1, which maps back to the student record of John Doe, and the value for course_id is 1 that corresponds to the course record CS-101. The inline comments at the end of each statement are used to describe the data that is being inserted via this statement. Though these comments are added to the INSERT statements, they are only intended to explain the purpose of the statements and will not be processed by MariaDB.

Note

MariaDB also supports multi-line comments. Syntax for creating multi-line comments is by using /* to start the comment and ending the comment with */.

/* multiple line

comments

go

here*/

The last method of insertion that we are skipping for now is to insert the data that has been retrieved on the fly from a table. We will be looking at that once we have covered the methods for retrieving data and filtering data.

 

Retrieving data


Now that we have inserted data into the students, courses, and students_courses tables, let us look at the different mechanisms of retrieving data, we will be using the SELECT command to retrieve the data. The SELECT statement would expect two things as a minimum, the first would be what to retrieve and the second would be where to retrieve it. The simplest SELECT command would be to retrieve all the student records from the students table:

In this query, we are using * to retrieve the data for all the columns from the students table, this is not a preferred method of retrieving data. The preferred method for data retrieval is by mentioning the individual columns separated by a comma (,) after the SELECT clause:

In this query, we are selecting the student_id, first_name, and last_name columns from the students table. As we are not filtering the data yet, SELECT statements would return every student record that is in the students table. We can use the LIMIT clause to retrieve a certain number of records:

In this query, we are retrieving the data from the students table and we are retrieving the student_id, first_name, and last_name columns; however, rather than retrieving all the rows, we are only retrieving a single row. To retrieve the next row, we could still use the limit, but we would use LIMIT clause accompanied by the OFFSET clause. The OFFSET clause determines the starting point as to where the records should start from, while the LIMIT clause determines the number of records that would be retrieved.

 

Sorting data


Now that we have looked at different techniques of retrieving the data, let us look at how the data can be represented in a more ordered way. When we execute a SELECT statement, the data is retrieved in the order in which it exists in the database. This would be the order in which the data is stored; therefore, it is not a good idea to depend upon MariaDB's default sorting. MariaDB provides an explicit mechanism for sorting data; we can use the ORDER BY clause with the SELECT statement and sort the data as needed. To understand how sorting can be of help, let us begin by querying the students table and only retrieving the first_name column:

In the first example, we are going by MariaDB's default sort, and this would give us the data that is being returned based on the order of the insert:

In this example, we are ordering the data based on the first_name column. The ORDER BY clause by default sorts in ascending order, so the data would be sorted in an ascending alphabetical order and if the first character of one or more strings is the same, then the data is sorted by the second character, which is why Jane comes before John. To explicitly mention the sort order as ascending, we can use the keyword asc after the column name:

In this example, we are again ordering the data based on the first_name column and the ORDER BY clause has been supplied with desc, we are setting the sort direction to descending, which denotes that the data has been sorted in a descending order. MariaDB also provides a multi-column sort, which is a sort within a sort. To perform a multi-column sort, we would specify the column names after the ORDER BY clause separated by comma (,). The way the multi-column works is, the data would be first sorted by the first column that is mentioned in the ORDER BY clause, and then the dataset that has already been sorted by the first column is again sorted by the next column and the data is returned back. As a muti-column sort performs sorting on multiple levels, the order of columns will determine the way the data is ordered. To perform this example, let us insert another row with the student name John Dane and the student ID being 4, the reason for using John Dane is to make sure that there are more than one students that share the first name of John (John Doe and John Dane) and the last name of Dane (Jane Dane and John Dane) exclusively:

In this example, we are retrieving the last_name and first_name columns from the students table and are first ordering the data by "last_name" and then reordering the previously ordered dataset by first_name. We are not restricted by the ORDER BY clause to use only the columns being used for the sort. This will only help us sort the data in the correct direction.

 

Filtering data


Until now, we have dealt with data retrieval where all the data in the students table is being retrieved, but seldom do we need all that data. We have used the LIMIT and OFFSET clauses that have allowed us to limit the amount of data were retrieved. Now let us use MariaDB's filtering mechanism to retrieve the data by supplying search criteria. To perform a search in a SQL statement, we will use the WHERE clause. The WHERE clause can be used with the SELECT statement, or it can be even used with the UPDATE and DELETE statements, which will be discussed in the next section:

In the preceding example, we are selecting the students' records whose last_name is Dane.

In the preceding example, we are selecting the students' records whose student_id is 1.

In the preceding example, we are selecting the students' records whose student_id is greater than 1.

In the preceding example, we are selecting the students' records whose student_id is less than 4.

In the preceding example, we are selecting the students' records whose student_id is between 1 and 4, the between clause is inclusive, so the records with student_id 1 and 4 are also retrieved. The following table lists the common operators that can be used for data filtering:

Operator

Explanation

Comment

=

Filters and returns data where the criterion has an exact match.

 

!=

Filters and returns data where the criterion doesn't have an exact match.

 

<>

Filters and returns data where the criterion doesn't have an exact match.

This is same as above, based on preference, either notations can be used for inequality.

>

Filters and returns data where the data is greater than the value in the criterion.

 

>=

Filters and returns data where the data is greater than or equal to the value in the criterion.

 

<

Filters and returns data where the data is lesser than the value in the criterion.

 

<=

Filters and returns data where the data is lesser than or equal to the criterion.

 

IS NULL

Filters and returns the rows where the specified column has no data.

 

IS NOT NULL

Filters and returns the rows where the specified column has some data.

 

BETWEEN

Filters and returns data where the data is part of the specified range.

This uses the keywords BETWEEN, and AND.

Data can also be filtered by utilizing multiple search criteria by using the AND and OR operators, by employing multiple column search criteria, by using wildcard filtering, by using the IN operator, and so on. As this chapter will only deal with basic filtering, we will not be covering these advanced filtering concepts. The basic filtering in this chapter can be used as a foundation to delve deeper into understanding the advanced concepts of filtering.

 

Updating data


Until now, we have worked with the creation of databases, tables, data, and retrieval of data. Now let us go over the process of updating data, once the data has been added to the table, there will be different cases where the data has to be updated, such as a typo while adding the student's name, or if the student's address changes after they have registered for the course, and so on. We will use the UPDATE DML statement to modify the data. The UPDATE statement requires a minimum of three details, the first is the name of the table on which this operation will be performed, the second is the name of the column, and the third is the value that the column to has to be assigned to. We can also use the UPDATE statement to modify more than one column at a time. There are two cases where the UPDATE statement can be used. The first case is where all the records in the table will be updated, and this has to be done very carefully as this could cause the loss of existing data. The second scenario when using the UPDATE statement is in combination with the WHERE clause. By using the WHERE clause, we are targeting a very specific set of records based on the filter criteria.

Tip

It is recommended to execute the filter criteria with a SELECT statement, so that we can verify the dataset on which our UPDATE statement would run, in order to make any required changes if the filter criterion does not reflect the expected results. Another way of handling such scenarios is to use a transaction, which will allow us to rollback any changes that we have made.

In the preceding example, we have updated John Dane's current city to Nebraska by using his student ID. We can also verify this by looking at the output on the query console, it returns that the filter criterion was matched for one row, and the update statement was applied for that one row.

 

Deleting data


We will use the DELETE DML statement for deletion of data. The DELETE statement at a minimum expects the name of the table. Similar to the UPDATE statement, it is recommended that the DELETE statement is always used with filter criteria to avoid loss of data.

The DELETE statement should be used when a record has to be permanently removed from the table.

Note

To avoid permanent loss or deletion of data Boolean flags are used to determine if a record is active or inactive (1 or 0). These are called soft deletes and help us retain data in the long run.

In the preceding example, we are deleting the records from the students table that match the criterion of student_id equal to 4. As there is only one record that matches that criterion, that record has been deleted. The recommendations that were made above about how to use the filter criterion apply for the DELETE statement too.

 

Joins


Until now, we have coupled our SELECT statements with various filtering and sorting techniques to query the student information extensively. As we are operating in a relational-model of data storage and since our data is stored in different tables, we are yet to figure out how our SELECT statements can be fired across multiple tables. In our case, this would help us find out what course or courses a student has registered for, or to find our which course has the most number of students. Following the relational-model of data allows us to store data in a more efficient manner, allows us to independently manipulate the data in different tables, and allows for greater scalability; however, querying the data across multiple tables is going to be difficult when compared to retrieving records from a single table. We will use JOINS to associate multiple tables, to retrieve, update, or delete data.

A SQL JOIN is a virtual entity and is performed at run time, during the execution of the SQL statement. Similar to any other SQL statement, the data would only be available during the query execution and is not implicitly persisted to the disk. A SQL JOIN can be coupled with a SELECT statement to retrieve data from multiple tables. Let us go through the most common JOIN: the INNER JOIN, a join based on the equality comparison on the join-predicate.

Let us look at a few examples that perform SQL INNER JOIN between two or more tables:

In the preceding example, we are joining the students and students_courses tables to retrieve a list of all the students who have registered for a course. This is similar to the SELECT statements that we worked with earlier; a big difference is that we can now add a column that is part of a different table. We use the INNER JOIN clause to build the association between students and the students_courses table where the values for student_id in the students table exist in the students_courses table; this is referred to as the join-predicate. Now let us join all the tables and retrieve the names of the courses for which each student has registered.

Note

In this example, we are creating an alias name for the name column in the courses table. We are using the AS statement to explicitly create a temporary alias to make the column name more intuitive. We can build aliases for tables in a similar manner

In the previous example, we have joined the three tables that are available in our course_registry database and are now able to retrieve the list of courses for which the students have registered. Similar to our previous SELECT statements, let us add a filter criterion to narrow down our search:

In the preceding example, we are filtering the data by student_id and are searching for records with student_id equal to 2. We have discussed the most commonly used form JOIN statement, which is the INNER JOIN or the equi-join. There are other types of JOIN in SQL that are supported by MariaDB such as OUTER JOIN, SELF JOIN, and NATURAL JOIN, we will be skipping these JOIN statements.

 

Summary


In this chapter, we have covered the basics of relational database management systems with MariaDB. We began by building our first database, and performed Create, Read, Update, and Delete (CRUD) operations. We used the SQL SELECT statement to retrieve data and used the ORDER BY and WHERE statements to sort and filter the data respectively. Later, we moved on to use the UPDATE and DELETE statements to modify and remove data respectively. Finally, we used the INNER JOIN to retrieve data from multiple tables and coupled that with the WHERE statement to filter that data.

In the next chapter, we will be going over more advanced topics such as creating calculated fields and building complex views, stored procedures, functions, and triggers.

About the Author
  • Sai Srinivas Sriparasa

    Sai Srinivas Sriparasa is a web developer and an open source evangelist living in the Stamford area. Sai was the lead developer involved in building the Dr. Ozs website and has led teams for companies such as Sprint Nextel, West Interactive, and Apple. Sai's repertoire includes JavaScript, PHP, Python, HTML5, responsive web development, ASP.NET, C#, and Silverlight.

    Browse publications by this author
Latest Reviews (4 reviews total)
I had no idea I had purchased an ebbok.
Good overview of the process to start work from.
Must be a mandatory reference for any MariaDB+PHP programmer
Building a Web Application with PHP and MariaDB: A Reference Guide
Unlock this book and the full library FREE for 7 days
Start now