You're reading from  Advanced MySQL 8

Product type: Book
Published in: Jan 2019
Reading level: Intermediate
ISBN-13: 9781788834445
Edition: 1st Edition
Authors (3):
Eric Vanier

Eric Vanier is one of the few DBAs worldwide specializing in MySQL performance, and is an expert with large architectures such as those for Air Miles, Cogeco, McGill University, Bombardier Aerospace, and many more. As a senior DBA and instructor for 18 years, he has a track record in performance and troubleshooting on dozens of high-profile MySQL systems, including complex cluster environments.

Birju Shah

Birju Shah is the principal architect for Endurance International Group. He has a bachelor's degree in computer engineering from Gujarat University. He has the experience and expertise to build scalable products for hosting domains. He is passionate about the latest architectural patterns, tools, and technologies, and he helps organizations to follow best practices. He is also passionate about technical training and sessions.

Tejaswi Malepati

Tejaswi Malepati is the Cassandra Tech Lead for Target. He has been instrumental in designing and building custom Cassandra integrations, including a web-based SQL interface and data validation frameworks between Oracle and Cassandra. Tejaswi earned a master's degree in computer science from the University of New Mexico, and a bachelor's degree in electronics and communication from Jawaharlal Nehru Technological University in India. He is passionate about identifying and analyzing data patterns in datasets using R, Python, Spark, Cassandra, and MySQL.

Advanced Data Techniques for Large Queries

In the previous chapters, we learned how to use indexes and execution plans, and looked at the best way to analyze a table structure. In this chapter, we will learn how to analyze and optimize large MySQL queries, and look at some tips and techniques.

We all know that a full scan on a huge table will impact your database performance, and we will learn how to avoid it. We will also review the new MySQL 8.0 partition features and improvements.

This chapter will cover the following:

  • The most important variables are full scan indicators
  • Partitioning a table:
    • Overview of partitioning in MySQL 8.0
    • Partitioning types available
    • Horizontally partitioning your data
    • Managing partitions:
      • Option #1: RANGE partitioning
      • Option #2: LIST partitioning
      • Option #3: HASH partitioning
      • Option #4: KEY partitioning
  • Using partitions:
    • Partition pruning...

The most important variables are full-scan indicators

The high season is coming for your business, and you wonder how you will manage the load on your MySQL database.

Stress tests can help you, but it's not a good idea to run them in a production environment. Instead, two important status variables can come to your aid: Select_scan and Select_full_join. Other MySQL counters can also give you an idea of how many queries are likely to degrade performance as the load on your database increases.

The Select_scan variable in the SHOW GLOBAL STATUS report shows the number of full scans performed since the last MySQL restart (every time you restart MySQL, all status counters are reset to 0).
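As a minimal sketch, both counters can be read with standard statements. The second query assumes MySQL 5.7 or later, where the performance_schema.global_status table is available; including Uptime is only an illustrative choice, so that the counters can be related to the time since the last restart.

```sql
-- Read the two full-scan indicators from the global status report.
SHOW GLOBAL STATUS WHERE Variable_name IN ('Select_scan', 'Select_full_join');

-- The same counters, plus the server uptime in seconds, from the Performance Schema.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM performance_schema.global_status
WHERE VARIABLE_NAME IN ('Select_scan', 'Select_full_join', 'Uptime');
```

Sampling these values at regular intervals and comparing the differences gives a rough picture of how quickly full scans accumulate under the current load.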

The Select_full_join variable is another important indicator that we will look...

Partitioning a table

First, let's define what partitioning a table is. Partitioning is how MySQL divides a table's actual data into separate physical pieces (partitions), while the SQL layer always treats it as a single table.

An overview of partitioning in MySQL 8.0

When partitioning, it is critical to choose a good partition key. The searches we run against the table should go to the correct partition or group of partitions. This means that statements such as SELECT, UPDATE, and DELETE should include this column in the WHERE clause so that the partitions are used efficiently.

Generally, the best practice is to add the partition key to the primary key with auto incrementation enabled, that...
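The following is a minimal sketch of this idea, not the chapter's own example. The table events_demo and its columns are hypothetical. It relies on the MySQL rule that every unique key, including the primary key, must contain all columns used in the partitioning expression.

```sql
-- Hypothetical table: the partition key (created) is part of the primary key,
-- and the auto-increment column stays first in that key.
CREATE TABLE events_demo (
    id      BIGINT NOT NULL AUTO_INCREMENT,
    created DATE   NOT NULL,
    payload VARCHAR(255),
    PRIMARY KEY (id, created)
)
PARTITION BY RANGE COLUMNS (created) (
    PARTITION p2018 VALUES LESS THAN ('2019-01-01'),
    PARTITION p2019 VALUES LESS THAN ('2020-01-01'),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- Including the partition key in the WHERE clause lets MySQL
-- restrict the statement to the relevant partition.
UPDATE events_demo
SET payload = 'archived'
WHERE created >= '2018-01-01' AND created < '2019-01-01';
```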

Managing partitions

In this section, we will cover options for managing partitions.

Some options that we will look into are the following:

  • RANGE partitioning
  • LIST partitioning
  • HASH partitioning
  • KEY partitioning

RANGE partitioning

This type of partitioning assigns rows to partitions based on column values that fall within a given range.

Each RANGE partition covers an interval of values. The intervals must not overlap, and they are defined using the VALUES LESS THAN operator.

The basic syntax for creating a partition with RANGE is as follows:

In our following examples, suppose you create a table to store employee records for a pizza chain of 30 restaurants, numbered from 1 to 30.
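A minimal sketch of such a table is shown here. The table name pizza_employees, the column names, and the choice of three partitions of ten restaurants each are all assumptions made for illustration.

```sql
-- Hypothetical employee table, partitioned by restaurant number (1 to 30).
CREATE TABLE pizza_employees (
    id        INT NOT NULL,
    firstname VARCHAR(30),
    lastname  VARCHAR(30),
    hired     DATE NOT NULL,
    store_id  INT NOT NULL
)
PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (11),  -- restaurants 1 to 10
    PARTITION p1 VALUES LESS THAN (21),  -- restaurants 11 to 20
    PARTITION p2 VALUES LESS THAN (31)   -- restaurants 21 to 30
);
```

With this definition, a row whose store_id is 31 or higher is rejected because no partition accepts it. A final partition declared with VALUES LESS THAN MAXVALUE would catch such rows.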

This table can...

Using partitions

One of the most common reasons for using table partitioning is to separate data by date. Some RDBMSes support explicit date partitioning, which MySQL does not have in 8.0. However, it is very easy in MySQL to create partitioning based on DATE, TIME, or DATETIME columns.
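As a minimal sketch, a table can be partitioned on a DATE column by applying the YEAR() function in the partitioning expression. The table orders_by_year and its columns are hypothetical.

```sql
-- Hypothetical table partitioned by the year of a DATE column.
CREATE TABLE orders_by_year (
    order_id   INT NOT NULL,
    order_date DATE NOT NULL,
    amount     DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2017 VALUES LESS THAN (2018),
    PARTITION p2018 VALUES LESS THAN (2019),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```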

Partition pruning

One of the best-known optimizations is partition pruning. The concept is relatively simple and is based on the principle: "Do not scan partitions where there can be no matching values". Let's assume you have a partitioned table called employees, defined by the following statement:

Consider an example where you want to get the results of a SELECT command such as the following:

SELECT firstname, lastname...
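To illustrate pruning with a minimal sketch, assume a hypothetical definition of employees that is partitioned by a region_code column; the column names and partition boundaries are assumptions, not the definition used in the chapter. The partitions column of the EXPLAIN output lists the partitions that MySQL intends to read.

```sql
-- Hypothetical definition, for illustration only.
CREATE TABLE employees (
    id          INT NOT NULL,
    firstname   VARCHAR(30),
    lastname    VARCHAR(30),
    region_code TINYINT UNSIGNED NOT NULL
)
PARTITION BY RANGE (region_code) (
    PARTITION p0 VALUES LESS THAN (64),
    PARTITION p1 VALUES LESS THAN (128),
    PARTITION p2 VALUES LESS THAN (192),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

-- Only values 126 to 129 can match, so only partitions p1 and p2 are candidates.
EXPLAIN
SELECT firstname, lastname
FROM employees
WHERE region_code > 125 AND region_code < 130;
```

In this sketch, the partitions column should list only p1 and p2, because no matching row can live in p0 or p3.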

Getting rid of unused and duplicate indexes

We all know by now that indexes can mean the difference between a high-performance database and a slow one. Over time, indexes need occasional maintenance: you should check whether some of them are no longer useful or are duplicates.

MySQL 5.7.7 introduced a new schema called sys. It contains a set of objects that help DBAs and developers interpret the data collected by the Performance Schema. The objects in the sys schema can be used for typical optimization and diagnostic cases. They include the following:

  • Views that summarize Performance Schema data in a more convenient and easily understandable form
  • Stored procedures and stored functions that execute or query operations, such as configuring the performance schema and generating diagnostic reports, that will help you perform...
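As a minimal sketch, two views of the sys schema can be queried directly to find candidate indexes for cleanup. The schema name my_app_db is a hypothetical placeholder.

```sql
-- Indexes that have not been used since the server started.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = 'my_app_db';  -- hypothetical schema name

-- Indexes made redundant by another index, with a suggested DROP statement.
SELECT table_name, redundant_index_name, dominant_index_name, sql_drop_index
FROM sys.schema_redundant_indexes
WHERE table_schema = 'my_app_db';   -- hypothetical schema name
```

Because the usage data is collected since the last restart, an index that appears unused shortly after a restart may still be needed by queries that have not run yet.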

The most important query optimizations

Query statements are used to retrieve data from a database. We can get the same results by writing different SQL queries, but using the optimal query matters when performance is important to your business, so you need to tune your SQL queries accordingly. In this section, we will cover the most important query optimizations with WHERE, GROUP BY, and ORDER BY.

Optimizing a query with the WHERE clause

Before we start looking at how to optimize a query with the WHERE clause, you need to know the optimization recipe that you have to apply to be efficient.

Here is an optimization recipe:

  1. Identify the query that is slow
  2. Run an EXPLAIN plan on this slow...
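The first two steps of the recipe can be sketched as follows. The one-second threshold and the table customers_demo are assumptions made for illustration.

```sql
-- Step 1: record statements that run longer than one second (assumed threshold).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- the new global value applies to connections opened afterwards

-- Hypothetical table used only for this illustration.
CREATE TABLE customers_demo (
    id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    lastname VARCHAR(50) NOT NULL,
    city     VARCHAR(50) NOT NULL
);

-- Step 2: inspect the execution plan of the suspect query.
EXPLAIN
SELECT id, city
FROM customers_demo
WHERE lastname = 'Smith';
```

In the EXPLAIN output, a type of ALL indicates a full table scan, which is expected here because lastname has no index.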

Temporary tables

In this section, we will talk about temporary tables. We will answer the question "Why does MySQL need to create temporary tables with GROUP BY, RANGE, ORDER BY, and other expressions?"

When using the EXPLAIN plan, you will notice that MySQL can create temporary tables when the query uses the following:

  • GROUP BY
  • RANGE
  • ORDER BY
  • Other expressions

When a temporary table is needed, MySQL will always try to create it in memory on its first attempt. If MySQL cannot create the temporary table in memory, it will create it on disk, which is not desirable as far as performance is concerned.

MySQL uses the following variables in the configuration:

  • tmp_table_size: This determines the maximum size for temporary tables in memory
  • max_heap_table_size: This sets the maximum size for MEMORY tables

You can optimize MySQL to...
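As a minimal sketch, the current settings and the temporary table counters can be inspected with the following statements. Comparing the two counters is an illustrative check, not a rule taken from the chapter.

```sql
-- Current in-memory limits.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('tmp_table_size', 'max_heap_table_size');

-- How many temporary tables were created, and how many of them went to disk.
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Created_tmp_tables', 'Created_tmp_disk_tables');
```

A Created_tmp_disk_tables value that grows quickly relative to Created_tmp_tables suggests that many temporary tables do not fit in memory.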

Case study 1 – an example of how to optimize a complex query

In this case study, we will work through a complex query step by step, as follows:

  1. Format a complex query
  2. Run and analyze an execution plan
  3. Optimize a complex query

When we have a query that runs in production and is complex, we need to format it so that we can have more clarity and can easily optimize it.

Here is our query (not formatted):

Here is our formatted query:

Here is the EXPLAIN plan:

Let's analyze the query together.

The first problem observed from the output of the execution plan is the following:

DEPENDENT SUBQUERY (NOT EXISTS ( SELECT 1 FROM employees_party WHERE fk_employees_id = employees.id AND Important_id BETWEEN 1 AND 4 AND diff_value BETWEEN 1 AND 3));

The solution is to rewrite the DEPENDENT SUBQUERY part as follows, in order to optimize our query:

SELECT E....
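One common way to rewrite a dependent NOT EXISTS subquery is an anti-join using LEFT JOIN with an IS NULL test. The sketch below illustrates that general technique and is not necessarily the rewrite used in this case study. The schema case_study_demo and the column types are assumptions; only the table and column names come from the subquery shown earlier.

```sql
-- Hypothetical schema and minimal table definitions, for illustration only.
CREATE DATABASE IF NOT EXISTS case_study_demo;
USE case_study_demo;

CREATE TABLE employees (
    id INT NOT NULL PRIMARY KEY
);

CREATE TABLE employees_party (
    fk_employees_id INT NOT NULL,
    Important_id    INT NOT NULL,
    diff_value      INT NOT NULL,
    KEY idx_fk_employees (fk_employees_id)
);

-- Anti-join: keep the employees that have no matching employees_party row.
-- The filter conditions belong in the ON clause, not in the WHERE clause.
SELECT E.id
FROM employees AS E
LEFT JOIN employees_party AS EP
       ON EP.fk_employees_id = E.id
      AND EP.Important_id BETWEEN 1 AND 4
      AND EP.diff_value BETWEEN 1 AND 3
WHERE EP.fk_employees_id IS NULL;
```

The IS NULL test is reliable here because fk_employees_id is declared NOT NULL, so a NULL can only come from the absence of a matching row.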

Case study 2 – how to optimize sort indexes

Did you know that performance problems are often caused by queries that combine ORDER BY and LIMIT?

Here's what you need to know about optimizing ORDER BY with LIMIT in MySQL to avoid problems.

ORDER BY combined with LIMIT is the most common use of ORDER BY in web and enterprise applications that sort large datasets.

For example, on many websites, you will sort top tags, users who have recently subscribed, and so on, which requires using ORDER BY and LIMIT in the background on the server.

This type of ORDER BY generally does something like the following:

SELECT ... .. WHERE [conditions] ORDER BY [sort] LIMIT N1, M1

As I often recommend, you need to make sure that your queries use indexes. It is important to have an ORDER BY with LIMIT executed, without having to use a full scan and just as a full...
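As a minimal sketch of this recommendation, a composite index that starts with the filtered column and continues with the sorted column lets MySQL read rows already in the requested order. The table subscribers_demo, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table: the index covers the WHERE column first, then the ORDER BY column.
CREATE TABLE subscribers_demo (
    id            INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    country       CHAR(2)  NOT NULL,
    subscribed_at DATETIME NOT NULL,
    KEY idx_country_subscribed (country, subscribed_at)
);

-- The ten most recent subscribers for one country.
EXPLAIN
SELECT id, subscribed_at
FROM subscribers_demo
WHERE country = 'CA'
ORDER BY subscribed_at DESC
LIMIT 0, 10;
```

When the index is used this way, the Extra column of the EXPLAIN output would typically not show Using filesort, and MySQL can stop reading as soon as it has found the requested number of rows.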

Tips and techniques

The tips and techniques which I am sharing with you come from two decades of improving and optimizing MySQL performance in thousands of situations. These tips and techniques can help you better understand how to make your MySQL servers more efficient.

Partitions

Here are some tips:

  • It is not recommended to use PARTITIONS until you know how it will help you
  • Do not use PARTITIONS unless you have more than one million records
  • Do not use more than 50 PARTITIONS on a table; otherwise, the performance will be impacted
  • RANGE partitioning is, in general, the most useful method for typical needs; it is also easy to manage
  • SUB-PARTITIONS are not necessarily useful, except in specific cases
  • The...

Summary

In this chapter, I tried to demonstrate that partitioning is a solution that can be used when you have exhausted all other alternatives, such as optimizing your queries, configuration, and so on. It is applicable when you have several million records.

We have covered how to clean up your indexes, and how to keep only the indexes that your application needs.

We've looked at how to optimize complex queries, using the proposed optimization recipe that we've discussed throughout.

We also looked at temporary tables: why these tables appear, and how they can be avoided.

Finally, we saw why we had filesort with JOINS, GROUP BY, and ORDER BY, and how we can improve this situation.

In the next chapter, we will cover advanced techniques for MySQL server settings and MySQL's data dictionary, in MySQL 8.0. We will also look...
