Getting Started with InnoDB

Instant InnoDB


February 2013

$10.00

A concise and expert guide, designed from the ground-up to guide you through the setup of InnoDB, and help you start unlocking the engine's potential

(For more resources related to this topic, see here.)

Basic features of InnoDB

InnoDB is more than a fast disk-based relational database engine. It offers, at its core, the following features that separate it from other disk-based engines:

  • MVCC

  • ACID compliance

  • Transaction support

  • Row-level locking

These features are responsible for providing what is known as Referential integrity; a core requirement for enterprise database applications.

Referential integrity

Referential integrity can be best thought of as the ability for the database application to store relational data in multiple tables with consistency. If a database lacks consistency between relational data, the data cannot be relied upon for applications. If, for example, an application stores financial transactions where monetary data is processed, referential integrity and consistency of transactional data is a key component. Financial data is not the only case where this is an important feature, as many applications store and process sensitive data that must be consistent

Multiversion concurrency control

A vital component is Multiversion concurrency control (MVCC), which is a control process used by databases to ensure that multiple concurrent connections can see and access consistent states of data over time. A common scenario relying on MVCC can be thought of as follows: data exists in a table and an application connection accesses that data, then a second connection accesses the same original data set while the first connection is making changes to it; since the first connection has not finalized its changes and committed its information we don't want the second connection to see the nonfinalized data. Thus two versions of the data exist at the same time—multiple versions—to allow the database to control the concurrent state of the data. MVCC also provides for the existence of point-in-time consistent views, where multiple versions of data are kept and are available for access based on their point-in-time existence.

Transaction isolation

Transaction support at the database level refers to the ability for units of work to be processed in separate units of execution from others. This isolation of data execution allows each database connection to manipulate, read, and write information at the same time without conflicting with each other. Transactions allow connections to operate on data on an all-or-nothing operation, so that if the transaction completes successfully it will be written to disk and recorded for upcoming transactions to then operate on. However, if the sequence of changes to the data in the transaction process do not complete then they can be rolled back, and no changes will be recorded to disk. This allows sequences of execution that contain multiple steps to fully succeed only if all of the changes complete, and to roll back any changed data to its original state if one or more of the sequence of changes in the transaction fail. This feature guarantees that the data remains consistent and referentially safe.

ACID compliance

An integral part of InnoDB is its ability to ensure that data is atomic, consistent, isolated, and durable; these features make up components of ACID compliance. Simply put, atomicity requires that if a transaction fails then the changes are rolled back and not committed. Consistency requires that each successfully executed transaction will move the database ahead in time from one state to the next in a consistent manner without errors or data integrity issues. Isolation defines that each transaction will see separate sets of data in time and not conflict with other transactional data access. Finally, the durability clause ensures that any data that has been committed in a successful transaction will be written to disk in its final state, without the risk of data loss from errors or system failure, and will then be available to transactions that come in the future.

Locking characteristics

Finally, InnoDB differs from other on-disk storage engines in that it offers row-level locking. This primarily differs, in the MySQL world, with the MyISAM storage engine which features table-level locking. Locking refers to an internal operation of the database that prohibits reading or writing of table data by connections if another is currently using that data. This prevents concurrent connections from causing data corruption or forcing data invalidation when data is in use. The primary difference between table- and row-level locking is that when a connection requests data from a table it can either lock the row of data being accessed or the whole table of data being accessed. For performance and concurrency benefits, row-level locking excels.

System requirements and supported platforms

InnoDB can be used on all platforms on which MySQL can be installed. These include:

  • Linux: RPM, Deb, Tar

  • BSDs: FreeBSD, OpenBSD, NetBSD

  • Solaris and OpenSolaris / Illumos: SPARC + Intel

  • IBM AIX

  • HP-UX

  • Mac OSX

  • Windows 32 bit and 64 bit

There are also custom ports of MySQL from the open source community for running MySQL on various embedded platforms and non-standard operating systems.

Hardware-wise, MySQL and correspondingly InnoDB, will run on a wide variety of hardware, which at the time of this writing includes:

  • Intel x86 32 bit

  • AMD/Intel x 86_64

  • Intel Itanium IA-64

  • IBM Power architecture

  • Apple's PPC

  • PA-RISC 1.0 + 2.0

  • SPARC 32 + 64 bit

Keep in mind when installing and configuring InnoDB, depending on the architecture in which it is installed, it will have certain options available and enabled that are not available on all platforms. In addition to the underlying hardware, the operating system will also determine whether certain configuration options are available and the range to which some variables can be set. One of the more decisively important differences to be considered while choosing an operating system for your database server is the manner in which the operating system and underlying filesystem handles write caching and write flushes to the disk storage subsystem. These operating system abilities can cause a dramatic difference in the performance of InnoDB, often to the order of 10 times the concurrency ability.

When reading the MySQL documentation you may find that InnoDB has over fifty-eight configuration settings, more or less depending on the version, for tuning the performance and operational defaults. The majority of these default settings can be left alone for development and production server environments. However, there are several core settings that can affect great change, in either positive or negative directions depending on the application workload and hardware resource limits, with which every MySQL database administrator should be familiar and proficient.

Keep in mind when setting values that some variables are considered dynamic while others are static; dynamic variables can be changed during runtime and do not require a process restart while static variables can only be changed prior to process start, so any changes made to static variables during runtime will only take effect upon the next restart of the database server process. Dynamic variables can be changed on the MySQL command line via the following command:

mysql> SET GLOBAL [variable]=[value];

If a value is changed on the command line, it should also be updated in the global my.cnf configuration file so that the change is applied during each restart.

MySQL memory allocation equations

Before tuning any InnoDB configuration settings—memory buffers in particular—we need to understand how MySQL allocates memory to various areas of the application that handles RAM. There are two simple equations for referencing total memory usage that allocate memory based on incoming client connections:

  • Per-thread buffers: Per-thread buffers, also called per-connection buffers since MySQL uses a separate thread for each connection, operate in contrast to global buffers in that per-thread buffers only allocate memory when a connection is made and in some cases will only allocate as much memory as the connection's workload requires, thus not necessarily utilizing the entire size of the allowable buffer. This memory utilization method is described in the MySQL manual as follows:

    Each client thread is associated with a connection buffer and a result buffer. Both begin with a size given by net_buffer_length but are dynamically enlarged up to max_allowed_packet bytes as needed. The result buffer shrinks to net_buffer_length after each SQL statement.

  • Global buffers: Global buffers are allocated memory resources regardless of the number of connections being handled. These buffers request their memory requirements during the startup process and retain this reservation of resources until the server process has ended.

When allocating memory to MySQL buffers we need to ensure that there is also enough RAM available for the operating system to perform its tasks and processes; in general it is a best practice to limit MySQL between 85 to 90 percent allocation of total system RAM. The memory utilization equations for each of the buffers is given as follows:

  • Per-thread Buffer memory utilization equation:

    (read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size + binlog_cache_size) * max_connections = total memory allocation for all connections, or MySQL Thread Buffers (MTB)

  • Global Buffer memory utilization equation:

    innodb_buffer_pool_size + innodb_additional_mem_pool_size + innodb_ log_buffer_size + key_buffer_size + query_cache_size = total memory used by MySQL Global Buffers (MGB)

  • Total memory allocation equation:

    MTB + MGB = Total Memory Used by MySQL

If the total memory used by the combination of MTB and MGB is greater than 85 to 90 percent of the total system RAM then you may experience resource contention, a resource bottleneck, or in the worst case you will see memory pages swapping to on-disk resources (virtual memory) which results in performance degradation and, in some cases, process failure or connection timeouts. Therefore it is wise to check memory allocation via the equations mentioned previously before making changes to the memory buffers or increasing the value of max_connections to the database.

More information about how MySQL manages memory and threads can be read about in the following pages of the MySQL documentation:

Summary

This article provided a quick overview of the core terminology and basic features, system requirements, and a few memory allocation equations.

Resources for Article :


Further resources on this subject:


Books to Consider

comments powered by Disqus