Examining the Oracle Architecture


Understanding the Oracle RDBMS

The term RDBMS, or Relational Database Management System can be used in many ways and, more often than not, simply refers to a commercial database management product. However, the true meaning of the term RDBMS refers to the way the product operates internally to store and retrieve data. There are many commercial and open source database management systems available today. While they all attempt to interface with the user in a manner that is generally compliant with international standards, they all operate differently internally. We will explore the characteristics that make the Oracle RDBMS unique and, in doing so, understand how it works.

Distinguishing between an instance and a database

To define the Oracle database architecture, we must first define two terms and make a distinction between them. An instance is the set of background processes and memory structures that enable the Oracle kernel to operate. A database is the set of files that stores the data contained in the RDBMS. These two entities account for the three basic resources available to computers, namely the CPU, RAM, and disk. The instance represents the usage of the CPU and RAM, while the database represents the usage of the disk. These definitions are architecturally oriented, and the distinction between the two isn't always fully recognized. In fact, the two are often used interchangeably. In most database configurations, it is reasonable to do so since, on a single server, the instance and database operate together. However, it's important to draw a distinction at this point, for three reasons. First, the terms themselves are used differently with other RDBMS products, such as Microsoft SQL Server. Secondly, when investigating the Oracle architecture, it is simpler to see the relationship between the various architectural components if we divide them accordingly. Lastly, although an instance and a database generally operate together on a single server, other database configurations are possible. In Oracle's Real Application Clusters, or RAC, the instance and database are both logically and physically separated. We will examine the subject of RAC later in the article.

The following diagram gives us a broad look at the Oracle architecture as a whole. We will refer back to portions of it throughout the article:

Note that this is a logical diagram that groups similar functions together. We display this as a reverse tree diagram. The top of the tree is the RDBMS itself. From there, it branches into the instance and the database, the terms we have just defined. We will look at each branch of this diagram and examine the individual components, as well as how they operate and contribute to the inner workings of the Oracle RDBMS.

Examining the Oracle instance

As we mentioned before, the Oracle instance is composed of the necessary processes and memory structures required to operate the Oracle kernel. The instance is the first thing invoked when the database is started.

Recognizing the primary Oracle processes

The first components of the Oracle instance that we will examine are the Oracle background processes. These processes run in the background of the operating system and are not interacted with directly. Each process is highly specialized and has a specific function in the overall operation of the Oracle kernel. While these processes accomplish the same functions regardless of the host operating system, their implementation is significantly different. On Unix-based systems, owing to Unix's multiprocess architecture, each Oracle process runs as a separate operating system process. Thus, we can actually see the processes themselves from within the operating system.

For instance, we can use the ps command on Linux to see these processes, as shown in the following screenshot. We've highlighted a few of them that we will examine in depth. Note that our background processes are named in the format ora_processtype_SID. Since the SID for our database is ORCL, that name forms a part of the full process name:
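A representative listing might look like the following sketch (the PIDs, timestamps, and exact process set are illustrative and will vary from system to system):

```shell
$ ps -ef | grep ora_
oracle    2501     1  0 09:14 ?  00:00:00 ora_pmon_ORCL
oracle    2505     1  0 09:14 ?  00:00:00 ora_dbw0_ORCL
oracle    2507     1  0 09:14 ?  00:00:00 ora_lgwr_ORCL
oracle    2509     1  0 09:14 ?  00:00:01 ora_ckpt_ORCL
oracle    2511     1  0 09:14 ?  00:00:00 ora_smon_ORCL
```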

On Windows, rather than implementing each Oracle process as a separate OS process, the Oracle processes are implemented as threads, since Windows is a multithreaded operating system. As a result, the Oracle kernel runs under a single executable called Oracle.exe. The background processes then run as threads under that single process. So, if we attempt to see the Oracle background processes using the Windows Task Manager, we only see the Oracle.exe executable, as shown in the next screenshot. The threads representing the Oracle processes are masked to us, at least at the operating system level:

We can also display this information from the Windows command line by running the tasklist command:
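A hypothetical tasklist listing might resemble the following (the PID and memory figure are invented for illustration):

```shell
C:\> tasklist /fi "imagename eq oracle.exe"

Image Name       PID  Session Name  Session#  Mem Usage
============  ======  ============  ========  =========
oracle.exe      2148  Services             0  812,340 K
```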

Regardless of whether we can see each process directly from the operating system, once the instance starts, they are present. Each has a specific job and all run in concert to service the needs of the database.

PMON – the Process Monitor

The core process of the Oracle architecture is the PMON process—the Process Monitor. PMON is tasked with monitoring and regulating all other Oracle-related processes. This includes not only background processes but server processes as well. Most databases run in a dedicated server mode. In this mode, any user that connects to the database is granted a server process with which to do work. On Linux systems, this process can actually be viewed at the server level with the ps -ef command. When the user connects over the network, the process will be labeled with LOCAL=NO in the process description. Privileged users such as database administrators can also make an internal connection to the database, provided they log in from the server that hosts the database. When an internal connection is made, the process is labeled with LOCAL=YES. We see an example of each in the following screenshot of the ps -ef command on a Linux machine hosting Oracle:
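On a Linux host, the dedicated server processes appear under the name oracleSID. A sketch of what the two connection types might look like (PIDs and connection details are illustrative):

```shell
$ ps -ef | grep oracleORCL
oracle   3120     1  0 10:02 ?  00:00:00 oracleORCL (LOCAL=NO)
oracle   3234  3233  0 10:05 ?  00:00:00 oracleORCL (DESCRIPTION=(LOCAL=YES)...)
```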

Under ordinary circumstances, when a user properly disconnects his or her session from the database by exiting the tool used to connect to it, the server process given to that user terminates cleanly. However, what if, instead of the session disconnecting properly, the machine from which the user was connected is rebooted? In situations like these, the server process on the database is left running, since it has never received the proper instructions to terminate. When this occurs, it is the job of PMON to monitor sessions and clean up these orphaned processes. PMON normally "wakes up" every 3 seconds to check for such processes and remove them. In addition to this primary function, PMON is also responsible for registering databases with network listeners.

The real-world DBA

Since the instance cannot run unless PMON is running, DBAs sometimes check for it with the ps command as a quick way of determining whether the instance is down. On Unix-based systems, we can see the process at the server level using the command ps -ef | grep pmon. If no process is returned, we know the instance is down.

SMON – the System Monitor

The SMON, or System Monitor process, has several very important duties. Chiefly, SMON is responsible for instance recovery. Under normal circumstances, databases are shut down using the proper commands to do so. When this occurs, all of the various components, mainly the datafiles, are properly recorded and synchronized so that the database is left in a consistent state. However, if the database crashes for some reason (the database's host machine loses power, for instance), this synchronization cannot occur, and when the database is restarted, it will start from an inconsistent state. Every time the instance is started, SMON checks for these marks of synchronization. If the database is in an inconsistent state, SMON performs instance recovery to resolve these inconsistencies. Once this is complete, the instance and database can open correctly. Unlike database recovery, where some data loss has occurred, instance recovery occurs without intervention from the DBA. It is an automatic process handled by SMON.

The SMON process is also responsible for various cleanup operations within the datafiles themselves. Tempfiles are the files that hold the temporary data written when certain memory caches overflow. This temporary data is written in the form of temporary segments within the tempfile. When this data is no longer needed, SMON is tasked with removing those segments. The SMON process can also coalesce contiguous free space within datafiles, removing gaps, which allows data to be stored more efficiently.

DBWn – the Database Writer process

Given all of the overhead duties of processes such as PMON and SMON, we can probably intuit that there must be a process that actually reads and writes data from the datafiles. In earlier versions, that process was named DBWR – the Database Writer process. The DBWR is responsible for reading and writing the data that services user operations, but it doesn't do it in the way that we might expect.

In Oracle, almost no operation is executed directly on the disk. The Oracle processing paradigm is to read data into memory, complete a given operation while the data is still in memory, and write it back to the disk. We will cover the reason for this in greater depth when we discuss memory caches, but for now let's simply say it is for performance reasons. Thus, the DBWR process will read a unit of data from the disk, called a database block, and place it into a specialized memory cache. If data is changed using an UPDATE statement, for instance, it is changed in memory. After some time, it is written back to the disk in its new state.

If we think about it, it should be obvious that the amount of reading and writing in a database constitutes a great deal of work for one single process. It is certainly possible for a single DBWR process to become overloaded and begin to affect performance. That's why, in more recent versions of Oracle, we have the ability to instantiate multiple database writer processes. So we can refer to DBWR as DBWn, where "n" is a given instantiation of a database writer process. If our instance is configured to spawn three database writers, they would be dbw0, dbw1, and dbw2. The number of DBWn processes spawned is governed by one of our initialization parameters, namely db_writer_processes.

Let's take a closer look at how the value for db_writer_processes affects the database writer processes that we can see in the Linux operating system. We won't go into great depth with the commands that we'll be using at this point, but we can still see how the spawning of multiple DBWn processes works. We will become very familiar with commands such as these as we revisit them frequently throughout many of the examples in this book. First, let's examine the number of DBWn processes on our system using the ps command, with which we're familiar:
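The output might resemble the following sketch (the PID and timestamp are illustrative):

```shell
$ ps -ef | grep dbw
oracle   2505     1  0 09:14 ?  00:00:00 ora_dbw0_orcl
```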

From the Linux command line, we use the ps -ef command, piped to the grep command, to search for processes with the string dbw in their names. This restricts our output to only the database writer processes. As we can see in the preceding screenshot, there is only one database writer process, named ora_dbw0_orcl.

As mentioned, the number of database writer processes is determined by an initialization parameter. The name of that parameter is db_writer_processes. We can determine the value of this parameter by logging in to the database using SQL*Plus (with the command sqlplus / as sysdba) and showing its value using the show parameter command, as in the following screenshot:
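A sketch of that SQL*Plus session (column widths abbreviated for readability):

```sql
SQL> show parameter db_writer_processes

NAME                    TYPE     VALUE
----------------------- -------- -----
db_writer_processes     integer  1
```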

Since we've already determined that we only have a single dbw0 process, it should come as no surprise that the value for our parameter is 1. However, if we wish to add more database writers, it is simple to do so. From the SQL*Plus command line, we issue the following command, followed by the shutdown immediate and startup commands to shut down and start up the database:
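The command sequence would look like the following. Note that db_writer_processes is a static parameter, so we set it with scope=spfile; this sketch assumes the instance was started with an spfile:

```sql
SQL> alter system set db_writer_processes = 4 scope = spfile;

System altered.

SQL> shutdown immediate
SQL> startup
```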

The alter system command instructs Oracle to set the db_writer_processes parameter to 4. The change is recognized when the database is restarted. From here, we type exit to leave SQL*Plus and return to the Linux command line. We then issue our ps command again and view the results:

As we can see in the preceding screenshot, there are four database writer processes, called ora_dbw0_orcl, ora_dbw1_orcl, ora_dbw2_orcl, and ora_dbw3_orcl, that align with our value for db_writer_processes. We now have four database writer processes with which to read and write data.

The real-world DBA

What's the optimal number of database writers? The answer is that, as with many aspects of database administration, it depends. The parameter has a maximum value of 20, so does that mean more is better? Not necessarily. The simplest answer is that the default value, either 1 or the number of CPUs divided by 8 (whichever is greater), will generally provide the best performance. Opinions regarding best practices vary greatly and are usually based on the number of CPUs in the host machine. Generally, the default value will serve you well unless your server is very large or heavy tuning is needed.

CKPT – the Checkpoint process

We mentioned in the preceding section that the purpose of the DBWn process is to move data in and out of memory. Once a block of data is moved into memory, it is referred to as a buffer. When a buffer in memory is changed, using an UPDATE statement for instance, it is called a dirty buffer. Dirty buffers can remain in memory for a time and are not automatically flushed to disk. The event that signals the writing of dirty buffers to disk is known as a checkpoint. The checkpoint ensures that memory is kept available for new buffers and establishes a point for recovery. In earlier versions of Oracle, the type of checkpoint that occurred was known as a full checkpoint, which flushes all dirty buffers back to the datafiles on disk. While full checkpoints represent a complete flush of the dirty buffers, they are expensive in terms of performance. Since Version 8i, the Oracle kernel makes use of an incremental checkpoint that intelligently flushes only part of the available dirty buffers when needed. Full checkpoints now occur only during a shutdown of the database or on demand, using a command.

The process in the instance that orchestrates checkpointing is the CKPT process. The CKPT process uses incremental checkpoints at regular intervals to ensure that dirty buffers are written out and any changes recorded in the redo logs are kept consistent for recovery purposes. Unlike the DBWn process, there is only one CKPT process. Although the incremental checkpoint method is used by CKPT, we can also force a full checkpoint using the command shown in the following screenshot:
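The on-demand command is the following; issuing it forces an immediate full checkpoint:

```sql
SQL> alter system checkpoint;

System altered.
```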

LGWR – the Log Writer process

Redo logs are files that serially store the changes that occur in the database. These changes can be anything from INSERT, DELETE, or UPDATE statements executed against the database to the creation of new tables. Note, however, that queries against the database using SELECT statements do not constitute changes and are not recorded in the redo logs. The primary function of redo logs is to enable database recovery, where database changes can be "rolled forward" from a backup.

When a change occurs in the database, that change is first written into a memory buffer, called the log buffer, that is specifically tasked with handling database changes. It is then written into the redo logs. The process that writes changes from memory to disk is the LGWR process. There is only one LGWR process, so it is important that it moves data as efficiently as possible. Three conditions cause LGWR to move changes from memory into the redo logfiles:

  • When a COMMIT command is issued

  • When the log buffer fills to one-third of its capacity

  • Every three seconds

The last of these conditions, "every three seconds," is actually caused by the DBWn process. The database writer has a three-second timeout before it must write a limited number of dirty buffers to disk. When this occurs, the LGWR process also flushes its changes just before DBWn does its work. This ensures that uncommitted transactions can always be rolled back.

ARCn – the Archiver process

The ARCH, or the Archiver process, is an optional but very important process. Let's continue describing the life cycle of a change in Oracle. To review, as sessions change data, those changes are written into the log buffer. At periodic intervals, those changes are written out serially to the redo logs by the LGWR process. The number and size of the redo logs can vary, but there are always a limited number of them. When one redo log becomes full, LGWR switches to the next one. Since there are a finite number of redo logs, LGWR eventually fills them all. When that happens, LGWR switches back to the original redo log and overwrites the changes that exist in the log. This effectively destroys those changes and invalidates the purpose of storing the changes at all. When the database operates in this manner, overwriting changes that were stored in the redo logs, we say that we're operating in NOARCHIVELOG mode. In order to prevent changes from being overwritten, we must operate using a different mode—ARCHIVELOG mode. When the database is in ARCHIVELOG mode, the contents of the redo logs are written out to a different type of file called archive logs. Archive logs simply contain the data that was in the redo log and serve as its static copy. However, archive logs never overwrite each other. Thus, the history of changes stored in the archive logs and redo logs constitute all the changes that have occurred from a given point in time. The process that enables ARCHIVELOG mode is the Archiver process, or ARCH. The ARCH copies data from the redo logs to the archive logs and does so before the data in the redo logs can be overwritten.

Like DBWn, the Archiver process can run as multiple processes, albeit for a different purpose. In previous versions of Oracle, the Archiver process ran as a single process, ARCH. In recent versions, it is more accurate to call the Archiver process by its proper name, ARCn, where "n" is the number of one of multiple Archiver processes. When ARCn runs as multiple processes, it does so in order to write to multiple locations, rather than using multiple processes to write to a single location.

Since, along with the redo logs, the archive logs contain all of the changes that have occurred in the database, they are critical to database recovery. As such, we can configure our database to write out archive log copies to more than one location. For instance, we may want to configure the Archiver process to write archive logs to two different locations on disk or write a copy of them out to a tape drive to mitigate the risk of data loss.
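As a sketch, two disk destinations could be configured with the log_archive_dest_n parameters (the directory paths here are hypothetical):

```sql
SQL> alter system set log_archive_dest_1 = 'LOCATION=/u01/app/oracle/arch1';
SQL> alter system set log_archive_dest_2 = 'LOCATION=/u02/app/oracle/arch2';
```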

Recognizing the secondary Oracle processes

We've seen what we referred to as the primary Oracle processes. These are the instance processes that are the foundations of how the Oracle architecture functions. While there is no official definition distinguishing between primary and secondary processes, we use the distinction here as a way of separating those processes that function as an integral component of the architecture and those that make other, secondary operations possible.

MMON – the Manageability Monitor

Prior to Oracle Version 10g, database performance tuning was accomplished primarily using data dictionary views. Oracle's extensive data dictionary provided a great deal of insight into the inner workings of the database. However, these views had limitations as to how much internal data was stored and how often it was updated. In short, the performance tuning needs of today's databases required a more extensive interface into Oracle. With Version 10g, the Oracle database included what amounts to a second data dictionary, the Automatic Workload Repository (AWR), that focuses solely on performance tuning metrics. The MMON process, the Manageability Monitor, extracts these metrics from the Oracle memory caches and writes them to the AWR. MMON essentially takes point-in-time snapshots of performance data, allowing the data to be used in trend analysis. MMON also invokes the ADDM, the Automatic Database Diagnostic Monitor, which analyses these metrics and can offer performance optimization suggestions in the form of a report. MMON is assisted by another process, MMNL, the Manageability Monitor Lite, in gathering these statistics. The following screenshot displays some of these secondary processes:

MMAN – Memory Manager

While creating the database, we were given the opportunity to configure our database's memory in one of three ways. We chose to configure the memory caches manually, but we were also offered Automatic Memory Management (AMM) and Automatic Shared Memory Management (ASMM) as options. These two options allow us to turn over the sizing of instance memory caches to Oracle's automatic management algorithm. In environments that run in either of these two configurations, it is the MMAN process, or Memory Manager, that controls the allocation and deallocation of memory between the various memory caches.

RECO – the Recoverer process

The Oracle database can be run in many different configurations. While we normally think of database operations as running on a centralized server, we can also configure them to run in a distributed fashion. For instance, we may have a system with three databases, one in London, one in Berlin, and one in Miami, that are required to always contain and present the same data, regardless of which location is used. In such a distributed configuration, a transaction that runs on one database must also replicate to another database or databases. We refer to these operations as distributed transactions. Because of the complexity of maintaining operations such as these, the Oracle instance uses a separate process called RECO, the Recoverer process, to ensure that distributed transactions commit properly at all distributed sites. Failing that, RECO rolls back the transaction at each location to ensure data integrity.

Recognizing other database processes

The Oracle instance contains several other processes that are less commonly referred to. A DBA does not typically need to be concerned with the operation of these processes, but we include them here for the sake of curiosity and completeness. There are also other processes that are particular to an Oracle installation on any given operating system:

  • Dnnn: The dispatcher process that is responsible for distributing SQL calls to shared server processes Snnn. These processes are only used in a shared server configuration.

  • DBRM: The database resource manager that controls resource allocations in databases that use the Resource Manager feature.

  • DIAG: The diagnosability process that performs diagnostic resource dumps in the event of certain error conditions.

  • FBAR: The flashback archiver process that controls the operation of Oracle's flashback data archive feature, which allows a user to query data as it was during a time in the past.

  • CJQn: The job queue coordinator that controls the job queue and routes various jobs to slave processes, the Jnnn processes, for execution.

  • SMCn: The space management coordinator that controls database space-related tasks involving allocation and cleanup. It uses the Wnnn subprocesses to accomplish these operations.

  • VKTM: The virtual timekeeper process that keeps track of time. This is especially important in distributed or clustered environments.

So, by way of review, let's look at the section of our RDBMS diagram that deals specifically with the background processes and note how it relates to the rest of the instance and database components:

Identifying Oracle's memory caches

Throughout our review of Oracle's process structure, we've mentioned some of Oracle's memory caches in passing. The complex specialization of these caches is impressive. Every opportunity is taken to separate different types of data into different caches for the most optimized approach to high performance. At this point, we'll identify these caches and their purpose in the Oracle instance.

Understanding the SGA

The most well-known of Oracle's caches is the SGA. The SGA, or System Global Area, is a set of memory caches that generally pertain to memory that must be shared between users. As such, it might be better referred to as the Shared Global Area. The concept of sharing memory between users is unique to systems that function as servers. In simpler systems, such as a desktop computing environment, resources such as files may be shared, but the idea of actually sharing the RAM of different personal computers is relatively foreign. In such systems, RAM acts as a volatile work area of higher speed that can be used for processing. So if you open a word processing document on your desktop, its contents are read into memory for faster manipulation. You don't, at the same time, "share" your memory with another user down the hall who can manipulate your document. In such situations, memory is more of a private resource than a shared one.

In server systems such as databases, processing is more efficient if some amount of information can be shared or even re-used between users. So, an enterprise-level RDBMS makes use of memory that is defined at the server level and presents it in a manner in which it can be accessed by multiple users. When one considers that Oracle databases can service thousands or even hundreds of thousands of users simultaneously, it's impressive to grasp that memory can be shared between each of them.

While the SGA is a collection of other caches, its overall size is determined by an initialization parameter, sga_max_size, at the discretion of the DBA. We noted during our database creation with DBCA that memory can be managed automatically by defining other parameters, but it can also be sized manually. When we manage the SGA manually, the value for sga_max_size represents the sum total of each of the caches within it. Prior to Oracle Version 9i, it was not possible to resize the individual caches without shutting down the database, changing the appropriate parameter, and restarting (much to the chagrin of many DBAs). From Version 9i onwards, it has been possible to resize individual caches dynamically without restarting the database, provided that their sum total does not exceed the value established by sga_max_size. We can think of this parameter as an outer boundary for all of the caches within the SGA. While individual caches can be changed dynamically, the value of sga_max_size cannot.

Examining the database buffer cache

Let's review the job of the DBWn process that we covered earlier. We learned that the DBWn process reads data from the disk and places it into a specialized memory cache. We also learned that the smallest unit of data that can be read in Oracle is called the database block. When DBWn reads a block of data, it places it in memory in the form of a database buffer. The area of the SGA in which the buffer is placed is known as the database buffer cache. Why is it important that Oracle operates in this way? We learned earlier that it was for performance reasons. To be more specific, Oracle operates under the assumption that if one user is requesting a certain block of data, other users may request it as well. When data is read into a database buffer, it remains in the database buffer cache so that other users may access it without the need to do a relatively expensive read from disk. Doing so creates a mechanism by which "hot" data, data accessed by many users, remains in memory for as long as possible, without re-reading from disk. The Oracle kernel uses an algorithm to define this mechanism known as the Least Recently Used algorithm, or LRU. Note that the use of the LRU algorithm doesn't mean that the oldest data is written back to disk. It means that the data that has been accessed most recently is retained, while data that has not been accessed recently is aged out of the cache. In this way, the number of disk reads is efficiently reduced.

When manually sizing the database buffer cache, its size is determined by the initialization parameter db_cache_size. As we've mentioned, it is possible to dynamically resize this value within the bounds of sga_max_size. The process of doing this is shown, in the following screenshot, using an Oracle database on Windows as an example. First, we log in to the database from the command prompt and determine the current value using the show parameter command:
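A sketch of that session (column widths abbreviated for readability):

```sql
SQL> show parameter db_cache_size

NAME              TYPE         VALUE
----------------- ------------ -----
db_cache_size     big integer  192M
```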

We can see from our results in this example that the value for db_cache_size is 192M. We can change this value dynamically using the alter system command:
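The command would look like the following:

```sql
SQL> alter system set db_cache_size = 196M scope = both;

System altered.
```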

As we can see in the preceding screenshot, we use the alter system command, specifying the new value 196M (M for megabytes) for db_cache_size. We also indicate that we wish to make this change both in the current memory and in the parameter file itself using scope=both. This ensures that our change will remain in effect after a database restart. In the event that we attempt to increase the size of the cache to a value that causes the sum total of the SGA caches to exceed that of sga_max_size, we will receive an ORA-00384: Insufficient memory to grow cache error.

Ensuring that the various caches are properly sized is vital for system performance, and doubly so for the database buffer cache. If our cache is too large, our database will allocate unneeded memory that could be used for other operations. If the cache is too small, our database won't be able to keep the necessary buffers in memory and will, instead, age them out too frequently and provoke an inordinate amount of disk activity.

Examining the shared pool

Our second cache within the SGA manages memory using a philosophy similar to the buffer cache, but in an entirely different way. The shared pool is used to cache objects in memory, but not database blocks. The shared pool is a cache composed of four areas, namely the library cache, the data dictionary cache, the PL/SQL area, and the new result cache.

Consider for a moment the following SQL statement:

select first_name, last_name from employee where employee_id = 5;

While we normally don't think about all of the steps necessary in order to run this query, there are many. The Oracle kernel must answer several questions like the following before executing even a simple statement such as the preceding one:

  1. Is this statement syntactically correct in every respect?

  2. Does the employee table exist in the database?

  3. Do the first_name and last_name columns exist and are they a part of the employee table?

  4. Does the user executing the statement have permission to query from the employee table?

  5. What is the most efficient way to retrieve the rows? Should every row in the table be scanned for matching values or should an index be used?

All of these steps and more must occur every time any SQL statement is executed against the database. In addition, the SQL code must be translated into machine code that can be efficiently understood and executed by Oracle. These steps are referred to as parsing. All the steps required for a database parse can make it an expensive operation in terms of resources. While our statement is being parsed, it's not being executed. Obviously, there is no way to avoid parsing a statement the first time it is used, but what about subsequent requests? Often, in applications such as those that deal with reporting, the same or similar statements are executed many times by different users. Without a way to retain the parsed statements, every statement requested must be reparsed.

To avoid this, Oracle employs the library cache, a memory area within the shared pool that stores parsed statements. Thanks to the library cache, similar statements are not required to be reparsed every time they are requested. The specific location in the library cache that contains this information is referred to as the shared SQL area. Oracle makes this distinction because another area called the private SQL area exists in the Program Global Area (PGA), which we discuss later in the article.

How does Oracle decide whether a statement is "similar" enough to be read directly from the library cache instead of being reparsed? The library cache utilizes a hashing algorithm that breaks each character in the statement down into its corresponding ASCII value, sums those values, and derives a value that represents the statement. When a matching value is found, it is checked against the statement in the library cache, and the parsed components of that match are used instead of reparsing, improving overall application performance. This algorithm is efficient, but it leads to a crucial fact that must be considered. Since the basis for the hashing algorithm is the ASCII values of the characters in the statement, the slightest deviation between two statements that are intended to match will result in a reparse. For instance, because ASCII values differ between uppercase and lowercase characters, the keyword SELECT is not the same as select. Each produces a different ASCII total, causing the hashed statement values to differ. If the hashed values differ, the second statement will not be seen as a match and a reparse will occur.

The real-world DBA

The fact that differences in case and syntax will "throw off" Oracle's hashing algorithm is an excellent example of why coding standards are important. If application developers write SQL code in a standardized syntax so that it can be reused, the performance of the application can be improved greatly. While it may seem like a small point, there is no reason to perform two parses when one is sufficient. In certain systems, you may save not just a few but perhaps thousands of reparses.
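The case sensitivity described above can be demonstrated with a toy hash. This is not Oracle's actual algorithm, just an illustration of the sum-of-ASCII idea from the preceding discussion:

```python
# Toy illustration (NOT Oracle's real algorithm): hash a statement
# by summing the ASCII values of its characters.
def toy_hash(statement: str) -> int:
    return sum(ord(ch) for ch in statement)

a = toy_hash("SELECT first_name FROM employee")
b = toy_hash("select first_name from employee")

# Uppercase and lowercase letters have different ASCII values, so two
# statements that differ only in case produce different hash values
# and would not match in the cache.
print(a == b)  # False
```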

We mentioned earlier that one of the "questions" that must be answered during the parsing phase deals with the existence of tables or columns within tables. In Oracle, the structure of tables, columns, and indeed all database objects is stored within the data dictionary. If you recall from our manual database creation, the data dictionary is created after the base database structure by running the catalog.sql script. Since the data dictionary is queried to answer these questions during a parse, it would be advantageous to avoid rereading its tables from disk repeatedly. The shared pool provides an area where this can be accomplished.

The data dictionary cache resides in the shared pool and provides a memory area to store database object definitions, such as table and column structures. Using this cache, the parser can read frequently accessed object definitions from the cache instead of from the dictionary tables on disk.
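We can observe how well the data dictionary cache is performing by querying the v$rowcache view; a high ratio of getmisses to gets suggests that object definitions are frequently being read from disk rather than from the cache. A minimal sketch:

```sql
-- Dictionary cache efficiency: compare getmisses to gets per area.
SELECT parameter, gets, getmisses
FROM   v$rowcache
ORDER  BY gets DESC;
```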

SQL statements are not the only type of code executed within the database. There are actually three programming languages that can be natively run within Oracle, namely SQL, PL/SQL, and Java. Just as SQL statements benefit from storing parsed code in memory, so does code written in these languages. PL/SQL (Procedural Language/SQL) is the most common of these, and the PL/SQL area provides a cache where PL/SQL code, such as procedures, packages, functions, and triggers, can be stored after being read from the data dictionary. Applications that make extensive use of PL/SQL can benefit greatly from the operation of the PL/SQL area. Incidentally, Java stored procedures also make use of this cache.

Our final area within the shared pool is new in Version 11g. As we've noted, some applications execute similar statements repeatedly. It is also true that these statements may return the same set of results, for instance, from a query that is executed over and over. In such situations, it is beneficial to cache these results so that they are available to other sessions. Within the shared pool, this is the job of the result cache. The result cache stores the results of SQL queries and PL/SQL functions, making them available to other sessions. Note that since not all applications require this level of caching, the result cache is not enabled by default. It must be enabled programmatically in one of several ways.
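One of those ways is the RESULT_CACHE hint, which requests caching for an individual query. A hedged sketch against the example employee table (the department_id column is hypothetical here):

```sql
-- The first execution computes and caches the result set; later
-- executions, by any session, can read it from the result cache.
SELECT /*+ RESULT_CACHE */ department_id, COUNT(*)
FROM   employee
GROUP  BY department_id;
```

Alternatively, the result_cache_mode initialization parameter can be set to FORCE to cache eligible query results without a hint.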

Note that, although the shared pool itself can be sized either automatically or manually, the sizes of the various areas within it are controlled automatically by the Oracle kernel. This underscores the importance of properly sizing the shared pool, since an undersized shared pool can lead to performance problems in all of its memory areas. The shared pool is sized manually by adjusting the shared_pool_size parameter.

Utilizing the log buffer

Our next SGA memory component is one that we've already mentioned. Recall from our discussion of the LGWR process that the log buffer is used to store database changes as they occur. They are then written out of memory in batches by the LGWR process. Remember the three conditions that can occur to cause LGWR to move changes from memory into the redo logfiles:

  • When a COMMIT command is issued

  • When the log buffer fills to one-third of its capacity

  • Every three seconds

The statements that create these changes, such as DML and DDL statements, benefit from a memory cache that can be read from and written to quickly. The log buffer provides this cache. The log buffer is unique among the SGA caches in that it is the only one that cannot be dynamically resized. Its size is determined by the log_buffer parameter and can only be altered with a restart of the database. The speed at which changes can be written to and read from the buffer is crucial to overall database performance, and an improperly sized log buffer can impact the efficiency of these operations. That being said, the default size of the log_buffer parameter is sufficient for most databases.
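To illustrate, the current value can be checked with SHOW PARAMETER, and a new static value staged in the spfile to take effect at the next restart (the 16M figure is purely illustrative):

```sql
-- Check the current log buffer size.
SHOW PARAMETER log_buffer

-- log_buffer is static; SCOPE=SPFILE stages the change for the
-- next instance restart.
ALTER SYSTEM SET log_buffer = 16M SCOPE = SPFILE;
```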

Understanding the large pool

Because so many database operations can benefit from intelligent caching, Oracle added a versatile, slightly less specialized cache in Version 9i. The large pool is a cache within the SGA that can be utilized for certain operations, freeing up memory in the shared pool. The large pool is particularly useful in environments that use the shared server architecture. In a shared server architecture, a finite number of server processes address the needs of every session, rather than each session having a dedicated server process. Memory from the large pool can be allocated to sessions that use the shared server processes. The large pool can also be used to cache data involved in backup and recovery operations, greatly improving the performance of these long-running operations.

Examining the streams pool

Oracle Streams is a product that allows data to be dynamically replicated between databases. We can think of Streams as the successor to Oracle's Advanced Replication option, with some significant improvements. Oracle Streams propagates data between databases by mining the redo logs for changes, queuing them, and then pushing them to corresponding databases. The process of mining the redo logs in Streams benefits from having its own memory cache, and that cache is the streams pool. The streams pool is not utilized unless Oracle Streams is in use and can be ignored in cases where it's not. The streams pool is sized manually using the streams_pool_size parameter. Note that although Oracle Streams is outside the scope of the exam, the concept of the streams pool itself as one of the memory caches may be covered.

Understanding the Java pool

The Java programming language is one of the most important languages in use today. Oracle was quick to capitalize on its growing importance in the IT industry by adding a great deal of support for Java within its flagship database. We mentioned earlier in the article that the Oracle database supports the use of Java stored procedures. These units allow for Java programming code to be stored as objects within the database. If an IT organization uses Java stored procedures, the Java pool can be extremely beneficial. The Java pool is a specialized memory cache within the SGA that is used to instantiate Java objects. It is generally best for a DBA to consult with Java application developers to determine the optimal size for the Java pool. The Java pool size is determined by the java_pool_size initialization parameter.
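As a quick way to see how these SGA components fit together, the v$sgainfo view lists the current size of each, including the buffer cache, shared pool, large pool, Java pool, and streams pool:

```sql
-- Current size of each SGA component, and whether it can be
-- resized dynamically.
SELECT name, bytes, resizeable
FROM   v$sgainfo;
```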

To summarize the structure of the SGA and how it fits within the architecture of the Oracle instance, we revisit our overall architecture diagram:

Beyond the exam

In Unix-based systems, there are operating system-level parameters dealing with shared memory that must be considered beyond just the Oracle parameters. These OS parameters provide a great deal of flexibility for configuring the shared memory on a server. These parameters must be configured properly to ensure optimal system performance, usually in coordination with the system administrator:

  • shmmax or shared memory max refers to the maximum size of a shared memory segment. This should be sized to contain the entire SGA.

  • shmmin or shared memory min refers to the minimum size of a shared memory segment.

  • shmseg or shared memory segment refers to the maximum number of shared memory segments that can be attached to a process.

  • shmmni or shared memory identifiers refers to the number of shared memory identifiers in the system.

Understanding the PGA

The second major memory component of the Oracle instance is the PGA, or Program Global Area. We will find that the PGA "side" of the Oracle instance is somewhat less complex than the SGA. It might be more accurate to refer to the PGA as the Private Global Area. So how can a memory area be both private and global at the same time?

The principle behind a memory area such as the SGA is that there are portions of data that should be available to be shared between user sessions. It is equally true that not all operations include data that should be shareable. Take a sorting operation, for instance. During a sort operation, such as a SELECT query involving an ORDER BY clause, it is advantageous to use memory to do the sort, since using disk would be a much slower process. However, sorts can be notoriously large and demanding in terms of resources. It wouldn't be efficient to sort in a memory area such as the SGA, since the sort operations may rob memory from other important areas, such as the database buffer cache. Ideally, such operations should run in memory where possible and write to disk in situations where the memory overflows.

Historically, Oracle accomplished this by allocating "slices" of memory to each user session. An initialization parameter defined a sorting area in memory, outside of the SGA, and allocated that area to a user when the user connected. That slice of memory could be used for sort operations. If the sort required more memory than the slice provided, the data would be written out to a temporary area on disk. The results were then returned to the user as an ordered set of rows. Similar portions of memory were granted to users for actions that required hashing algorithms and other related operations. While this paradigm accomplishes the goal of providing users with a private memory area for such operations, it is somewhat inefficient. Using this method, every user is given the same amount of memory for private operations whether it is needed or not. Additionally, because the total amount of memory used by these slices is determined by the number of users that access the database at a given time, the total system memory has to be taken into account. For instance, if we estimate that a database has a maximum of 1000 users connected at any given moment and each is given 1 M of memory for sorting, we must allow for 1000 M of memory for these users. However, the number of users that access databases isn't always consistent. At certain times of the day, for instance, only 100 users may be connected. These users are only allocated a total of 100 M, but since we must account for the maximum number of users, we can't allocate the other 900 M in any other way. This leads us to a situation where, at any given time, our database may be overallocated in terms of memory and the users that have the allocations may not even be using them.

To combat this problem, Oracle introduced the PGA, or Program Global Area, in Version 9i. The PGA is a single cache that provides the private memory needed for user sessions. Rather than allocating individual work areas in memory, the PGA allows us to set a parameter value, pga_aggregate_target, that defines the total amount of memory to be allocated for user sessions. Within the PGA, memory allocation is managed by Oracle, which gives more to sessions that require it. As a result, user sessions that require more memory receive it, making operations such as sorting much more efficient, since having more memory available requires less use of temporary space on disk. Additionally, the private SQL area within the PGA contains pointers that refer back to the shared SQL area within the SGA, facilitating the efficient sharing of SQL statements. The PGA stores several types of user-session information, including:

  • Data in sorting operations

  • Session variable information

  • Bind variables

  • Query execution work areas

  • Cursor information and state

Starting with Oracle 11g, the PGA and SGA can be sized together using a feature called Automatic Memory Management (AMM). This simplifies the administration tasks involved in determining memory structure sizing. To review the position of the PGA within the Oracle architecture, we refer to our architecture diagram:
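As a sketch of how these settings can be inspected and adjusted (the 2G target is purely illustrative), we might query v$pgastat and, if using AMM, set memory_target:

```sql
-- Compare the configured PGA target with actual allocation.
SELECT name, value
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated');

-- With AMM, a single target covers both the SGA and the PGA.
ALTER SYSTEM SET memory_target = 2G SCOPE = SPFILE;
```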

This completes our examination of the Oracle instance architecture. As a full review of all instance components, we refer back to our architecture diagram, where we can see the greater context of how these individual components relate:

Examining the Oracle database components

Although we typically use the term "database" to refer to an RDBMS in general, in this article we're examining it in its most literal sense. We've made the distinction between an instance and a database and have seen the components of the instance: the background processes and memory caches. In the most literal sense, the term "database" refers to the set of files that make up the Oracle RDBMS. Each type of file serves a different function within the infrastructure of the RDBMS.

Understanding the control file

Each component of the Oracle RDBMS is highly specialized, from processes that serve a certain function to caches that contain a certain type of data. However, there must be a central place, a "brain" of sorts, that stores information about the database at any given moment. In Oracle, that central storage file is called the control file. The control file contains a multitude of time-sensitive information dealing with the state of the database. Some of this information includes, but is not limited to, the following:

  • The names and locations of all the database files

  • The name of the database

  • The current redo log sequence number

  • The time and date when the database was created

  • Recent archive log names and locations

  • The current SCN (System Change Number)

  • Backup and recovery information

Despite the wealth of information stored in the control file, it is surprisingly small. We can see this by actually viewing the control file from the operating system. In the following screenshot, we examine the control files on a Windows installation of Oracle within the directory structure we provided to DBCA during database creation:

Note that the two control files shown here are less than 10 M in size. The control file is very efficient in the manner in which it stores critical database information. It is read from and written to very frequently so that it always reflects the current state of the database. Here, the two control files are named CONTROL01.CTL and CONTROL02.CTL. This is the standard way of naming control files, with a number to distinguish each file and the .CTL suffix. Here, we show a similar example of a control file on the Linux platform:

Notice how in our first example our system displayed two control files while the second showed only one. Also note that, in the first example, the two control files have exactly the same size. This is because, when using multiple control files, each one is an exact copy of the other. They contain exactly the same information. Then what would the benefit be of having more than one? Because of the crucial nature of the control file, it is highly recommended to multiplex it. Multiplexing is the act of making more than one copy of the control file for protection against file corruption or loss. The loss of a control file can be a catastrophic event. It is generally recommended that your database have no fewer than two control files, preferably three. It is also highly recommended that your control files exist on more than one physical disk. In a disk-failure situation, having three control files on the same failed disk will not save our system. The locations and names of our control files are defined by the control_files initialization parameter.

The actual filename and path within the operating system must match what is defined by the parameter. From within the data dictionary, we can also query the v$controlfile view to see control file information, as shown in the following screenshot:
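For instance, a query like the following lists the control file locations known to the instance, which should agree with the control_files parameter:

```sql
-- Control file locations from the data dictionary.
SELECT name FROM v$controlfile;

-- The same information from the initialization parameter.
SHOW PARAMETER control_files
```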

Exploring the redo logfiles

We've already discussed Oracle's redo log structure considerably in our discussion of both the redo log buffer and the LGWR process. By way of review, we learned that when changes occur in the database, they are recorded for the purpose of database recovery. They are first written to the redo log buffer and then written out by the LGWR process to the redo logfiles. Thus, the online redo logfiles contain the changes that occur in the database in sequential form. In the event that a database recovery is required, the redo logs can be read and the changes therein applied to a backup to "roll forward" the database into a desired state. Let's take a look at how LGWR writes to the redo logfiles:

For the sake of argument, let's say that LGWR begins writing to redo01a.log, the first redo logfile in the database. LGWR writes buffers from the log buffer to redo01a.log in batches. It continues to write until redo01a.log is full. At that point, a log switch occurs and LGWR begins writing to redo02a.log, the second redo log. Once it's full, another log switch occurs and LGWR begins writing to redo03a.log. Once that log is full, a third log switch occurs. At this point, LGWR begins writing to redo01a.log again, overwriting its contents, and the process continues. As we pointed out earlier in the article, overwriting a redo log defeats the purpose of redo data since the point is to preserve the data necessary for recovery. For that reason, ARCHIVELOG mode is used to ensure that a database is fully recoverable. More on that shortly.

The preceding diagram is the simplest representation of the way redo data is written. Each redo log in the previous example is the member of a group. Thus, redo01a.log is a redo log member of redo log group 1. The other logfiles belong to groups 2 and 3 respectively. Each redo log group can have multiple members. Why would this be necessary? As with the control file, the redo logfiles are crucial to database operations as they deal with database recovery. If a single member is lost or corrupted, the chain of database changes is broken and a database cannot be recovered fully. Owing to this fact, it is highly recommended that we duplex redo logs in a similar fashion to multiplexing control files. In duplexing redo logs, we create multiple redo log members in each group. Each member within a group is an identical copy of the other. So even if one member is lost or corrupted, the chain of redo remains unbroken. Taking this new information into account, we might better represent the redo architecture in the following diagram:

Here, each member of a group is written to simultaneously. The group currently receiving writes from LGWR is called the current log group. When a log switch occurs, LGWR begins writing log buffers to each member of the next group simultaneously, and that group becomes the current log group. By default, DBCA creates three non-duplexed log groups with one member each. At a minimum, the database must contain at least two log groups with one member each, since LGWR must be able to switch to a different group during a log switch and cannot switch to a non-existent group.
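The group and member layout described above can be inspected from the data dictionary, and a group can be duplexed by adding a member (the file path shown is purely illustrative):

```sql
-- Groups, member counts, and which group is CURRENT.
SELECT group#, members, status FROM v$log;

-- The individual member files of each group.
SELECT group#, member FROM v$logfile;

-- Duplex group 1 by adding a second member.
ALTER DATABASE ADD LOGFILE MEMBER '/u01/oradata/orcl/redo01b.log'
  TO GROUP 1;
```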

We can view the redo logfiles from the operating system itself. Our default database created by DBCA will place the redo logs in the same directory as our control files and other database files. An example of this is shown in the following screenshot in a Linux installation of Oracle. The Linux command displays files that begin with the name redo. The redo logs can be viewed in a Windows installation of Oracle using Windows Explorer:

Understanding the archived redo logs

We've discussed the manner in which the Oracle redo architecture operates. We also mentioned that database changes stored in the redo logs are overwritten during log switches if ARCHIVELOG mode is not enabled.

ARCHIVELOG mode instructs the database that we intend to keep a static copy of the contents of each redo log. In this mode, when a log switch occurs, the contents of the current redo log are written out to an archived redo log by the ARCn process. These logs are also referred to as offline redo logs or simply archive logs. Let's extend our redo architecture diagram to include ARCHIVELOG mode and archived redo logs:

Each time a log switch occurs, the contents of the current redo log are written to an archived redo log. In the preceding example, let's say that redo log group 1 is the current group. When a log switch occurs from group 1 to group 2, the identical contents of redo01a.log and redo01b.log are written to an archived redo log, arc001.log. Since the two redo log members contain the same data, only one archived redo log needs to be written. The changes in arc001.log, called change vectors, are the same as those found in redo01a.log and redo01b.log. This continues with groups 2 and 3. When group 3 completes a log switch back to group 1, the members of group 1 are overwritten. However, since we're running in ARCHIVELOG mode in this example, this is not a concern. The contents of the members of group 1 have already been written to arc001.log and can be safely overwritten. When the members of group 1 have been filled a second time, a log switch occurs and a new archived redo log called arc004.log is written. By so doing, all the changes in the database are retained. The oldest changes are found in the oldest archived redo logs, continuing through the remainder of the newer archived redo logs, and then the newest changes are found in the redo logs themselves. When running in ARCHIVELOG mode, a database can be recovered up to the last write that occurred from the redo log buffer to the redo logfiles.

The naming and directory destination of archived redo logs are determined by the parameter values log_archive_format and log_archive_dest_1, respectively. As we mentioned earlier, the ARCn process can be configured as multiple processes that write to multiple destinations. These other destinations can be other filesystem directories, tape drives, and so on. We use the log_archive_dest_1 parameter when our database is configured to have one destination, the log_archive_dest_2 parameter to configure a second destination, and so on. We can use the archive log list command from a SQL*Plus prompt to determine our archiving status, as shown in the following screenshot; these commands work in Windows and Unix-based systems:

From the information presented here, we can see from the values for Database log mode and Automatic archival that our database is not in ARCHIVELOG mode. We can also see the directory to which archived redo logs would be written by Archive destination, were ARCHIVELOG mode to be enabled. Lastly, we see the value for the current redo log sequence number from Current log sequence. Each time a log switch occurs, this value increases.
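For reference, switching a database into ARCHIVELOG mode is done from the MOUNT state; a typical SQL*Plus sequence looks like the following (a sketch only; in a real system, take a backup first):

```sql
-- The database must be cleanly shut down and mounted, not open.
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Confirm the new archiving status.
ARCHIVE LOG LIST
```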

Examining Oracle's core infrastructure datafiles

The heart of any database is the data that it contains. The importance of concepts such as high availability, high performance, and high recoverability all stem from the idea that the database exists for the purpose of preserving and delivering data. The structure that contains this data is of the greatest importance. In Oracle, that structure is known as a datafile. All data that is readable to user sessions is stored in datafiles. The data contained in memory caches is transient—it is dynamic and can be lost by a simple loss of power to the database server. The data contained in datafiles is persistent—it survives beyond a server shutdown in non-volatile storage such as that found on the platters of a hard drive. Further, while the size of memory caches is limited by the total amount of memory in the system, the total data contained in the datafiles is essentially unlimited. Every datafile is associated with a logical structure called a tablespace. A tablespace is essentially the logical manifestation of one or more physical datafiles. So, rather than referring to table data that is stored in the E:\APP\ORACLE\ORADATA\ORCL\USERS01.DBF datafile, we can simply refer to it as the USERS tablespace. A tablespace can be composed of multiple datafiles, but a datafile can belong to only one tablespace. It's important to understand that a tablespace is only a logical name; the physical files that make up a tablespace are stored on disk and can generally be viewed using operating system tools.
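The mapping between logical tablespaces and their physical datafiles can be seen in the dba_data_files dictionary view:

```sql
-- One row per datafile, grouped by owning tablespace.
SELECT tablespace_name, file_name, bytes
FROM   dba_data_files
ORDER  BY tablespace_name;
```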

While the majority of customer data is held within datafiles, Oracle also keeps its own metadata within datafiles. So, there are certain datafiles that exist in any Oracle database, which we might refer to as its infrastructure. In the same manner as background processes and memory caches, each has a particular function.

Understanding the SYSTEM tablespace

In Oracle, table data is stored in datafiles. It follows, then, that the data dictionary must reside in datafiles as well. In fact, the data dictionary exists in the datafiles of the SYSTEM tablespace. Because it holds all of Oracle's metadata about itself, the SYSTEM tablespace is the most essential tablespace of all. Every datafile is given a number when it is created. The first datafile that makes up the SYSTEM tablespace is always given the number 1, since it is created first. The standard naming convention for the SYSTEM datafile is system01.dbf, with any additional datafiles being suffixed with 02, 03, and so on. The SYSTEM datafile is one of only two that are required to create a database, the second being the datafile for the SYSAUX tablespace.

Understanding the SYSAUX tablespace

Prior to Oracle Version 10g, getting performance-tuning metrics from the database could be a complicated task. Doing so required an extensive knowledge of the data dictionary, which is vast in its own right. As a result, there were many different performance-tuning methodologies, each with its own vocal proponents. With Version 10g, Oracle attempted, with a high degree of success, to consolidate (or perhaps replace) many of these methods with a single set of performance-oriented metadata tables called the Automatic Workload Repository, or AWR. As we mentioned earlier in the article, the MMON process collects performance information for the AWR in the form of snapshots containing database statistics—various point-in-time performance measurements. Sets of statistics can be compared to form metrics, or measurements of statistical change over time. The AWR is, in essence, a second data dictionary, one that focuses exclusively on database performance metrics. This data must be stored persistently in the database in order to be useful. The designers of Oracle 10g chose not to store this information in the SYSTEM tablespace, where it might create contention with the data dictionary tables. Instead, they separated this data into its own tablespace—the SYSAUX tablespace. The SYSAUX tablespace contains the data that makes up the AWR. The amount and scope of the performance data stored in the datafile of the SYSAUX tablespace is immense, so the AWR has its own mechanisms for aging this data out of the SYSAUX tablespace, keeping it to a reasonable size. The standard convention for naming the SYSAUX tablespace datafile is sysaux01.dbf.

Exploring the temporary tablespace

In our earlier discussion of the PGA, we mentioned the manner in which Oracle handles sorting. We learned that when a sort requires more memory than can be allocated in the PGA, the data is written out to the temporary tablespace. The temporary tablespace serves as a storage area for any temporary objects. When operations occur in the PGA, it is possible that the memory allocations given to users could expand beyond the boundaries that have been established. When this occurs, the overflow can be written out as temporary segments in the temporary tablespace, preventing the exhaustion of the PGA's memory during a user operation, such as a large sort. When the operation for a given session is complete, these segments are merged and the results returned to the user session. While we normally associate the temporary tablespace with the temporary segments that form during an overflow of a sort in the PGA, it is also used for many other types of operations, including table joins and index creation. It can also store a special type of table known as a temporary table. The temporary tablespace allows us to make a distinction between permanent objects, such as tables, and temporary objects, such as the temporary segments from a sort.

In early versions of Oracle, this distinction was more or less in name only. In more recent versions, the structure of the temporary tablespace and the segments it holds has undergone a structural change that allows them to operate more efficiently. We even make a distinction between the files that make up a permanent tablespace and the files that make up a temporary tablespace. Instead of datafiles, we refer to them as tempfiles—the files that make up a temporary tablespace.
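Accordingly, tempfiles are exposed through their own dictionary view, dba_temp_files, rather than dba_data_files:

```sql
-- Tempfiles belonging to temporary tablespaces.
SELECT tablespace_name, file_name FROM dba_temp_files;
```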

Understanding the Undo tablespace

The majority of database operations exist in the form of a transaction—a unit of work done within the database. Transactions are formed when a DML statement is paired with a transaction control statement. A DML statement that initiates a transaction can be an INSERT, UPDATE, or DELETE statement, and transaction control is executed using a COMMIT or ROLLBACK statement. The primary concept behind a transaction is that the outcome of a DML statement is not truly realized within database tables until a transaction control statement has been executed. For instance, say that we change 100 rows in a table using an UPDATE statement. That change is not completed within the table until a COMMIT statement is executed. Although the change appears to the user who executes the UPDATE statement, no other user can see the change until the COMMIT occurs. In a sense, the changes are not "real" until the COMMIT occurs. Why would an RDBMS operate in this way? Transaction control allows the RDBMS to operate under the principles of consistency that guarantee the reliability of transactions. Under this model, a transaction is not truly complete unless all of its statements complete; transactions complete as a unit. This also allows for the concept of a ROLLBACK statement. If an incorrect statement is issued, the user is allowed to "roll back" that statement, returning the data to its previous state.

In order for the principles of transaction control to work, a location must exist for storing the before-image of the data—the data as it was before a change was made. In the example of our 100-row UPDATE statement, the data that was in the columns before the statement was executed must be stored somewhere so it can be referenced in the event of a rollback. In Oracle, this data is primarily stored in the Undo tablespace. The Undo tablespace stores before-image transactional data in the form of undo segments. In previous versions of Oracle, before-image data was stored in rollback segments. When a user session was created, a rollback segment was assigned to the user to store this transactional data. However, the rollback segment architecture was finite in its ability to assign disk resources to transactions, leading to many failed transactions. In later versions, Oracle introduced the concept of automatic undo management, in which undo segments are allocated automatically from a shared pool of disk space—the Undo tablespace. This creates a more flexible environment in which a session can be allocated more disk space for before-image data in the event of long-running DML operations that change a large amount of data.
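With automatic undo management, a DBA might configure the Undo tablespace along these lines; the tablespace name, file path, and sizes are illustrative:

```sql
-- Create a new Undo tablespace
CREATE UNDO TABLESPACE undotbs2
  DATAFILE '/u01/oradata/orcl/undotbs2_01.dbf'
  SIZE 1G AUTOEXTEND ON MAXSIZE 10G;

-- Enable automatic undo management (a static parameter,
-- so the change takes effect at the next restart)
ALTER SYSTEM SET undo_management = AUTO SCOPE=SPFILE;

-- Direct the instance to the new Undo tablespace
ALTER SYSTEM SET undo_tablespace = undotbs2;

-- Retain committed undo for at least 900 seconds
ALTER SYSTEM SET undo_retention = 900;
```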

Understanding non-specialized datafiles

While the datafiles we have examined so far form the basic infrastructure of the database, the remainder of the physical file structures in Oracle are standard datafiles that contain application data. In most production databases, these datafiles take up the vast majority of the actual space used within the database. Datafiles such as these are not specialized for any particular operation other than the storage of database objects. Again, a tablespace is the logical name given to any set of datafiles.

The real-world DBA

Typically, database administrators create tablespaces with the goal of segregating different types of data. If an application uses certain tables that contain corporate financial data and other tables that represent personnel data, these two sets of tables are generally stored in two separate tablespaces—a FINANCE tablespace and a PERSONNEL tablespace, perhaps. Such a standard leads to a logical separation for clarity but can also improve database performance.
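The segregation described above might be set up as follows; the tablespace names, file paths, sizes, and table definitions are illustrative:

```sql
CREATE TABLESPACE finance
  DATAFILE '/u01/oradata/orcl/finance01.dbf' SIZE 2G;

CREATE TABLESPACE personnel
  DATAFILE '/u01/oradata/orcl/personnel01.dbf' SIZE 1G;

-- Place each table in the tablespace appropriate to its data
CREATE TABLE gl_transactions (
  txn_id   NUMBER PRIMARY KEY,
  amount   NUMBER(12,2),
  txn_date DATE
) TABLESPACE finance;

CREATE TABLE employees_hr (
  emp_id    NUMBER PRIMARY KEY,
  last_name VARCHAR2(50)
) TABLESPACE personnel;
```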

Taking our datafile architecture into account, we find that the theoretical limits on the size of our database are difficult to imagine. The maximum size of a datafile in Oracle is 4,194,304 database blocks. Given that the maximum size of a database block is 32 K, this gives us a maximum datafile size of 4,194,304 x 32,768 bytes, equaling 137,438,953,472 bytes, or 128 GB. Taking into account that the maximum number of datafiles in a database is 65,533, this allows us a maximum size of nearly 8 Petabytes for the database. However, Oracle also allows us to use a special kind of datafile called a bigfile in special tablespaces known as bigfile tablespaces. Bigfiles can be dramatically larger than typical datafiles (called smallfiles in this context), with a maximum size per bigfile of 128 TB. Thus, the maximum theoretical size limit for an Oracle database using bigfiles is 128 TB x 65,533 files, equaling nearly 8 Exabytes, or roughly 9.2 x 10^18 bytes.
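The arithmetic can be checked directly from the SQL prompt; the figures below assume the 32 K maximum block size:

```sql
-- Smallfile limits: 4,194,304 blocks per file, 65,533 files per database
SELECT 4194304 * 32768          AS smallfile_bytes,    -- 137,438,953,472 (128 GB)
       4194304 * 32768 * 65533  AS smallfile_db_bytes  -- just under 8 PB
FROM   dual;

-- Bigfile limits: a bigfile can hold 4,294,967,296 (2^32) blocks
SELECT 4294967296 * 32768          AS bigfile_bytes,    -- 140,737,488,355,328 (128 TB)
       4294967296 * 32768 * 65533  AS bigfile_db_bytes  -- just under 8 EB
FROM   dual;
```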

To complete our look at the database architecture, we refer once more to our architecture diagram and the section focusing on database files:

Exploring Oracle Real Application Clusters

Now that we've fully explored the Oracle database architecture, a question might be raised. "Why make a distinction between the instance and the database at all? Don't they both operate on the same system?" By way of answer, yes; a vast majority of Oracle systems run on a single server. The background processes use a single set of CPUs and the caches run within a single set of memory chips. The database files exist on either storage that is within the machine or connected to it. Both the instance and database run on a single system. However, the Oracle RDBMS can be run in a different architecture—one in which the instance and database are physically independent.

Let's imagine an example for a moment. Let's say that we are DBAs for a small but growing company and we're responsible for the backend database that runs a customer billing system. We've chosen a small-sized server platform with two dual-core CPUs and 8 GB of RAM. This single machine connects to a small SAN (Storage Area Network) for the purpose of disk and power redundancy. After two years of running our system, we begin to notice performance problems at peak business hours. Upon further inspection, it becomes clear that the problem is the machine itself. When the database is the most active, the server resources are peaking at 100 percent utilization.

We address this by purchasing and configuring a new database server—one with more CPU and RAM. This works well for a few more years, but eventually the database usage of our growing company has once again overtaxed our server resources. We must then purchase another server, and so on. Our core problem in this case is scalability—our database architecture does not scale to meet the needs of our company. In addition to this, as our billing database becomes more and more important, it's clear that it exists on a single point of failure—the server itself. If the server loses power or malfunctions, there is no way for customers to access the database. Since our business requires 24 x 7 uptime, there is also an availability problem. What we really need is a scalable, high-availability solution. Oracle provides just such a solution (at an additional cost) in the form of Oracle Real Application Clusters, or RAC.

Oracle RAC first began as a product known as Oracle Parallel Server, or OPS. OPS was Oracle's first attempt at providing a clustered database solution, and it differed from the single-instance database architecture we've discussed. In OPS, multiple servers were connected to a single shared source of disk. These servers each operated with their own instance—their own background processes and memory caches. However, all of these machines shared the same single source of database files. Thus, OPS systems ran multiple instances, but one database. With OPS, Oracle had created a system with both high scalability and high availability. It was scalable since, if we possessed a two-machine system that was taxed in terms of server resources, we could simply add another server to the cluster. It was highly available as well since, in the event of the loss of a server, the connections to access database data could simply route through another instance. With the release of Oracle 9i, some significant, performance-oriented architecture changes were made, and Oracle Parallel Server was renamed Oracle Real Application Clusters, or RAC.

In Oracle RAC, servers are connected to a single source of disk. This is usually a Storage Area Network (SAN) or Network Attached Storage (NAS). The data on the SAN or NAS can be "seen" by every server, referred to as a node, in the cluster. Each node has its own instance and can process user requests independently from the other nodes. However, the nodes in the cluster can also share information between instances. Each node is also connected to every other node by way of a high speed network switch or interconnect. In the event that one node has a particular block of data in the SGA that is needed by one of the other nodes, it doesn't have to write that block back to disk first. Instead, it can be shared directly across the interconnect to the node that needs it. We might diagram an RAC system as shown in the following diagram:
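On a running RAC cluster, the GV$ views return one row per instance rather than the single row of their V$ counterparts. A query such as the following, run from any node, lists every instance in the cluster:

```sql
-- One row per instance in the cluster
SELECT inst_id, instance_name, host_name, status
FROM   gv$instance
ORDER  BY inst_id;
```

On a single-instance database, the same query simply returns one row, which illustrates the multiple-instances, one-database architecture of RAC.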

Thus, with RAC, there is an important distinction between the instance and the database in Oracle.


In this article, we've introduced a lot of terms and concepts to explain the internal architecture of the Oracle RDBMS. Throughout the course of this book, we will continue to revisit these concepts and expand on them as we go, in an iterative fashion. We've examined the components of the Oracle instance, the background processes, and the memory caches. We've explored each of these components in greater detail, such as looking at each of the aspects of the caches. We've connected this to the other side of the Oracle RDBMS—the database. We've examined each of the core datafiles that make up the database. In the process, we've seen how both the instance and database work together to form the Oracle architecture. We've also examined Oracle RAC, an alternative to the single-instance architecture that offers scalability and high availability.

You've been reading an excerpt of:

OCA Oracle Database 11g: Database Administration I: A Real-World Certification Guide
