Oracle 11g Streams Implementer's Guide

By Ann L. R. McKinnell, Eric Yen

About this book

From smaller businesses through to huge enterprises, users all over the world often require access to data 24 hours a day. Distributed database systems have proliferated in the world of data sharing, providing the ability to access real-time data anywhere, anytime. Oracle Streams, a built-in feature of the Oracle database, is a data replication and integration feature critical to the success and wellbeing of enterprises in today's fast-moving economy. This book provides the reader with solid techniques to master Oracle Streams technology and successfully deploy distributed database systems.

This book quickly goes over the basics and gets you up and running with a simple Oracle 11g Streams environment. It will serve as an excellent companion to the Oracle Streams Administration Guide. It is intended for Oracle database architects and administrators, and provides in-depth discussion on must-know information for the design, implementation, and maintenance of an Oracle Streams environment.

The book does not attempt to regurgitate all the information in the Oracle Streams Administration Guides, but rather provides additional clarification and explanation of design, implementation, and troubleshooting concepts that are often elusive in Streams documentation. It also identifies helpful tools and Oracle resources to add to your knowledge base, as well as tried and tested tricks and tips to help you tame Oracle Streams.

The book starts by introducing and explaining the components of Oracle Streams and how they work together. It then moves on logically, helping you to determine your distributed environment requirements and design your Streams implementation to meet those requirements. Once these concepts are discussed, the book moves to configuration and basic implementation examples to help solidify those concepts. It then addresses advanced features such as tags, downstream capture, and conflict resolution. You then move on to maintenance techniques such as documenting the environment, effectively planning and implementing changes to the environment, and monitoring and troubleshooting the environment. When you have studied the techniques and completed the hands-on examples, you will have an understanding of Oracle Streams' core concepts and functionality that will allow you to successfully design, implement, and maintain an Oracle Streams environment.

Publication date:
December 2009
Publisher
Packt
Pages
352
ISBN
9781847199706

 

Chapter 1. All the Pieces: The Parts of an Oracle 11g Streams Environment

Oracle Streams provides a flexible data-distribution architecture, founded on Advanced Queuing. This architecture allows users to distribute data and changes not only between Oracle databases, but also to and from non-Oracle databases. The architecture supports flexibility far beyond that of Oracle's Advanced Replication, allowing users to direct, manipulate, and transform data changes as they flow through the distributed environment.

In the Preface of this book, we have discussed the concepts of replication and distributed systems and why they are used. Often, Streams is used to replicate data between different business units. One example of this is using Streams to replicate data from local offices to headquarters, or vice versa, allowing the local creation of reports at the destination or target database.

Streams is built into the database and is part of Oracle Enterprise Edition. As such, Streams has tighter integration with the database than any third-party product. Streams can be configured between single-instance databases or used with Oracle Real Application Clusters.

This chapter provides a high-level review of:

  • Streams architecture overview

  • Capture process

  • Instantiation

  • Propagation process

  • Apply process

  • SCN co-ordination

  • Logical Change Records (LCRs)

  • Memory and storage architecture

  • Data (DML) Change Auditing via Streams Change_Tables

  • A brief word on XSTREAMS

The information presented in this chapter provides a quick overview of Oracle Streams components. Each component is introduced and some detail is provided. In subsequent chapters, we review Streams design considerations and database configuration, then move on to setting up our first Oracle Streams environment. For more detailed information on these components, please refer to the Oracle 11g Streams Concepts and Administration Guide. Throughout the chapter, we also provide references to other Oracle documentation that is beneficial in gaining a detailed understanding of the component discussed.

Streams architecture overview

Let's take a moment to briefly run through the replication topologies and process flow, and identify the Oracle functional components that are used by Streams.

Topology configurations

Distributed topology configurations are as limited as rocks in a river. However, many are not conducive to an efficient and maintainable system. The number one consideration when choosing a distributed topology is data ownership and conflict resolution requirements, as discussed in Chapter 2, Plot Your Course: Design Considerations. To keep your Streams from becoming wild, untamed rivers, and drowning the DBAs, keep master sites to a minimum, if at all possible, and keep data flowing in a tightly controlled and synchronized manner. Oracle recommends no more than four masters involved in full-on all-site N-way replication, and the authors second that recommendation with gusto.

In the Preface, we briefly described single-source, multiple-source, hybrid, and heterogeneous configurations.

The following images provide a synopsis of succinct, controlled single-source and multiple-source configuration examples. Of course these are not the only viable configurations, but they will help you to start developing a feel of how to control your Streams.

Keep in mind that the relationship between source and target (secondary) databases assumes they share the same data at some level. Two databases that handle totally different data would not be considered a source or secondary site to each other.

Single source

In a single-source configuration there is only one database that is the source of data changes that are being Streamed to other database site(s). At the other site(s)/target(s), the data can be applied and/or forwarded to another database. If data changes are forwarded from a destination database, the configuration is called a directed network. There are two types of directed network forwarding configurations: Queue forwarding and Apply forwarding. Queue forwarding involves propagating the change to a target database site but not applying the change at that database. Rather, the destination only forwards the change to a subsequent site to be applied down the line. Apply forwarding will apply the change, and forward it to subsequent destinations via local Capture and Propagation. Destination sites configured as Queue or Apply forwarding sites are often referred to as intermediate sites. Another single-source configuration consists of a copy of the actual redo logs being forwarded from the source database to a "downstream" target database. The actual Capture process and queue are configured on the downstream database rather than on the source database. This configuration is called Downstream Capture, which is explained in more detail later in this chapter and in Chapter 6, Get Fancy with Streams Advanced Configurations.

In a single-source environment, steps should be taken to prevent changes being made at the secondary destination databases, in order to avoid data divergence and change conflicts.

Some illustrated examples of single-source configurations with a brief explanation of where the Streams processes are located are shown as follows.

Single database

A single database configuration hosts both the Capture and Apply processes on the same database. This can negate the need for a Propagation process as the Apply process can be created to dequeue from the same buffered queue into which the Capture process enqueues. However, there may be circumstances where you wish to configure separate buffered capture queue and persistent apply queues. In this case you would also configure a Propagation process between the two queues. The Apply process can be assigned an apply handler that processes the LCRs in a specific manner. This type of configuration can be used to support client application access to captured LCR data and heterogeneous configurations.

Uni-directional

In a single-source to single-target configuration, the Capture and Propagation processes and the Capture queue are located at the source database. The Apply process and queue reside on the target.
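A uni-directional schema-level replication of this kind can be configured in a single call with the DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS procedure, which creates the Capture, Propagation, and Apply processes and queues for you. The sketch below is illustrative only; the schema name and the database global names (SRC.WORLD, TGT.WORLD) are hypothetical, and network instantiation is assumed so that no Data Pump dump files or directory objects are needed:

```sql
BEGIN
  DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS(
    schema_names                 => 'hr',          -- schema to replicate (assumed)
    source_directory_object      => NULL,          -- not needed for network instantiation
    destination_directory_object => NULL,
    source_database              => 'SRC.WORLD',   -- hypothetical global names
    destination_database         => 'TGT.WORLD',
    bi_directional               => FALSE,         -- uni-directional: source to target only
    include_ddl                  => TRUE,
    instantiation                => DBMS_STREAMS_ADM.INSTANTIATION_SCHEMA_NETWORK);
END;
/
```

Run as the Streams administrator on the source; the procedure also performs the instantiation of the schema at the target over a database link.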

Cascaded/directed network

In a directed network configuration, Capture and Propagation processes and Capture queue reside on the Source. For Queue forwarding, the forwarding site has a queue, but no Apply process. For Apply forwarding, the forwarding site is configured with an Apply process and queue as well as a local Capture process and queue. Tags (discussed in Chapter 6) are used to coordinate the local "recapture" of the applied changes. Appropriate Propagation processes are configured from the forwarding site Capture queue to the destination sites. The final destination site is configured with a regular Apply process and queue.

Hub-and-Spoke

In single-source Hub-and-Spoke configuration, data is Streamed from one source to multiple destinations (targets). This is often seen in "Headquarter to Branch (or Regional)" configuration. With this type of configuration, there is a Capture process and queue at the source as well as a Propagation process to each destination. An Apply process and queue are configured on each of the destinations.

Multiple source

In a multiple source Streams configuration, shared data can be changed at two or more sites. A situation can arise where there is conflict caused by DML/DDL originating from one or more databases acting on the exact same data at nearly the same time. To overcome this conflict situation, conflict resolution must be implemented to determine which data changes should be kept and which should be discarded.

Below are some illustrated examples of multiple-source configurations.

Bi-directional/N-way/Master-to-Master

Bi-directional, N-way, Master-to-Master are all names for essentially the same thing. This configuration allows data changes to be made at all master sites and replicated to all other master sites. As such, Capture, Propagation, and Apply processes and queues must be configured at each master site. For the Capture processes at each site, you can choose to configure a single Capture process and queue for all Propagation processes, or a separate Capture process and queue for each Propagation process.

Uni-directional Spokes-to-Hub

In this configuration, the SPOKES are the origination point of the data change and the HUB is the destination. Capture and Propagation processes and a Capture queue are configured on each of the SPOKE sites. An Apply process and queue for each SPOKE site is configured on the HUB. Conflict resolution should be configured at the HUB to accommodate potential data change conflicts from multiple sources.

Bi-directional Spoke-to-Hub

This configuration is just an extension of uni-directional Spoke-to-Hub that allows the HUB to send its changes to each spoke. This means that at least one Capture process and queue must be configured on the HUB, and a Propagation process configured to each SPOKE. Note here that the HUB processes should be configured so that the HUB does not send the same change back to the SPOKE that originated it. This can be accomplished in a number of ways by using tags, and/or conditions (covered in Chapter 6). In addition, an Apply process and queue must be configured at each SPOKE to receive and process changes from the HUB, as well as the Capture process and queue and Propagation process to the HUB that we use in uni-directional Spoke-to-Hub.

Hybrid

A Hybrid configuration is simply a combination of single-source and/or multiple-source configurations. For instance, one leg of your topology could be a directed network, while another leg could be a master-to-master. The trick is all in where you put your Capture, Propagate, and Apply processes.

Heterogeneous

Heterogeneous configurations include a source or target database that is not an Oracle database. Oracle Heterogeneous Gateways can be used to allow an Oracle database to send and receive changes to and from these non-Oracle sources. The gateways provide a "translation" level that converts Oracle SQL syntax to and from non-Oracle SQL syntax. This allows the users to transparently accomplish equivalent SQL operations from an Oracle database to a non-Oracle database. Oracle 11gR2 delivers the following Gateways to non-Oracle database platforms:

  • Adabas

  • APPC

  • DRDA

  • IMS

  • Informix

  • ODBC

  • SQL Server

  • Sybase

  • Teradata

  • VSAM

In a heterogeneous environment, the Apply process and queue must still exist in an Oracle database and be configured to use a database link to the non-Oracle database. The source database may or may not be an Oracle database. It could be that the Oracle database is merely an intermediate database in a directed network, where a client application enqueues LCRs constructed from data at a non-Oracle database into the Apply queue. Regardless of how the LCR is enqueued in the Apply queue, the Apply process on the Oracle database uses Heterogeneous Services and Transparent Gateway to apply LCR changes directly to database objects in a non-Oracle database through the database link to the non-Oracle database. In other words, the Apply process unpacks the LCR and constructs the necessary DML statement on the Oracle side. It then executes the statement through the gateway database link, using Heterogeneous Services to translate the SQL to the proper non-Oracle SQL syntax. Capture and Propagation are configured at the source database. If the Apply process for the non-Oracle database is configured on the source database, Propagation between the Capture and Apply would not be required. A remote Oracle destination database can also be configured to apply the change to a non-Oracle database.
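The key piece of this configuration is the apply_database_link parameter of DBMS_APPLY_ADM.CREATE_APPLY, which tells the Apply process to execute the reconstructed DML through the gateway link rather than locally. A minimal sketch, assuming a Streams administrator schema strmadmin, an existing queue, and a hypothetical gateway database link named sqlserver_link:

```sql
BEGIN
  DBMS_APPLY_ADM.CREATE_APPLY(
    queue_name          => 'strmadmin.hetero_queue',  -- existing Apply queue (assumed)
    apply_name          => 'hetero_apply',
    apply_database_link => 'sqlserver_link');         -- hypothetical gateway DB link
END;
/
```

Rules directing changes to this Apply process are then added with DBMS_STREAMS_ADM as usual; remember that such an Apply process can apply DML only.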

As mentioned above, data can also be sent to an Oracle database from a non-Oracle source. This can be done with Advanced Queuing and XSTREAMS or JMS. Again, the Apply queue and process are on the Oracle database. The non-Oracle database interface must be configured to enqueue a message to the Apply queue on the Oracle database. That message is then dequeued and processed by an Oracle advanced queue process.

Important: Apply processes for a non-Oracle database can only apply DML, not DDL.

Note

For more information on Streams Configurations, please reference the Oracle Streams Extended Examples manual.

Simultaneous versus Synchronous replication

Synchronous replication in a distributed environment means that a change must be successfully committed at the source and all destination sites, or it is not committed at any site, including the source site.

As mentioned in the Preface, Streams is, by nature, asynchronous replication. The very fact that the change must be committed at the source site before it is even sent to other sites means Streams is not a synchronous method of replication.

Even if you use Synchronous Capture, it is still not synchronous replication. The "synchronous" in Synchronous Capture refers to the enqueueing of the change to the Capture queue when it is committed to the source data, rather than having LogMiner mine the redo logs, find the change, and enqueue it. This does not mean that the change is successfully committed to the intended destination database data.
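A Synchronous Capture is not created with the usual Capture procedures; it is created implicitly by adding table-level rules with streams_type set to 'sync_capture' via DBMS_STREAMS_ADM.ADD_TABLE_RULES. A minimal sketch, with hypothetical table, capture, and queue names (note that Synchronous Capture supports only table and subset rules, and DML only):

```sql
BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name   => 'hr.employees',            -- assumed table
    streams_type => 'sync_capture',            -- creates a Synchronous Capture
    streams_name => 'sync_capture01',
    queue_name   => 'strmadmin.capture_queue');-- persistent (not buffered) queue
END;
/
```

The change is enqueued as part of the capturing mechanism at commit time rather than mined from the redo logs afterward, which is what earns it the "synchronous" label.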

Can Streams be simultaneous (or near-simultaneous, depending on data transfer and processing rates)? Yes: Synchronous Capture, and the combined Capture and Apply (new in 11g and discussed later in this chapter), support simultaneous replication (though they cannot be used together). They reduce the mining, enqueueing, and dequeueing work required by normal implicit Streams. Just remember: we may be able to get the change to the other sites very quickly, but we cannot guarantee 100 percent that the change will be committed at the destination.

The moral of the story is: Streams replication, as it is today, can be "simultaneous", but it can never be "synchronous".

Oracle's Streams replication process flow

  • A change is captured from the database redo stream via LogMiner, or via the Synchronous Capture mechanism

  • Any defined capture rules/transformations are applied to the change

  • The Captured Change is molded into a Logical Change Record (LCR)

  • The LCR is stored as a message in a specialized advanced queue to be sent to the target site

  • The propagation job for the target site consumes the message, applies any defined propagation rules/transformations to the message, and sends it to a specialized advanced queue at the target site

  • Oracle's advanced queuing guaranteed, fail-safe Propagation protocol ensures receipt of the message and coordinates the success/error result and handling of the advanced queue messaging

  • The Apply process at the target site consumes the message from the advanced queue

  • Any defined Apply rules/transformations are applied to the change

  • The Apply process then attempts to apply the change to the target site

  • All LCR transactions are validated at the target database by conflict detection to ensure the data is consistent between the source and target databases prior to applying the change

  • When data inconsistencies are found by conflict detection:

    • If conflict resolution is defined, it is applied to the LCR data inconsistency

    • If conflict resolution is not defined, or fails to resolve the LCR data inconsistency, the LCR is not applied at the target but retained in the erred transaction queue for manual resolution
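The erred transaction queue in the last step can be inspected and worked through the DBA_APPLY_ERROR view and the DBMS_APPLY_ADM error procedures. A sketch of the manual-resolution workflow (the transaction ID shown is purely illustrative):

```sql
-- Inspect erred transactions retained at the target
SELECT apply_name, local_transaction_id, error_message
  FROM dba_apply_error;

-- After correcting the underlying data conflict, retry one transaction
BEGIN
  DBMS_APPLY_ADM.EXECUTE_ERROR(local_transaction_id => '13.16.334');  -- hypothetical ID
END;
/
```

DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS and DELETE_ERROR are also available for retrying or discarding erred transactions in bulk or individually.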

Streams components

The following Oracle components are used to support the Streams process flow:

  • LogMiner: Captures the changes at the originating site.

  • Advanced Queuing: Used to support transporting changes between sites.

  • Capture, Propagate, Apply database processes: Persistent database processes that accomplish the Capture, Propagation, and Apply tasks.

  • Capture, Propagate, Apply rules/transformation via PL/SQL: PL/SQL blocks that define how data should be manipulated by the various processes.

  • Logical change record types: Specialized record types used by Streams to store and manage database change message payloads.

  • Database links/Oracle Net: Provide operating-system-independent connectivity between database sites.

  • User authentication/authorization: Provides security access at the database connection and object levels.

  • Guaranteed fail-safe propagation protocol: This ensures that a message is successfully delivered and enqueued at the destination site. If an error occurs, the propagation schedule is marked with an error at the originating site for manual resolution and the message is retained in the Capture queue until it can be propagated.

  • Conflict detection: Internal protocol that determines if the record to which the change is to be applied matches the record at the originating site before the change was made. This supports data change synchronization.

  • Conflict resolution via PL/SQL: Supplied or user defined PL/SQL blocks used to resolve data conflicts found via conflict detection.
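When the guaranteed fail-safe propagation protocol marks a propagation schedule with an error, that state is visible at the originating site in the DBA_PROPAGATION view. A quick health check might look like the following (in 11g this view carries STATUS, ERROR_MESSAGE, and ERROR_DATE columns):

```sql
-- Find propagations that are not running cleanly
SELECT propagation_name, status, error_message, error_date
  FROM dba_propagation
 WHERE status <> 'ENABLED';
```

Monitoring queries like this one are covered in more depth in the maintenance and troubleshooting chapters.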

 


  • User authentication/authorization: Provides security access at the database connection and object levels.

  • Guaranteed fail-safe propagation protocol: This ensures that a message is successfully delivered and enqueued at the destination site. If an error occurs, the propagation schedule is marked with an error at the originating site for manual resolution and the message is retained in the Capture queue until it can be propagated.

  • Conflict detection: Internal protocol that determines if the record to which the change is to be applied matches the record at the originating site before the change was made. This supports data change synchronization.

  • Conflict resolution via PL/SQL: Supplied or user defined PL/SQL blocks used to resolve data conflicts found via conflict detection.

 

About those Queues


Throughout our discussion on the Streams processes, we mention the Advanced Queues used by Streams to transport changes. These queues are either in-memory (buffered queues) or tables on disk (persistent queues). Oracle Streams uses both buffered queues and persistent queues. A buffered queue can only be an ANYDATA queue, while a persistent queue can be an ANYDATA or a TYPED queue.

ANYDATA and TYPED refer to the payload datatype of the message handled by the queue. An ANYDATA queue's payload is of the SYS.ANYDATA datatype. A TYPED queue has a specific datatype (such as Varchar2, CLOB, BLOB, Number). To determine the payload type of a queue, query the OBJECT_TYPE column of the DBA_QUEUE_TABLES view.

select owner, queue_table, object_type from dba_queue_tables;

The Oracle memory segment used by buffered queues is part of the STREAMS_POOL in the SGA. The type of queue used by Streams depends on the type of LCR that is being stored. Captured and buffered LCRs are stored in buffered queues. Persistent LCRs are stored in persistent queues.
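To see how much of a buffered queue is in memory versus spilled to disk, you can query the V$BUFFERED_QUEUES view. The following is a quick sketch; the columns shown are standard, but the output depends entirely on your environment:

```sql
-- Check buffered queue usage: messages currently in memory
-- versus messages that have spilled from memory to disk.
SELECT queue_schema,
       queue_name,
       num_msgs,    -- messages currently in the buffered queue
       spill_msgs   -- messages spilled to disk
  FROM v$buffered_queues;
```

A steadily growing SPILL_MSGS count is often an early sign that the STREAMS_POOL is undersized or that a destination is consuming messages too slowly.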

For more information on Streams Queues, review the Introduction to Message Staging and Propagation section of the Oracle Streams Concepts and Administration user's manual.

It is always helpful to understand the whole picture and the pieces that make up the picture. As such, we start with the image as follows:

We use the image above as a reference in this chapter to explain the following processes:

  • Capture

  • Instantiation (Not in image above)

  • Propagate

  • Apply

So, let us start covering each of the main processes and components and their roles in the Streams environment.

 

Capture process: what are we supposed to stream?


The Capture process uses both LogMiner and Advanced Queuing to accomplish its task (Note: Synchronous Capture uses internal triggers instead of LogMiner). The Capture process uses a LogMiner process to examine the database redo log for changes. A Capture process references a set of user-defined rules that determines exactly what needs to be captured for the Stream. These Capture rules identify specific changes to be captured from the redo logs.

These changes are then formatted into Logical Change Records (LCRs) and placed (enqueued) into an advanced queue. In most cases, the queue is a buffered queue (more about LCRs and buffered and persistent queues a little later). This method of capture enqueueing is called "Implicit Capture" and is most often used in a Streams environment. The following image shows the process:

The other method of capturing involves user generation and enqueuing of a message directly into a buffered or persistent queue. This method is called "Explicit Capture" and is usually done by application software. These explicit messages can be either a user defined message or an LCR. For a more detailed explanation on Explicit Capture, refer to the Oracle Streams Concepts and Administration Guide.

A Capture process can capture a majority of database transactions. The Capture process specifically captures DML and DDL. The Streams Capture process can capture DML on columns of the following datatypes:

VARCHAR2

NVARCHAR2

FLOAT

NUMBER

LONG

DATE

BINARY_FLOAT

BINARY_DOUBLE

TIMESTAMP

TIMESTAMP WITH TIME ZONE

TIMESTAMP WITH LOCAL TIME ZONE

INTERVAL YEAR TO MONTH

INTERVAL DAY TO SECOND

RAW

LONG RAW

CHAR

NCHAR

UROWID

CLOB with BASICFILE storage

NCLOB with BASICFILE storage

BLOB with BASICFILE storage

XMLType stored as CLOB

In turn, the Capture process can capture the following DDL:

Tables

Indexes

Views

Sequences

Synonyms

PL/SQL packages, procedures, and functions

Triggers

Changes to users or roles

GRANT or REVOKE on users or roles

There are limitations with the Capture process. The following DDL commands are not captured.

ALTER SESSION

ALTER SYSTEM

CALL or EXECUTE for PL/SQL procedures

EXPLAIN PLAN

LOCK TABLE

SET ROLE

NO LOGGING or UNRECOVERABLE operations

FLASHBACK DATABASE

If you take a careful look at the list above, you may notice that these are instance-specific DDL commands. You want to avoid replicating them so that you do not end up corrupting the target instance.

In addition, there are object-specific DDL statements that are not supported by Streams:

CREATE CONTROL FILE

CREATE or ALTER DATABASE

CREATE, ALTER, or DROP MATERIALIZED VIEW LOG

CREATE, ALTER, or DROP MATERIALIZED VIEW

CREATE, ALTER, or DROP SUMMARY

CREATE SCHEMA

CREATE PFILE

CREATE SPFILE

RENAME (Use ALTER TABLE instead.)

Looking at the lists above, one can start to think, "Is there a quick way to tell if my environment can be streamed?" Yes, Oracle Development did provide a quick way to find out. Simply query the DBA_STREAMS_UNSUPPORTED view and you can find out why a particular table cannot be streamed. We suggest that you query this view as part of planning your Streams environment.

SELECT * FROM DBA_STREAMS_UNSUPPORTED;

Pay particular attention to the REASON and AUTO_FILTERED columns. The REASON column is self-explanatory. As for AUTO_FILTERED, if you see a YES value, then Streams automatically filters the object out of the Stream.
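For example, to focus on just the objects that Streams will not silently filter out (and which therefore need to be excluded from your rule sets yourself), a query along these lines can be used; the WHERE clause is our suggestion, not a requirement:

```sql
-- Objects Streams cannot replicate that are NOT automatically
-- filtered; these must be handled in your Capture rules.
SELECT owner, table_name, reason
  FROM dba_streams_unsupported
 WHERE auto_filtered = 'NO'
 ORDER BY owner, table_name;
```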

Possible reasons include:

Index Organized Table (IOT)

Column with user-defined type

Unsupported column exists

Object table

AQ queue table

Temporary table

Sub object

External table

Materialized view

FILE column exists

Materialized view log

Materialized view container table

Streams unsupported object

Domain index

IOT with overflow

IOT with LOB

IOT with physical Rowid mapping

Mapping table for physical row id of IOT

IOT with row movement

Summary container table

The Capture process is the first Streams-specific process. However, if you look again at the diagram, you will see that LogMiner is also in the picture. The Capture process does not do everything by itself; it uses LogMiner to do all the "heavy lifting", taking advantage of LogMiner's ability to mine the database redo logs.

In 9i, the LogMiner tablespace defaulted to the SYSTEM tablespace. As of 10g, it defaults to the SYSAUX tablespace. As there will be additional usage of LogMiner with a Streams environment, we recommend that you isolate the tables related to LogMiner in its own tablespace. This can be accomplished with the following scripts.

CREATE TABLESPACE LOGMNRTS DATAFILE '/u05/oracle/data/logmnrtbs.dbf'
SIZE 100M AUTOEXTEND ON MAXSIZE UNLIMITED;
BEGIN
DBMS_LOGMNR_D.SET_TABLESPACE('LOGMNRTS');
END;
/

This can help eliminate possible fragmentation in the SYSTEM or SYSAUX tablespace where the LogMiner tables are created by default. Depending on your tablespace file to disk distribution, it can also help with performance. If your database has been upgraded from an earlier version, the LogMiner tablespace may well be set to the SYSTEM tablespace. If it is, you are strongly cautioned to use the above method to reset the LogMiner tablespace to a non-system tablespace.

To actually identify the Capture and LogMiner processes that are running on the source database, look for the background processes CPnn for Capture and MSnn for LogMiner on the host, where nn is a combination of letters and numbers. These processes may not be running constantly, so they should be monitored over time. Also, there may be multiple Capture and/or LogMiner processes running.
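From inside the database, the Capture processes are easier to identify by name than by OS process. A simple sketch (the view is standard; the output depends on your configuration):

```sql
-- List Capture processes, their status, and whether each is a
-- local (upstream) or downstream capture.
SELECT capture_name, status, capture_type
  FROM dba_capture;
```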

Downstream Capture

The Capture process usually resides on the Source database. This configuration is called Local Capture (and sometimes Upstream Capture). The Source database is defined as containing both the Capture process and the tables being captured. There is another Capture configuration that can be used called Downstream Capture. For now, we will just give a quick example of when and why a Downstream Capture would be configured.

The Capture process consumes resources (memory and CPU) from the host. This may not be optimal in a high-volume production environment (this is but one case where Downstream Capture comes into play). Downstream Capture allows the Capture process and queue to be moved to another staging node. That staging node is the "worker" that can afford the additional overhead of Capture. Downstream Capture uses standby archived log destinations (just like those used by Data Guard) defined at the source database to direct a copy of the redo to the staging node. The Capture process at the staging node then mines those redo copies and enqueues the necessary LCRs. Propagation processes on the staging node then send the LCRs to the appropriate destination database sites. We will cover Downstream Capture and other advanced configurations in more detail in Chapter 6.
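As a rough sketch, the redo shipping for Downstream Capture is configured on the source with a standby archive destination, much like Data Guard. The service name and DB_UNIQUE_NAME below are purely illustrative assumptions, and a real configuration also involves LOG_ARCHIVE_CONFIG and standby redo/archive settings on the staging node (covered in Chapter 6):

```sql
-- On the SOURCE database: ship a copy of the redo to the
-- downstream (staging) node. 'dwnstrm' is a hypothetical
-- service name / DB_UNIQUE_NAME.
ALTER SYSTEM SET log_archive_dest_2 = 'SERVICE=dwnstrm ASYNC NOREGISTER VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dwnstrm';
ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE;
```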

Synchronous Capture

Synchronous Capture is not Synchronous replication. We need to be clear on this.

Where regular Implicit Capture depends on LogMiner to extract data changes from the redo, Synchronous Capture actually enqueues the data change to its Capture queue directly when the change is committed at the source.

Synchronous Capture (SC) does have some limitations and differences from Implicit Capture. They are as follows:

  • SC can only be created at the Table or Subset levels, not the Schema or Global

  • SC cannot be created using the DBMS_STREAMS_ADM.MAINTAIN_*_SCRIPTS procedures

  • SC uses a persistent queue (queue data is stored on disk), so it requires a slightly different configuration than normal Implicit Capture

  • SC only captures DML, no DDL

  • SC does not capture changes for the following datatypes:

    • LONG

    • LONG RAW

    • CLOB

    • NCLOB

    • BLOB

    • BFILE

    • ROWID

    • User-defined types (including object types, REFs, varrays, and nested tables)

    • Oracle-supplied types (including ANY types, XML types, spatial types, and media types)

  • SC can only capture changes for an Index Organized Table (IOT) if it does not contain any of the above listed datatypes

  • SC is not a valid configuration for Combined Capture and Apply (this requires a buffered (in memory) capture queue)

We will cover Synchronous Capture and other advanced configurations in more detail in Chapter 6.
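To give you a feel for the "slightly different configuration" mentioned above, a Synchronous Capture is created by adding table-level rules with a special streams_type. The queue, table, and process names below are hypothetical:

```sql
-- Create a Synchronous Capture on a single table.
-- SC uses a persistent ANYDATA queue and captures DML only.
BEGIN
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
    queue_table => 'strmadmin.sync_capture_qt',
    queue_name  => 'strmadmin.sync_capture_q');

  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name   => 'hr.employees',      -- hypothetical table
    streams_type => 'sync_capture',      -- this makes it Synchronous Capture
    streams_name => 'sync_capture',
    queue_name   => 'strmadmin.sync_capture_q',
    include_dml  => TRUE);               -- SC cannot capture DDL
END;
/
```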

 

Instantiation


We mention instantiation as part of this chapter to stress its importance. Instantiation refers to the creation of the replicated object at target databases, based on the source object. It also provides Streams with the information needed to determine what transactions to apply at the destination site(s). You can think of it as an agreement (starting point) that needs to be established between the source and destination before any Streaming can be accomplished. The main purpose of instantiation is to prepare the object structure and data at the destination site to receive changes from the source.

Instantiation is a process composed of three steps:

  • Creating the object(s) at the destination Site

  • Updating the Streams data dictionary with metadata

  • Setting the Instantiation SCN for the object(s)

The database objects (tables or other object types) need to exist at both the source and destination sites. Table structures between a source and destination database can differ if there is a transformation or subsetting involved, though often the tables will have the same structure and data. If there is data to be replicated in the table(s), then the data should be the same at the source and destination sites at the time of instantiation, unless some transformation, subsetting, or other apply or error handler is put in place to compensate for the differences. This becomes apparent when an update or delete DML fails because the data is not at the destination site(s), or has different values in the replicated columns.

Once instantiation is complete, the instantiation SCN will be the same at both the source and destination site(s), indicating to Streams that it is from this SCN forward that changes should be captured, propagated, and applied for the destination. The following image demonstrates this concept as it shows the instantiated Inventory Table with the same instantiation SCN at both the Source and Destination site.

Instantiation can be accomplished at different levels and via different methods, depending on your requirements. These include:

Instantiation Levels

Table Level

Schema Level

Database (Global) Level

Tablespace (this requires special steps)

Instantiation Methods

Data Pump

Transportable Tablespaces

RMAN for entire database

Manual method

The possible combinations of Instantiation Levels that can be used with Instantiation Methods can become confusing. So, with a handful of different methods to instantiate tables, how does one decide which method to use? In general, you can use Data Pump to instantiate all (or some) of the tables at Schema Level. You can also decide to move all tables to a particular tablespace and use transportable tablespaces. For now, we will focus on the methods we use most often due to their ease of use and flexibility.

Using Data Pump to instantiate tables and schemas is fairly straightforward. The Data Pump export utility EXPDP will use Oracle Flashback to ensure that the export is consistent and, at the same time, capture the instantiation data. For greater control, use the FLASHBACK_SCN or FLASHBACK_TIME parameters. On the import side, use the Data Pump import utility IMPDP. If it is a full database import, use the parameter STREAMS_CONFIGURATION=y (the default value) to direct IMPDP to include any Streams-related metadata that may be contained in the export.

Note

STREAMS_CONFIGURATION is only relevant for FULL database imports via IMPDP.

All the other functionality of Data Pump can also be used. Using Data Pump to export/import the entire database, a schema, or specific tables can be accomplished with ease, and it is the recommended method of export/import-based instantiation as of Oracle 11g.

If the replicated structures and data are the same on both sites, we recommend that you use DataPump Export/Import to instantiate (this can be done via DBMS_STREAMS_ADM.MAINTAIN_* scripts).

If the replicated structures are the same, but the data different between sites, we recommend instantiating objects via DataPump with CONTENT=METADATA_ONLY, and manual calls to necessary DBMS_STREAMS_ADM.ADD_RULE and DBMS_CAPTURE_ADM subprograms.

Note

CONTENT=METADATA_ONLY is not supported with TRANSPORTABLE TABLESPACE mode. Make sure to include handling expected data differences between sites in your Capture and/or Apply processes as necessary (see Chapter 6 for more information on data transformation and conflict resolution techniques that can be useful).

If the replicated structures and data are different between sites, we recommend that you create and populate the objects at each site manually, then call the necessary DBMS_STREAMS_ADM.ADD_RULE and DBMS_CAPTURE_ADM subprograms manually. Make sure to configure transformation rules for the structural differences, and handlers for the data differences. One important thing to remember is that if Capture, Propagation and Apply processes and rules are added, or modified, you will need to re-instantiate the SCN between the source and destination. You can do this by following the manual method.

What sets the instantiation SCN and when?

Any of the DBMS_STREAMS_ADM.MAINTAIN_*_SCRIPTS subprograms will automatically set both the Source and Target instantiation SCNs.

The DataPump or Transportable Tablespace instantiation methods will automatically set the Source and Target instantiation SCNs.

Creating the Capture process via DBMS_STREAMS_ADM.ADD_RULE will automatically set the Source instantiation SCN only. You will need to manually set the Target instantiation SCN using the DBMS_APPLY_ADM.SET_*_INSTANTIATION_SCN (covered in the next section).

Creating the Capture process via DBMS_CAPTURE_ADM.CREATE_CAPTURE will not set any instantiation SCNs. You must manually set the instantiation at both the Source and Target sites.

Setting the instantiation SCN manually using the DBMS_CAPTURE_ADM.PREPARE_*_INSTANTIATION and DBMS_APPLY_ADM.SET_*_INSTANTIATION_SCN procedures at the proper instantiation level is simple.

The DBMS_CAPTURE_ADM package contains the following procedures used to prepare the objects for instantiation at the source:

  • PREPARE_TABLE_INSTANTIATION

  • PREPARE_SCHEMA_INSTANTIATION

  • PREPARE_GLOBAL_INSTANTIATION

The DBMS_APPLY_ADM package contains the following procedures used to instantiate the object at the destination:

  • SET_TABLE_INSTANTIATION_SCN

  • SET_SCHEMA_INSTANTIATION_SCN

  • SET_GLOBAL_INSTANTIATION_SCN

The steps for setting the instantiation SCN are as follows:

  • Call the appropriate DBMS_CAPTURE_ADM.PREPARE_*_INSTANTIATION procedure at the source database

  • Determine the current SCN at the source database using DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER

  • Call the appropriate DBMS_APPLY_ADM.SET_*_INSTANTIATION_SCN procedure at the destination database, specifying the SCN value returned in the previous step
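The steps above can be sketched at Table level as follows. The table name, database link, and source global name are hypothetical; full examples are in Chapter 3:

```sql
-- At the SOURCE: prepare the table for instantiation.
BEGIN
  DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(
    table_name => 'hr.employees');  -- hypothetical table
END;
/

-- Still at the SOURCE: grab the current SCN and set it as the
-- instantiation SCN at the destination over a db link ('destdb'
-- is a hypothetical link to the destination database).
DECLARE
  v_scn NUMBER;
BEGIN
  v_scn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER;
  DBMS_APPLY_ADM.SET_TABLE_INSTANTIATION_SCN@destdb(
    source_object_name   => 'hr.employees',
    source_database_name => 'SRCDB.WORLD',  -- hypothetical global name
    instantiation_scn    => v_scn);
END;
/
```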

To state the obvious, you want to make sure that you use the same level for setting the instantiation at the destination that you used to prepare instantiation at the source. Code examples for setting the Instantiation SCN manually are provided in Chapter 3, Prepare the Rafts and Secure Your Gear: The pre-work before configuring Oracle 11g Streams.

The following views can help you determine what instantiation levels have been prepared at the source database:

  • DBA/ALL_CAPTURE_PREPARED_TABLES

  • DBA/ALL_CAPTURE_PREPARED_SCHEMAS

  • DBA/ALL_CAPTURE_PREPARED_DATABASE

 

Propagate process


Now that we know about the Capture process, it is time to move to the Propagate process. The Propagate process does the actual Propagation between the source and target queues.

Propagation has two configuration options: queue-to-queue or queue-to-dblink. In the queue-to-queue configuration, each Propagation has its own propagation job. This allows multiple Propagations to be configured and scheduled to propagate at different times. It should be noted that queue-to-queue propagation is recommended for Streams in a RAC environment. The default configuration of queue-to-dblink has one shared propagation job. For queue-to-dblink configurations, having one shared propagation job may cause issues, because any propagation schedule change affects all the Propagations that rely on the source queue and database link. This issue can be overcome by configuring different queues when using queue-to-dblink. For example, one queue, QUEUE1, would be set up with its own schedule; a separate queue, QUEUE2, can then be set up with its own schedule. Propagation scheduling will be covered in a moment.

Propagation can be configured to propagate to different targets. With the use of RULES one can set up multiple Propagate processes, referencing one buffer queue on the Source. Each Propagation process will process from the same source queue. The source queue will only delete the LCR once it is consumed by each Propagation process registered as a consumer of the queue.

At this time, we need to mention secure queues. A secure queue can be used only by the owner of that secure queue. Think of it this way: the owner of the queue runs and controls it. No other users or processes may use a secure queue unless the owner allows it by granting explicit privileges to the user. For the purpose of this book, all queues will be secure queues. This contrasts with an unsecured queue, which any session or process is allowed to use.

So, how does Propagate know when to do its job? In Oracle 11g, the Scheduler controls when Propagation takes place (in previous versions, it was accomplished via regular database jobs). Later, when we work through our example Streams configurations, we will accept the default propagation schedule.

"What" gets propagated "where" is controlled via Propagation rules, just as "what" is captured by the Capture process is controlled via Capture rules. The creation of these rules is very similar to that of Capture process rules, so we won't go into the same level of detail again. The Propagation process and rules can be created using either the DBMS_STREAMS_ADM.ADD_*_PROPAGATION_RULES procedure for the replication level desired, or the DBMS_PROPAGATION_ADM.CREATE_PROPAGATION procedure.

The main thing to understand is that Propagation needs to know the queue from which it must dequeue captured LCRs, the database link to use to send the changes, the destination database name, and the remote queue at the destination database in which to enqueue the LCR. As with Capture rules, you can control what the Propagation process sends based on the same type of parameters and rule conditions used in the Capture process.

It should be noted that even though you create a Propagation process using either the DBMS_STREAMS_ADM or DBMS_PROPAGATION_ADM procedures, you use the DBMS_AQADM Propagation subprograms to schedule/unschedule, alter, enable, and disable propagation. You use the DBMS_PROPAGATION_ADM subprograms to create, drop, start, and stop propagation jobs. This is because the Propagation process is separate from the propagation job. The Propagation process can remain running while the propagation job is disabled. This separation allows Streams to support queue-to-queue Propagation as well as queue-to-dblink propagation. Understanding the separation of the two aids in understanding which procedure to use to control which piece of Propagation.
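To make the division of labor concrete, here is a sketch of pausing the schedule versus stopping the process; the queue, link, and propagation names are hypothetical:

```sql
-- Pause only the propagation SCHEDULE (DBMS_AQADM); the
-- Propagation process itself remains defined and running.
BEGIN
  DBMS_AQADM.DISABLE_PROPAGATION_SCHEDULE(
    queue_name        => 'strmadmin.capture_q',
    destination       => 'TARGETDB.WORLD',       -- db link name
    destination_queue => 'strmadmin.apply_q');   -- queue-to-queue mode
END;
/

-- Stop the Propagation PROCESS itself (DBMS_PROPAGATION_ADM).
BEGIN
  DBMS_PROPAGATION_ADM.STOP_PROPAGATION(
    propagation_name => 'target_propagation');
END;
/
```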

You can see this separation of duties by looking at the background processes for Propagation. The Propagation processes (similar to the Capture processes) are designated by Pnnn, whereas the propagation jobs are separate job processes designated by Jnnn.

Note

For more detailed information on creating and managing Propagation processes and schedules, please refer to the Oracle Streams Concepts and Administration Guide and the Oracle PL/SQL Reference and Types manual.

The Network: COMLINK

If you have a job title that ends in "Administrator", such as "DBA", you know one thing for sure: bottom-line performance depends on the quality of the hardware and network. At a certain point, Administrators hit the hardware or network performance wall, and no amount of configuration tweaking will change performance levels. As DBAs, we (hopefully) have some influence on the hardware selection for our databases. DBAs often have little (or no) input into the network configuration or network hardware selected.

Why is this important? Streams depends on both hardware and network. If you have slow hardware and/or network, you can expect Streams to have low performance levels. The performance of Streams relates directly to your hardware and/or network limitations.

It is with this in mind that we address how to measure the network performance and its potential impact on Propagation before implementing Streams. In many cases Streams database links are configured to use the aliases in TNSNAMES.ORA. The TNSNAMES.ORA DESCRIPTION format can use different network protocols, though most often we see (PROTOCOL=tcp). Measuring, and knowing the network speed and protocol used between the SOURCE and TARGET nodes is important when diagnosing overall Streams' performance. We cover this in more detail in Chapter 3.

We are not saying "blame the network" when a performance problem occurs with Streams. What we wish to convey is that there are parts of the Streams environment that are in your direct control. There are also parts that are beyond your control that affect Streams' performance. Knowing how the network performs is crucial to your ability to diagnose all Streams' performance. Having a good working relationship with your Network Admin will also help when such problems arise. By working with the Network Admin closely, you may also be able to establish minimum service level agreements as well as set realistic client or user expectations where performance is concerned.

Propagation success/failure

How does a propagation job know that its payload has been successfully received and stored at the destination site? This is accomplished via Oracle's guaranteed fail-safe Propagation protocol, which requires return notification of a successful commit of the LCR payload data at the destination before it allows the LCR to be removed from the Capture queue and the Capture REQUIRED_CHECKPOINT_SCN to be moved beyond that LCR's SCN. If, after a number of tries (16 to be exact), the destination enqueue fails, an error is logged in the DBA_QUEUE_SCHEDULES and DBA_PROPAGATION views, the propagation job is disabled, and it will require a manual restart.
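A quick sketch of checking for disabled/erred propagations (standard view; output is environment-specific):

```sql
-- Find propagations that are no longer running cleanly; a
-- propagation disabled after 16 failed attempts shows up here.
SELECT propagation_name, status, error_date, error_message
  FROM dba_propagation
 WHERE status <> 'ENABLED';
```

Once the underlying problem is fixed, the propagation can be restarted with DBMS_PROPAGATION_ADM.START_PROPAGATION.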

Note

For more information on monitoring the Propagation process and jobs, please refer to the Oracle Streams Concepts and Administration Guide, and the Oracle Streams Replication Administrators' Guide.

Propagation Stream Split and Merge

Not all COMLINKs are created equal (unfortunately). It is quite possible to have a distributed environment where some network connections from a source database to the different destination databases do not move Streamed data at the same rate, or are not equally stable. This inequality of transport to multiple destinations can cause the source queue to grow undesirably large, because enqueued LCRs cannot be deleted until the destination site has confirmed receipt and successful enqueue of the LCR. This can result in spilled transactions (LCRs written to disk from the buffered queue) and memory resource consumption. It also creates a negative performance impact, as propagation jobs must scan all the entries in the queue to determine which they have and have not sent.

In previous releases, the way to circumvent this was to create a separate Capture queue for slower-moving or unstable destination connections. This allowed jobs using faster, more stable network connections to be assigned to less encumbered queues. 11g brings the ability to have the Propagation process create these types of queue segmentations on the fly, as needed, via Streams Split and Merge.

A Propagation process can be configured to recognize when a destination site is exceeding expected transport times. If this happens, the Propagation process will (in a nutshell) clone the Capture process and queue, spawn a separate propagation job from the cloned queue to the "slow" destination, and remove the original propagation job for that destination from the original queue. In essence, it "splits" the slow stream off to a separate queue, allowing the original queue to service the faster destination Propagation processes without performance impact. If/when transport times for the "split" destination return to normal, the cloned Capture process can be started to allow the rogue destination site to catch up. Once it does, the queues and processes are merged back to their original configuration automatically. We cover Stream Split and Merge in more detail in Chapter 6.

The following lists Propagation/Scheduler views that contain helpful information concerning your Propagation processes and job:

  • DBA_PROPAGATION

  • V$BUFFERED_SUBSCRIBERS

  • V$BUFFERED_PUBLISHERS

  • V$PROPAGATION_RECEIVER

  • V$PROPAGATION_SENDER

  • DBA_SCHEDULER_JOBS

    (filter on JOB_CLASS = 'AQ$_PROPAGATION_JOB_CLASS')
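The last view in the list can be filtered as shown below; this is a sketch using standard columns, and the job names you see will be environment-specific:

```sql
-- Propagation jobs run under the 11g Scheduler; filter on the
-- AQ propagation job class to find them.
SELECT job_name, state, next_run_date
  FROM dba_scheduler_jobs
 WHERE job_class = 'AQ$_PROPAGATION_JOB_CLASS';
```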

 

Apply process


We are over the hump and it's all downhill from here. From our previous image on Queues, we notice that we are now on the Target side of the Streams environment. On this side, we have a buffered queue and the Apply process. The queue on this side will be a secure queue that is the same kind as that on the Source. This secure queue (on the Target side) contains the LCRs sent over by the Propagation process.

At this point, the Apply process comes into the picture. The Apply process takes LCRs (or messages) from the secure queue and applies them to the Target database object, or hands them off to an Apply handler. An Apply handler is a user-defined procedure that processes the LCR change, taking a single LCR (or message) as input. As with Capture and Propagation, the Apply process uses rules to determine which LCRs to apply.

The Apply process is made up of multiple parts. Those parts are as follows:

  • Reader server: Takes the LCRs and converts them into transactions, preserving transactional order and dependencies.

  • Coordinator process: Takes the transactions from the reader server and sends them to the Apply server. This process also monitors the Apply server to ensure that the transactions are applied in the correct order.

  • Apply server: Applies the LCR or message, either directly or by passing it to an Apply handler or message handler. The Apply server also places the LCR or message into the appropriate error queue if it cannot be applied.

Keep in mind that there can be multiple Apply processes. The Apply reader and Apply server processes show up as background processes on the host as ASnn. In addition, there can also be multiple Coordinator processes (from above). The Apply coordinator background process names appear as APnn. In both cases, nn is a number and letter combination (0-9 and a-z).

The Apply process itself is pretty straightforward: dequeue the LCR, evaluate the LCR against the Apply rules, and if the overall evaluation is TRUE, apply it (if it evaluates to FALSE, ignore it); if the Apply fails, put the LCR in the Apply error queue. Where things can get complicated is at conflict detection, resolution, transformations, and user-defined Apply handlers. To really get your head around conflict detection and resolution, you need to understand the LCR structure. Therefore, we save this discussion for the LCR section. Additional understanding of conflict resolution, transformations, and Apply handlers requires a strong understanding of Rule structures. Thus, we save these discussions for Chapter 5, N-Way Replication, for configuring conflict resolution and Chapter 6 for rule-based transformations.

Trigger firing and Apply

By default, Triggers do not fire when an Apply process applies data to a table. This is intended behavior. It keeps changes to replicated tables that result from a trigger at the source site from being duplicated at destination sites. Consider what would happen if we did not take this precaution. Tables A and B are replicated from source to target. Table A has a trigger that updates table B on commit. Both table updates are replicated, as both tables A and B are replicated. However, when the change to table A is applied at the destination, it would kick off the trigger to update table B. But we also have the change to table B that was sent from the source. Either this change will be overwritten by the trigger, or it will fail because the original values of both records do not match (see our discussion on conflict detection and LCRs later in this chapter). This yields a high potential for data divergence, which is highly undesirable in a replicated environment.

"Well" you say, "What if I need the trigger to fire because I don't replicate table B?". That can be accomplished by setting the trigger's firing property to allow the Apply process to fire the trigger. The trigger firing property default is set to "once", so that it fires once when a normal change is made. However, when the change is accomplished by an Apply process, the trigger will not fire if its firing property is set to "once". The trigger firing property is managed by the DBMS_DDL.SET_TRIGGER_FIRING_PROPERTY procedure. Be careful, as the parameter that sets the trigger firing to allow the Apply process to fire the trigger is a Boolean and can cause some initial confusion. If the value for FIRE_ONCE is set to TRUE, then Apply will not be able to fire the trigger. If FIRE_ONCE is set to FALSE, then the Apply will be able to fire the trigger.

So, if you want a trigger to fire for applied LCRs you will have a call to DBMS_DDL that looks like this:

SQL> exec DBMS_DDL.SET_TRIGGER_FIRING_PROPERTY ( -
       trig_owner => '<schema>', -
       trig_name  => '<trigger_name>', -
       fire_once  => FALSE);

If you are not sure as to what the trigger firing property is set to, you can use the DBMS_DDL.IS_TRIGGER_FIRE_ONCE function. The function will return TRUE if the FIRE_ONCE property is set to TRUE (meaning the Apply process cannot fire the trigger), and FALSE if it is set to FALSE (meaning the Apply process can fire the trigger).
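Because IS_TRIGGER_FIRE_ONCE returns a Boolean, it must be called from PL/SQL rather than a plain query. A quick check from SQL*Plus might look like this (the schema and trigger names are placeholders):

set serveroutput on
begin
  if DBMS_DDL.IS_TRIGGER_FIRE_ONCE('<schema>', '<trigger_name>') then
    dbms_output.put_line('FIRE_ONCE is TRUE: Apply will NOT fire this trigger');
  else
    dbms_output.put_line('FIRE_ONCE is FALSE: Apply WILL fire this trigger');
  end if;
end;
/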

Note

For more information on the Trigger firing property and the Apply process, please refer to the Advanced Apply Process Concepts chapter in the Oracle Streams Concepts and Administration guide, and the Oracle PL/SQL Packages and Types Reference manual.

The following lists Apply views that contain helpful information concerning your Apply processes:

  • DBA_APPLY

  • DBA_APPLY_CONFLICT_COLUMNS

  • DBA_APPLY_DML_HANDLERS

  • DBA_APPLY_ENQUEUE

  • DBA_APPLY_ERROR

  • DBA_APPLY_EXECUTE

  • DBA_APPLY_INSTANTIATED_GLOBAL

  • DBA_APPLY_INSTANTIATED_OBJECTS

  • DBA_APPLY_INSTANTIATED_SCHEMAS

  • DBA_APPLY_KEY_COLUMNS

  • DBA_APPLY_PARAMETERS

  • DBA_APPLY_PROGRESS

  • DBA_APPLY_SPILL_TXN

  • DBA_APPLY_TABLE_COLUMNS

  • DBA_HIST_STREAMS_APPLY_SUM

  • V$STANDBY_APPLY_SNAPSHOT

  • V$STREAMS_APPLY_COORDINATOR

  • V$STREAMS_APPLY_READER

  • V$STREAMS_APPLY_SERVER

 

Combined Capture and Apply


You've seen those movies and heard the famous songs where, when certain planets align in certain ways at certain times, special powerful things automatically happen. As of 11g, Streams has such cosmic event potential. And when this cosmic event occurs, it's called Combined Capture and Apply. Seriously, it really is like a cosmic event in the galaxy of optimization. When Oracle Streams is configured a particular way between two sites, the Capture process acts as the propagator, using its associated Propagation process rule set, and transmits the eligible LCRs directly to the Apply process at the destination via database link. This functionality automatically detects if the optimal configuration is in place and "flips the switch" to enable Combined Capture and Apply. The only way to control whether or not Combined Capture and Apply is enabled is to change the configuration of your Streams so that one of the configuration "rules" is violated.

The configurations that cultivate this cosmic event are a little different depending on where the Apply process resides.

If the Apply process resides in a different database than the Capture process, the following configuration is required:

  • The Capture and Apply databases must be on Oracle 11g Release 1 or higher

  • The Capture process is the only publisher for the capture queue

  • Propagation is configured directly between the capture and apply queues (no intermediate queues allowed)

  • The Propagation is the only consumer for the Capture queue

  • The Propagation is the only publisher for the Apply queue

  • If a buffered Apply queue is used, the Apply process can be the only consumer for the queue

  • If a persistent Apply queue is used, multiple Apply processes can be consumers for the queue

One behavior to point out here is that if the Apply process is unavailable at the destination database, the Capture process will hang in the INITIALIZING state at startup until the Apply process becomes available. Once the Apply process is enabled, the Capture process immediately transitions to CAPTURING CHANGES.

If the Apply process resides in the same database as the Capture process, the following configuration is required:

  • The database must be on Oracle 11g Release 1 or higher

  • The Capture and Apply process use the same queue

  • The Capture process is the only publisher for the queue

  • If a buffered queue is used, the Apply process can be the only consumer for the queue

  • If a persistent queue is used, multiple Apply processes can be consumers for the queue

As the user has no control over the enablement of Combined Capture and Apply (CCA) beyond setting up the Streams configuration, it may not be immediately obvious when Combined Capture and Apply is enabled. You can determine if it is enabled by checking the V$STREAMS_CAPTURE and V$STREAMS_APPLY_READER views.

In V$STREAMS_CAPTURE, the APPLY_NAME will have the name of the Apply process and the OPTIMIZATION will be greater than 0 (zero, zed) if CCA is enabled.

select capture_name, apply_name, optimization from V$STREAMS_CAPTURE;

CAPTURE_NAME       APPLY_NAME              OPTIMIZATION
------------------ ----------------------- ------------
HR_CAPTURE         HR_APPLY                           2

The PROXY_SID is not NULL in V$STREAMS_APPLY_READER.

select apply_name, proxy_sid from V$STREAMS_APPLY_READER;

APPLY_NAME                     PROXY_SID
------------------------------ ----------
HR_APPLY                              132

You will also see a similar entry in the alert log:

Propagation Sender (CCA) HR_PROPAGATION for Streams Capture HR_CAPTURE and Apply HR_APPLY [on destination STRM2] with pid=28, OS id=6096 started.

When Streams is in Combined Capture and Apply mode, you will not see information concerning the Propagation in the DBA_QUEUE_SCHEDULES view. In this case, you will need to query the V$PROPAGATION_SENDER and V$PROPAGATION_RECEIVER views.
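A sketch of the queries you would run instead (the exact columns available vary slightly by version, so trim the select list to what your instance exposes):

-- Propagation activity on the source side while in CCA mode
select queue_name, dblink, state
from   V$PROPAGATION_SENDER;

-- Propagation activity on the destination side
select src_queue_name, src_dbname, state
from   V$PROPAGATION_RECEIVER;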

 

SCN Coordination keeps it flowing smoothly


All of the Streams processes use SCNs to keep track of what change transactions they have processed, and they share this information to coordinate who gets what, who still needs what, and what can be ignored (because it has already been processed). This is why coordinating the instantiation SCN at the start is so important.

Capture and Apply object instantiation are not the only components of Streams that rely on SCN synchronization. The Capture process must also coordinate its SCNs with the LogMiner process and available archived logs to ensure data integrity over time. This is done via FIRST_SCN, START_SCN, and REQUIRED_CHECKPOINT_SCN.

The Capture process relies on a valid LogMiner Data Dictionary to access database object structure for redo capture to build LCRs. This LogMiner Data Dictionary is separate from the Database Data Dictionary, but is a "picture" of the Database Data Dictionary at the time the Capture process is created. Each Capture process either builds a new LogMiner Data Dictionary or accesses an existing one when it first starts. To build a LogMiner Data Dictionary, the Capture process must have access to the "picture" of the Database Data Dictionary from the redo logs at the time of the SCN from which it must first start capturing.

This picture of the Database Data Dictionary is created in the redo logs by running the DBMS_CAPTURE_ADM.BUILD procedure. This procedure must be run at least once in the database before a Capture process can be created. The BUILD creates a picture of the Database Data Dictionary in the current redo log and records an entry in the V$ARCHIVED_LOG view indicating that the redo log contains Data Dictionary information (DICTIONARY_BEGIN='YES') as of the SCN (FIRST_CHANGE#) at the time of the BUILD.

The FIRST_SCN of the Capture process must correspond to a FIRST_CHANGE# for a BUILD. For the Capture process to start for the first time, the redo log for that FIRST_CHANGE# must be available to the database instance. The BUILD procedure can be run multiple times, and a Capture process can use any one of these builds when it is created, by specifying one of the FIRST_CHANGE# values for a build as the Capture process FIRST_SCN parameter (as long as the necessary redo logs are available to the instance). The Capture process will access the redo log containing the Dictionary information, and build its LogMiner Data Dictionary if needed. You can find eligible FIRST_SCN values by querying V$ARCHIVED_LOG for FIRST_CHANGE# values generated by a build.

select distinct NAME, FIRST_CHANGE# from V$ARCHIVED_LOG where DICTIONARY_BEGIN = 'YES';

The NAME column has the name of the redo log(s) in which the BUILD resides. All redo logs from this redo log forward must be available for the Capture process to first start.

If you specify a FIRST_SCN for a new Capture process from a BUILD for which another Capture process has already built a LogMiner Data Dictionary, the new Capture process will use the existing LogMiner Data Dictionary.

If you do not specify a FIRST_SCN (the default is NULL) when creating a Capture process, the creation will call the DBMS_CAPTURE_ADM.BUILD procedure and set the FIRST_SCN for the Capture process to the FIRST_CHANGE# generated by the build. If you create a Capture process using one of the procedures in the DBMS_STREAMS_ADM package, the FIRST_SCN parameter is automatically set to NULL, forcing the capture creation to do a BUILD.
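Putting this together, here is a sketch of explicitly running a build and creating a Capture process anchored at that build (the queue and capture names are hypothetical; your rule sets would be added separately):

declare
  l_scn number;
begin
  -- Write a picture of the Database Data Dictionary to the current redo log;
  -- the resulting FIRST_CHANGE# is returned in l_scn
  DBMS_CAPTURE_ADM.BUILD(first_scn => l_scn);

  -- Create a Capture process whose FIRST_SCN points at that build
  DBMS_CAPTURE_ADM.CREATE_CAPTURE(
    queue_name   => 'strmadmin.capture_queue',
    capture_name => 'hr_capture',
    first_scn    => l_scn);
end;
/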

The SCNs of Capture

The following synopsizes the SCNs of a Capture process; how they are used, and rules of usage.

FIRST_SCN

  • The lowest SCN at which the Capture process can begin capturing

  • Must correspond to a FIRST_CHANGE# value in V$ARCHIVED_LOG for a Data Dictionary BUILD in the redo logs

  • Points the Capture process to the redo log(s) that hold the Data Dictionary information from which to build its LogMiner Data Dictionary if necessary, and begin scanning redo for changes on the Capture process first startup

  • If REQUIRED_CHECKPOINT_SCN is 0, the Capture process will begin scanning at FIRST_SCN on subsequent capture startups

  • It can be changed by the following:

    • Manually using DBMS_CAPTURE_ADM.ALTER_CAPTURE

    • Automatically by CHECKPOINT_RETENTION_TIME purge process

  • Change can only be to a value greater than the current FIRST_SCN value

  • FIRST_SCN cannot be greater than REQUIRED_CHECKPOINT_SCN when REQUIRED_CHECKPOINT_SCN is greater than 0

START_SCN

  • The SCN at which the Capture process will actually begin capturing changes on startup

  • START_SCN must be greater than or equal to FIRST_SCN

  • If a Capture process's FIRST_SCN is changed (either manually or via CHECKPOINT_RETENTION_TIME purge process) to a value greater than its START_SCN, the START_SCN is automatically increased to the new FIRST_SCN value

  • START_SCN can be changed manually using DBMS_CAPTURE_ADM.ALTER_CAPTURE

  • START_SCN can be set to a value lower than its current value, as long as the new value is not less than the FIRST_SCN value for the Capture process

  • START_SCN is usually only changed manually if a point-in-time recovery has been performed at a destination site, and the point in time recovered to requires changes to be resent to the destination site

Note

If the point-in-time recovery requires an SCN before the Capture process FIRST_SCN, that process cannot be used to send changes to the recovered site. If a Data Dictionary BUILD is available in the archived logs with a FIRST_CHANGE# less than or equal to the SCN required for the point-in-time recovery, a new Capture process can be created specifying the appropriate FIRST_CHANGE# for the FIRST_SCN. Otherwise, the Streamed objects must be re-instantiated from the source at the destination.
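For example, resetting the START_SCN after a destination point-in-time recovery is a single ALTER_CAPTURE call. The capture name and SCN below are placeholders; the new value must not be less than the Capture process FIRST_SCN:

begin
  DBMS_CAPTURE_ADM.ALTER_CAPTURE(
    capture_name => 'hr_capture',
    start_scn    => 3050111);  -- e.g. OLDEST_SCN_NUM from the recovered destination
end;
/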

REQUIRED_CHECKPOINT_SCN

  • Set to 0 (zero, zed) when the Capture process is created

  • Incremented by the Capture process LogMiner checkpoint process

  • Value determined by the lowest APPLIED_SCN for all destination sites for the Capture process queue

  • The lowest SCN that the Capture process must be able to access from the redo logs to be able to restart

  • The redo log that includes this SCN and all subsequent redo logs must be available to the Capture process database instance, for the Capture process to successfully start

  • If value > 0 (zero, zed), the Capture process starts scanning from this SCN when restarted

  • The REQUIRED_CHECKPOINT_SCN is only changed when a checkpoint is completed. This happens either by:

    • Automated by LogMiner Checkpoint process

    • Manually via command

      DBMS_CAPTURE_ADM.SET_PARAMETER('<capture_name>', '_checkpoint_force','Y')
      

CAPTURED_SCN

The most recent SCN scanned by the Capture process.

APPLIED_SCN

  • The most recent SCN dequeued and processed by any Apply process that receives changes from the Capture process's queue

  • Corresponds with the low-watermark SCN for an Apply process

MAXIMUM_SCN

  • The SCN at which a Capture process must stop capturing changes and disable

  • The Capture process will disable when it reaches this upper limit SCN

  • Changes with an SCN greater than or equal to the MAXIMUM_SCN are not captured by the Capture process

  • If the value is infinite (default), the Capture process captures changes without upper limit

LAST_ENQUEUED_SCN

  • This is the last SCN enqueued by the Capture process

  • This value is dynamic and will increase as the Capture process captures and enqueues LCRs

  • Can be used to gauge latency of Propagation and Apply

SOURCE_RESETLOGS_SCN

The SCN at the last RESETLOGS action.

MAX_CHECKPOINT_SCN

The SCN at which the latest checkpoint was taken.

Note

For more detailed information on how FIRST_SCN, START_SCN and REQUIRED_CHECKPOINT_SCN are used by the Capture process, please refer to the The LogMiner Data Dictionary for a Capture Process, Capture Process Checkpoints, and Multiple Capture Processes for the Same Source Database sections in Chapter 2: Oracle Streams Information Capture of the Oracle Streams Concepts and Administration guide 11g.
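You can see the current values of most of these SCNs for your Capture processes with a query such as the following (a sketch; add or drop columns as needed):

select capture_name, first_scn, start_scn,
       required_checkpoint_scn, captured_scn, applied_scn
from   DBA_CAPTURE;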

The SCNs of Propagation

A Propagation process really only tracks one SCN value. This is the ACKED_SCN, which is the SCN sent to the Propagation process from the destination that has been acknowledged by all Apply queue subscribers as successfully dequeued and processed. This means the dequeued LCR was either successfully applied or successfully committed to the Apply error queue. This value is used by the Capture checkpoint to help determine its REQUIRED_CHECKPOINT_SCN.

The SCNs of Apply

The following synopsizes the SCNs of an Apply process; how they are used, and rules of usage.

IGNORE_SCN

  • The SCN below which changes received should not be applied

  • Only set when instantiation is accomplished via Export/Import

  • Corresponds with the SCN set at the source database when the object was prepared for instantiation

  • The instantiation SCN must be equal to or greater than this SCN

MAXIMUM_SCN

  • The SCN at which an Apply process must stop applying changes and disable

  • The Apply process will disable when it reaches this upper limit SCN

  • Changes with an SCN greater than or equal to the MAXIMUM_SCN are not applied by the Apply process

  • If the value is infinite (default), the Apply process applies changes without upper limit

OLDEST_SCN_NUM

  • This is the latest SCN of a received LCR that was successfully dequeued and applied

  • In the case where a point-in-time recovery is performed on the destination, this value should be used to reset the START_SCN for the associated Capture process at the source site to recapture changes

  • Does not pertain to synchronously captured changes received

Low-watermark SCN

  • The lowest SCN that can be guaranteed dequeued and applied by an Apply process

  • Corresponds to the APPLIED_SCN of the Capture process

There are a myriad of other SCNs that are used by the Apply process internally. The SCNs listed above are the ones you gain the most from understanding. You can find detailed information on Apply SCN and transaction tracking in the Oracle Streams Replication Administrator's Guide.

SCN SYNC-hronization

As you can see, if your SCNs are out of sync between the LogMiner Data Dictionary, Capture, and Apply processes, your Streams may not work as expected, or even not at all. Obeying the following formula when implementing your Streams environment will keep you out of the SCN SYNC-hole.

Note

Apply SCN >= OI SCN >= CP START_SCN >= CP FIRST_SCN

Where OI = Object Instantiation and CP = Capture

Once you have implemented Streams, avoid changes to SCNs unless it is necessary to compensate for a destination site point-in-time recovery or an unrecoverable archive log.

Capture checkpointing

The Capture process keeps track of the lowest SCN that has been recorded by its Propagation processes as the greatest SCN acknowledged by its Apply destinations as applied. The Capture process cannot set its REQUIRED_CHECKPOINT_SCN greater than this SCN or there is a potential for data loss. This is controlled by checkpointing.

The Capture process will conduct checkpoints in which it coordinates its SCNs. By default these checkpoints happen with the capture of 10 MB of redo and the checkpoint metadata is retained in the database for 60 days. You can also force a checkpoint if the need arises. These checkpointing options are all controlled by the following capture parameters:

  • _CHECKPOINT_FREQUENCY: The number of megabytes captured which will trigger a checkpoint. Default value is 10 but can be changed with DBMS_CAPTURE_ADM.SET_PARAMETER().

  • CHECKPOINT_RETENTION_TIME: Number of days to retain checkpoint metadata. Default 60 but can be changed with DBMS_CAPTURE_ADM.ALTER_CAPTURE() procedure.

  • _CHECKPOINT_FORCE: This will force a Capture checkpoint. Accomplished via DBMS_CAPTURE_ADM.SET_PARAMETER, pass in Y for the value. It will set itself back to N when the checkpoint is complete.

You can determine the current values for these parameters by querying the DBA_CAPTURE_PARAMETERS view.
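For instance, to shrink the checkpoint metadata retention from the 60-day default to 30 days (the capture name is a placeholder):

begin
  DBMS_CAPTURE_ADM.ALTER_CAPTURE(
    capture_name              => 'hr_capture',
    checkpoint_retention_time => 30);
end;
/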

The following lists captures views that contain helpful information concerning your Capture processes:

  • DBA/ALL_CAPTURE

  • DBA/ALL_CAPTURE_PARAMETERS

  • DBA/ALL_CAPTURE_EXTRA_ATTRIBUTES

  • V$STREAMS_CAPTURE

  • V$STREAMS_TRANSACTION

  • DBA/ALL_STREAMS_RULES

Archive Log availability

When a Capture process starts up, it will check its REQUIRED_CHECKPOINT_SCN (if it is 0, it will use the FIRST_SCN), look for the redo log that contains that SCN, and begin scanning from that redo log forward. If the SCN is in an archived log that is no longer available, or if any subsequent redo logs (archived or online) from that SCN forward are no longer available, the Capture process will not start. You can overcome this by either of the following:

  • Restoring the required archive logs

  • Dropping and recreating the Capture process

This leads to the obvious question of "What happens when my archive logs are in my flash_recovery_area and are aged out?" The obvious answer here is, "It will break your Capture process if/when the archive log containing your Capture FIRST_SCN/REQUIRED_CHECKPOINT_SCN is aged out". This is why Oracle documentation specifically and highly recommends that you do not use the flash_recovery_area as your only archive log repository if you are using Streams. If you use the flash_recovery_area, configure a separate archive log destination to accommodate the archived redo logs needed by Streams. Now, if you really do want to keep archive logs only in the flash_recovery_area, take pity on the on-call DBA and make sure that your Capture process CHECKPOINT_RETENTION_TIME intervals are set within the archive log retention period of the flash_recovery_area.

The following views can be used to help determine which archived redo logs are required by the Capture process and which can be purged:

  • V$ARCHIVED_LOG

  • DBA_REGISTERED_ARCHIVED_LOG

  • DBA_LOGMNR_PURGED_LOG
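A sketch of checking which registered archived logs Streams still needs (the PURGEABLE column is an assumption based on the 11g view definition; verify it exists in your release):

select source_database, sequence#, name, purgeable
from   DBA_REGISTERED_ARCHIVED_LOG
order  by source_database, sequence#;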

Note

For more detailed information on flash_recovery_area and Streams, please refer to Are Required Redo Log Files Missing? section of Chapter 20 of the Oracle Streams Concepts and Administration guide.

 

LCRs what they are and how they work


Knowing how an LCR moves from source to target is only part of the story. What an LCR contains is also important. Let's start by going over what we know about database transactions. Every transaction in a database is assigned a unique transaction ID. The transaction itself can be composed of one or more DML or DDL instructions.

Note

Most implicit DDL LCRs will have a single DDL instruction, due to the implicit commit nature of Oracle's handling of DDL.

Each one of these instructions is associated to its parent transaction via this ID. When we attempt to commit a transaction, all the instructions in the transaction must be successfully completed or the whole transaction fails/rolls back. This means that all the DML/DDL instructions within that transaction do not get applied to the database. Remember this. It will be important when you have to troubleshoot situations where a user demands "Where did my data go?"

As mentioned above, an LCR is a logical change record that is created by the Capture process. The content of the LCR is the actual steps the database took to accomplish the change instruction(s) of a transaction. These steps are stored in a special, ordered format that is then parsed by the Apply process to rebuild the SQL to duplicate the original transaction. We know that a transaction can have multiple instructions, thus, an LCR can include multiple steps. Each one of these steps is a message. When you look at the LCR metadata (where available; usually in error queues), you will see that each LCR has a message count, and that each message has a sequential ID.

The message itself is composed of metadata from which the Apply process builds the SQL to accomplish the instruction. This information includes (but is not limited to) the following:

  • Message ID/Sequence

  • Message type name: LCR or User Enqueued Message

  • Source database: where the LCR originated

  • Owner: Schema owner for the object/table which the message is changing

  • Object: Name of the object/table

  • Is Tag Null: Indicates if there are any tag values (Y means no tag values)

  • command_type:

    • If a DML message, this will be INSERT/UPDATE/DELETE/LOB_UPDATE

    • If a DDL message, this will be CREATE/ALTER/DROP/ and so on

  • Change Values: The values included depend on the command type:

    • If a DML message, you will see the old, new, and data type values for each field in the row

    • Command type INSERT: you will only see new values

    • Command type DELETE: you will only see old values

    • Command type UPDATE: you will see both old and new values

    • For special field data types (such as timestamp), you may also see a typename value as well

    • If a DDL message, you will see the actual command text

There is additional information stored in LCRs. If you wish to familiarize yourself with the content of LCRs you can review the Types of Information Captured with Oracle Streams section in the Oracle Streams Concepts and Administration user's manual, and SYS.LCR$_ROW_RECORD and LCR$_DDL_RECORD type definitions found in the Oracle PL/SQL Packages and Types Reference manual. These types are visible to the user for use in explicit capture and are used by implicit capture as well.

Extracting data from an LCR

For regular, implicit Streams, you will most likely only need to extract data from an LCR in the event of an apply error. You would extract and review this data to determine what was in the LCR to help determine what caused the error. You can drill down to the LCRs in the Apply Error Queue using Enterprise Manager or you can create your own procedures that use Oracle APIs to extract the LCR data (we will go over this in more detail in Chapter 8, Dealing with the Ever Constant Tides of Change, dealing with how to administer and monitor Oracle 11g Streams).
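As a preview, here is a minimal sketch of pulling one LCR out of the Apply error queue with the documented APIs. The transaction ID is a placeholder you would take from DBA_APPLY_ERROR, and the block assumes message 1 is a row LCR:

set serveroutput on
declare
  l_msg ANYDATA;
  l_lcr SYS.LCR$_ROW_RECORD;
  l_rc  pls_integer;
begin
  -- fetch message 1 of the failed transaction
  l_msg := DBMS_APPLY_ADM.GET_ERROR_MESSAGE(1, '<local_transaction_id>');
  -- cast the ANYDATA payload to a row LCR
  l_rc  := l_msg.GETOBJECT(l_lcr);
  dbms_output.put_line('Source : '||l_lcr.GET_SOURCE_DATABASE_NAME);
  dbms_output.put_line('Object : '||l_lcr.GET_OBJECT_OWNER||'.'||l_lcr.GET_OBJECT_NAME);
  dbms_output.put_line('Command: '||l_lcr.GET_COMMAND_TYPE);
end;
/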

Conflict detection and the LCR

In an Oracle replicated environment (Streams or Advanced), Conflict detection is always turned on. Conflict detection acts as a guard-dog to the LCR. When the Apply process attempts to apply the changes in an LCR, it first calls Conflict detection to verify that the change can be applied without the unexpected loss of data at the Apply site. Conflict detection identifies the row to be changed by the LCR. It then compares values in the LCR with the actual values in the existing row (if they exist). Depending on the change type, if certain values don't match (also known as data divergence), Conflict detection will attempt to find any conflict resolution rules assigned to the Apply process.

If none are found, or the conflict resolution rules do not resolve the conflict, the Conflict detection will not allow the change to be applied by raising an error to the Apply process. If this happens, the Apply process will place the LCR, along with the error raised by Conflict detection, in the Apply Error queue.

Note

If an LCR is placed in the Apply Error queue, the DML/DDL messages in that LCR have not been applied to the database object. This means all messages (DML/DDL instructions) in the LCR, not just the one(s) that failed. If you have multiple messages in the LCR, there may only be one message that fails, but the entire LCR transaction fails because of that one message failure. Keep this in mind when developing your transactions. The more messages you have in an LCR, the more difficult it is to determine which message(s) caused the failure.

If an LCR fails, all subsequent LCRs dependent on that failed LCR will also fail. This makes it very important to have as much understanding about how data changes will flow through your distributed environment before you implement production. If not carefully planned, all your changes could easily end up in your target error queue. It also makes it very important to faithfully monitor the Apply Error queues and address errors as quickly as possible.

The key to conflict detection and LCR playing nicely together is planning and conflict resolution. These activities are discussed in more detail in the following chapters.

Controlling conflict detection

As discussed earlier, conflict detection will compare all the values of all the columns by default. You do have some control over whether a non-key column value should be compared or can be ignored, and when. This is accomplished with the DBMS_APPLY_ADM.COMPARE_OLD_VALUES procedure.

This procedure allows you to specify a list of non-key columns in a table that are either included or excluded from conflict detection value comparison. Use this power with caution! Make sure you have identified all the ramifications to data convergence if you choose to exclude column values from conflict detection, to avoid unexpected data loss.

The key term is "non-key columns". The DBMS_APPLY_ADM.COMPARE_OLD_VALUES procedure will not let you exclude key columns. It will raise an error. If you absolutely, positively, without question, must exclude a key column from conflict detection, you will need to redefine the table's key column list using the DBMS_APPLY_ADM.SET_KEY_COLUMNS procedure. Again, use this with reserve.
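For example, to stop comparing the old value of a non-key audit column during conflict detection for both UPDATE and DELETE operations (the table and column names are hypothetical):

begin
  DBMS_APPLY_ADM.COMPARE_OLD_VALUES(
    object_name => 'hr.employees',
    column_list => 'last_update_ts',   -- non-key column(s) to exclude
    operation   => '*',                -- both UPDATE and DELETE
    compare     => FALSE);
end;
/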

Note

For more detailed information on Conflict Detection control, please reference the Streams Conflict Resolution chapter in the Oracle Streams Replication Administrator's Guide, and the Oracle PL/SQL Packages and Types Reference manual.

Types of LCRs and how they get created

The method used to create an LCR determines the LCR type.

  • If an LCR is created by an asynchronous Capture process (implicitly) it is a captured LCR

  • If the LCR is created by a user application (explicitly), by a synchronous Capture process, or enqueued by an Apply process, it is a persistent LCR

  • If an LCR is explicitly created by an application and enqueued with delivery_mode set to BUFFERED, it is a buffered LCR

 

Oracle 11g memory and storage architecture (basic) relating to Streams


At this point, we want to remind you that Streams interacts with the Oracle database architecture in many different ways. Interaction and changes to the SGA and SPFILE should be done prior to configuration of Streams. If subsequent changes are needed, make sure to review those changes. The isolation of tables related to LogMiner (already mentioned above) is also part of the pre-work that should be done as part of the Streams configurations. The location of the redo logs and archive logs and the retrieval speed from disk or disk cache should also be considered. The retention time of archived logs must be coordinated with Capture process SCN requirements.

The use of Automatic Memory Management (AMM) or Dynamic SGA is suggested when configuring Streams to ease administration. One of the parameters that will need to be configured is STREAMS_POOL_SIZE. STREAMS_POOL_SIZE controls the size of the Streams Pool in the SGA memory. A properly sized STREAMS_POOL_SIZE allows for proper performance of the Capture and Apply processes. The Streams Pool also stores LCRs (or messages) in buffered queues. If the Streams Pool is undersized, you can see issues with "spilled transactions" for the Apply. This means that the Apply process ran out of Streams Pool and had to write the LCRs to disk until they could be loaded back to memory. This is where an undersized Streams Pool can have a significant impact on Streams performance. If this happens, you will see error messages in the alert log, as well as entries in the V$BUFFERED_QUEUES view.

If you see entries in the DBA_APPLY_SPILL_TXN view, this is a result of transaction size or age exceeding the Apply process txn_lcr_spill_threshold and txn_age_spill_threshold parameter values respectively.
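A quick health check on Streams Pool usage in the buffered queues might look like this (a sketch; a steadily growing SPILL_MSGS relative to NUM_MSGS suggests the pool is undersized):

select queue_schema, queue_name, num_msgs, spill_msgs
from   V$BUFFERED_QUEUES;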

We will go into the details of database configuration in Chapter 3. For now, just be aware that the database needs to be configured specifically for a Streams environment.

 

A word on performance


We have briefly mentioned some performance-related concerns in setting up a Streams environment. Robust hardware and a fast, stable network greatly affect overall Streams performance, and the configuration of the database also has an impact. This chapter is about understanding the key components of Streams; in Chapter 7, Document What You Have and How it is Working, we will go into detail about Streams performance.

 

Streams Change tables


The current ethical climate of computing unfortunately mandates the need to identify who made what changes to what data, when, and from where. Corporations must now comply with stringent data change auditing mandates associated with regulations such as SOX (the Sarbanes-Oxley Act) and FISMA (the Federal Information Security Management Act), to name a couple. Prior to Oracle 11gR2, the capture and propagation of data change audit information had to be manually included in Streamed environments. Oracle 11gR2 introduces the DBMS_STREAMS_ADM.MAINTAIN_CHANGE_TABLE procedure, which allows the DBA to quickly configure a separate change audit table for a Streamed table, as well as to propagate the change audit data from the source site to all target destination sites. The procedure can also be used to create one-way replication of a table, along with the change capture, from a source to a destination database. Change tables can be implemented for local or downstream capture, and local or remote apply configurations.

The DBMS_STREAMS_ADM.MAINTAIN_CHANGE_TABLE procedure creates all the components necessary to capture, send, and record data change information to the change table.

The DBMS_STREAMS_ADM.MAINTAIN_CHANGE_TABLE procedure is run at the capture site and accomplishes the following:

  • Creates a separate change table for the change audit data

    • The change table can be located in the same database or a remote database

    • The source table columns tracked in the change table are specified by the column_type_list parameter

    • Additional audit data columns that can be added to the change table include:

      • value_type

      • source_database_name

      • command_type

      • object_owner

      • object_name

      • tag

      • transaction_id

      • scn

      • commit_scn

      • compatible

      • instance_number

      • message_number

      • row_text

      • row_id

      • serial#

      • session#

      • source_time

      • thread#

      • tx_name

      • username

  • Creates a Capture process to capture source table changes

  • Creates a Propagation process for remote apply

  • Creates an Apply process at the destination site

  • Creates and adds DML handlers to the specified Apply process that record change information in the change table based on row LCR information

  • Configures the Apply process to execute the row LCR on the remote source table if EXECUTE_LCR is TRUE

    This supports the optional one-way replication.

The following figure shows a configuration with local capture and remote Apply, with source table replication.

Before configuring change tables, you want to make decisions as to:

  • The type of environment to configure

  • The source table columns to track

  • What metadata, if any, to record

  • The values to track for update operations (old, new)

  • Whether to configure a KEEP_COLUMNS transformation

  • Whether to specify CREATE TABLE options for the change table

  • Whether to perform the configuration actions directly or with a script

  • Whether to replicate the source table as well
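Once those decisions are made, the call itself is compact. The following is a minimal sketch only (all object names, the column lists, and the database names are illustrative; check the full parameter list in the PL/SQL Packages and Types Reference before use). It is run at the capture site as the Streams administrator:

```sql
-- Hedged sketch: configure a change table for a source table, recording
-- both old and new values on update, with the change table maintained
-- at a remote destination.
BEGIN
  DBMS_STREAMS_ADM.MAINTAIN_CHANGE_TABLE(
    change_table_name    => 'strmadmin.employees_changes',   -- assumed name
    source_table_name    => 'hr.employees',                  -- assumed source
    column_type_list     => 'employee_id NUMBER(6), salary NUMBER(8,2)',
    extra_column_list    => 'command_type, source_database_name, source_time',
    capture_values       => 'both',             -- 'old', 'new', or 'both'
    source_database      => 'SRC.EXAMPLE.COM',
    destination_database => 'DEST.EXAMPLE.COM',
    execute_lcr          => FALSE);             -- TRUE adds one-way replication
END;
/
```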

Note

For more information on Streams Change Tables, please refer to Chapter 20, Using Oracle Streams to Record Table Changes in the Oracle Streams Concepts and Administration 11g Release 2 Guide.

For more information on the MAINTAIN_CHANGE_TABLE procedure, please reference the Oracle PL/SQL Packages and Types Reference DBMS_STREAMS_ADM: subprogram MAINTAIN_CHANGE_TABLE.

 

Oracle GoldenGate XSTREAMS


With the acquisition of GoldenGate, Oracle 11gR2 incorporates GoldenGate XSTREAMS technology, providing client applications with the ability to insert and extract LCR datatypes directly into and out of an implicit or explicit Stream.

Note

To use the XSTREAMS APIs, you must purchase an Oracle GoldenGate product license. See the Oracle GoldenGate documentation at http://download.oracle.com/docs/cd/E15881_01/index.htm.

The following lists views that supply helpful information for XSTREAMS:

  • ALL_APPLY

  • ALL_APPLY_ERROR

  • ALL_XSTREAM_INBOUND

  • ALL_XSTREAM_INBOUND_PROGRESS

  • ALL_XSTREAM_OUTBOUND

  • ALL_XSTREAM_OUTBOUND_PROGRESS

  • ALL_XSTREAM_RULES

  • DBA_APPLY

  • DBA_APPLY_ERROR

  • DBA_APPLY_SPILL_TXN

  • DBA_XSTREAM_INBOUND

  • DBA_XSTREAM_INBOUND_PROGRESS

  • DBA_XSTREAM_OUTBOUND

  • DBA_XSTREAM_OUTBOUND_PROGRESS

  • DBA_XSTREAM_RULES
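For a quick health check of configured servers, the DBA-level views can be queried directly. A sketch (verify the exact column lists against your release's reference):

```sql
-- Outbound servers: which Capture feeds each one, and from where
SELECT server_name, capture_name, source_database
  FROM dba_xstream_outbound;

-- Inbound servers: the queue each one dequeues from
SELECT server_name, queue_owner, queue_name
  FROM dba_xstream_inbound;
```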

XSTREAMS is built on the Oracle Streams infrastructure and thus can take advantage of Oracle Streams' functionality and flexibility.

The Oracle database is configured to support XSTREAMS using the DBMS_XSTREAM_ADM package subprograms. Specialized server processes are configured to handle inbound and outbound traffic to standard Oracle Streams queues. Outbound server processes can be created to dequeue from an existing Streams queue or from their own queues. Inbound server processes are created with their own Streams queue, which can then be configured for use via normal Streams Apply rule creation. These server processes are accessed via the OCI or Java API function interfaces.
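As a rough sketch of the server-side setup (server and queue names are illustrative; the exact overloads of these subprograms should be checked in the Oracle Database XSTREAMS Guide):

```sql
-- Hedged sketch: create an outbound server streaming changes for the
-- HR schema; by default this also creates its own queue and Capture.
DECLARE
  schemas DBMS_UTILITY.UNCL_ARRAY;
BEGIN
  schemas(1) := 'HR';
  DBMS_XSTREAM_ADM.CREATE_OUTBOUND(
    server_name  => 'XOUT',
    schema_names => schemas);
END;
/

-- Hedged sketch: create an inbound server with its own Streams queue.
BEGIN
  DBMS_XSTREAM_ADM.CREATE_INBOUND(
    server_name => 'XIN',
    queue_name  => 'strmadmin.xin_queue');
END;
/
```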

XSTREAMS provides the ability to share information across a heterogeneous environment with excellent flexibility, usability, and performance. This functionality can be used as an alternative to replicating with Oracle Heterogeneous Gateways and Java Message Service (JMS) clients.

Note

For more information on XSTREAMS, please reference the Oracle Database XSTREAMS Guide.

Special Documentation Notes:

The DBMS_XSTREAM_ADM package is documented in the Oracle Database XSTREAMS Guide rather than the Oracle PL/SQL Packages and Types Reference.

XSTREAM OCI interfaces are found in the Oracle Database XSTREAMS Guide.

XSTREAM Java APIs are found in the XSTREAM Java API Reference.

 

Summary


This is the foundation chapter for the rest of this book. We went over the basics of what Streams can do and the background processes that make up a Streams environment: Capture, Propagate, and Apply. Knowing each of these processes well is crucial to implementing a robust, production-level Streams environment; not knowing them can cause difficulties when setting up Streams, and problems down the road. Instantiation was covered in this chapter not only to explain what it is, but also to stress its importance in the Streams environment and its role in SCN synchronization.

Streams is also dependent on the hardware and network that it runs on. Having properly sized hardware is within a DBA's influence. The network is often beyond the DBA's control. As the network is so crucial to Streams, we included discussion on the impacts of network speed and stability on Streams performance.

We also went into what an LCR is and what it does. An LCR contains the DML and/or DDL changes that are sent from source to target. Not all DML and DDL is supported by Streams, and we showed how to quickly identify which objects are supported and which are not (query DBA_STREAMS_UNSUPPORTED).
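That check is a one-line query worth keeping handy:

```sql
-- List objects that a Streams Capture process cannot capture, and why:
SELECT owner, table_name, reason, auto_filtered
  FROM dba_streams_unsupported
 ORDER BY owner, table_name;
```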

We began to look at the Oracle architecture (and the components needed to support Streams) and at Streams performance. Chapter 7 will cover capturing Streams performance statistics, because the topic deserves its own special place, and Chapter 3 will go into detail about configuring the database to support optimal performance. Finally, we focused on two new Oracle 11gR2 features: Streams Change tables and XSTREAMS.

Now that we have reviewed the components of Oracle Streams Replication, let us move on to Chapter 2 where we will explore the many, many things to consider and options available when designing your Streamed environment.

About the Authors

  • Ann L. R. McKinnell

Ann McKinnell is currently a Senior Principal Consultant with APG Technologies, LLC and has been an OCP since Oracle 7.3.4. She came to APG with over 8 years as a senior technical member of Oracle Global Support, specializing in Replication and Distributed System technologies. She was a recognized global technical expert for Oracle Distributed Systems, earning the internal nickname "The Replication Goddess". Ann has trained Oracle Support and Consulting personnel from many countries in Advanced Replication and Distributed System internals and problem-solving techniques. She has authored and co-authored many of the Oracle Advanced Replication notes found on Oracle Metalink, and was a technical reviewer for the Oracle University Pilot 9i Streams course material, as well as various Oracle Replication and Database Administration user manuals. Ann continues to specialize in practical implementation strategies and the development of distributed Oracle database systems, database architecture, and software and database design and integration.

  • Eric Yen

Eric Yen began working with Oracle Databases at version 7.3.4. Over the following 14 years, he obtained his Oracle DBA Certification starting with version 8, maintaining it up to the current release, and he has also earned the (ISC)2 CISSP certification. Eric began working with Oracle Streams with the Oracle 9i Streams beta. As a Senior Principal Consultant with APG Technologies, LLC, Eric's work includes designing and implementing Streams solutions for Government clients using the more recent versions of Streams in Oracle 10 and Oracle 11. In what little spare time he has, you can find Eric exercising and tinkering with Oracle products.
