Oracle GoldenGate 11g: Performance Tuning

by John P Jeffries | March 2011 | Enterprise Articles Oracle

Performance tuning is one of the main aspects of any IT project. Many leave it to the end, only to realize that the necessary changes cannot be made without significant additional investment or time. Performance must be considered at the beginning and throughout the lifetime of your project. Closely coupled to the design, this article homes in on individual performance tuning methods.

In this article by John P. Jeffries, author of Oracle GoldenGate Implementer's Guide, we learn the following:

  • Balancing load across multiple processes
  • Splitting large or transaction-intensive tables across parallel process groups
  • Adding additional Replicats to a process group
  • Improving Replicat throughput by reducing commit delay
  • Exploring the new features of GoldenGate 11.1.1
  • Tuning the network

 


Oracle states that GoldenGate can achieve near real-time data replication. However, out of the box, GoldenGate may not meet your performance requirements. Here we focus on the main areas that lend themselves to tuning, especially parallel processing and load balancing, enabling high data throughput and very low latency.

Let's start by taking a look at some of the considerations before we start tuning Oracle GoldenGate.

Before tuning GoldenGate

There are a number of considerations to be aware of before we start the tuning process. For one, we must consider the underlying system and its ability to perform. Let's start by looking at the source of data that GoldenGate needs for replication to work: the online redo logs.

Online redo

Before we start tuning GoldenGate, we must look at both the source and target databases and their ability to read and write data. Data replication is I/O intensive, so fast disks are important, particularly for the online redo logs. Redo logs play an important role in GoldenGate: they are constantly written to by the database and concurrently read by the Extract process. Furthermore, adding supplemental logging to a database can increase the volume of redo generated by a factor of four!

Firstly, ensure that only the necessary amount of supplemental logging is enabled on the database. In the case of GoldenGate, the logging of the Primary Key is all that is required.

Next, take a look at the database wait events, particularly those relating to redo. For example, "log file sync" waits indicate that your disk writes are too slow, that your application is committing too frequently, or a combination of both. RAID 5 is another common problem for redo log writes. Ideally, these files should be placed on their own mirrored storage, such as RAID 1+0 (mirrored striped sets) or flash disks. Many argue this is a misconception with modern high-speed disk arrays, but some production systems are still known to suffer from redo I/O contention on RAID 5.

An adequate number (and size) of redo groups must be configured to prevent "checkpoint not complete" or "cannot allocate new log" warnings appearing in the database instance alert log. These occur when Oracle attempts to reuse a log file, but the checkpoint that would flush the associated dirty blocks from the database buffer cache to disk has not yet completed, so the log is still required for crash recovery. The database must wait until that checkpoint completes before the online redo log file can be reused, effectively stalling the database and all redo generation.
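As a back-of-the-envelope check, online redo logs are often sized so that a log switch occurs only every 15-20 minutes at peak load. The sketch below applies that rule of thumb; the target interval and the function name are illustrative assumptions, not an Oracle formula:

```python
# Rough sizing sketch: make each online redo log large enough that it
# fills in roughly `target_switch_minutes` at peak redo generation.
# The 15-minute target is a common rule of thumb, not an Oracle mandate.
def recommended_log_size_mb(peak_redo_mb_per_sec, target_switch_minutes=15):
    return peak_redo_mb_per_sec * target_switch_minutes * 60

# Example: a system generating 2 MB/s of redo at peak would want
# online redo log files of about 1800 MB each.
print(recommended_log_size_mb(2.0))  # 1800.0
```

Remember that supplemental logging inflates the peak redo rate, so measure it with GoldenGate's logging already enabled.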

Large objects (LOBs)

Know your data. LOBs can be problematic in data replication by virtue of their size and the cost of extracting, transmitting, and delivering the data from source to target. Tables containing LOB datatypes should be isolated from regular data and given a dedicated Extract, Data Pump, and Replicat process group to enhance throughput. Also ensure that the target table has a primary key to avoid full table scans (FTS), an Oracle GoldenGate best practice. A LOB INSERT operation may first insert an empty (null) LOB into the row and then update it with the data, because a LOB (depending on its size) can spread its data across multiple Logical Change Records, resulting in multiple DML operations at the target database.
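This multi-record behaviour can be pictured with a small sketch. The chunk size below is purely hypothetical (GoldenGate's actual LCR sizing is internal); it only illustrates why one source LOB INSERT can fan out into several target operations:

```python
# Hypothetical illustration: an initial empty-LOB insert plus one
# chunked update per Logical Change Record the LOB data spans.
# The 32 KB chunk size is made up for the example.
def lcr_operations_for_lob(lob_bytes, chunk_bytes=32 * 1024):
    chunks = -(-lob_bytes // chunk_bytes)  # ceiling division
    return 1 + chunks                      # empty insert + chunk updates

print(lcr_operations_for_lob(100 * 1024))  # 5
```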

Baselining

Before we can start tuning, we must record our baseline. This will provide a reference point to tune from. We can later look back at our baseline and calculate the percentage improvement made from deploying new configurations.

An ideal baseline is to find the "breaking point" of your application requirements. For example, the following questions must be answered:

  1. What is the maximum acceptable end-to-end latency?
  2. What are the maximum application transactions per second we must accommodate?

To answer these questions, we must start with a single-threaded data replication configuration having just one Extract, one Data Pump, and one Replicat process. This provides a worst-case scenario on which to build improvements.

Ideally, our data source should be the application itself, inserting, updating, and deleting "real data" in the source database. However, simulated data with the ability to provide throughput profiles will allow us to gauge performance accurately. Application vendors can normally provide SQL injector utilities that simulate user activity on the system.
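Once the baseline run has been recorded, the improvement calculation itself is simple arithmetic. A minimal sketch, with an illustrative function name and made-up figures:

```python
# Express a tuned run against the recorded baseline as a percentage.
def pct_change(baseline, current):
    return (current - baseline) / baseline * 100.0

# Illustrative figures: throughput doubled; end-to-end lag fell 12s -> 3s.
print(pct_change(850.0, 1700.0))  # 100.0  (+100% transactions per second)
print(pct_change(12.0, 3.0))      # -75.0  (latency reduced by 75%)
```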

Balancing the load across parallel process groups

The GoldenGate documentation states: "The most basic thing you can do to improve GoldenGate's performance is to divide a large number of tables among parallel processes and trails. For example, you can divide the load by schema." This statement holds because the bottleneck is largely the serial nature of the Replicat process, which must "replay" transactions in commit order. Although transaction dependencies can be a constraining factor, increasing the number of Replicat processes can improve performance significantly. However, it is highly recommended to group tables with referential constraints together in the same Replicat.

The number of parallel processes is typically greater on the target system than on the source. The number and ratio of processes will vary across applications and environments. Each configuration should be thoroughly tested to determine the optimal balance, but be careful not to over-allocate: each parallel process can consume up to 55 MB of memory. Increasing the number of processes to an arbitrary value will not necessarily improve performance; in fact, it may make things worse while wasting CPU and memory resources.

The following data flow diagram shows a load balancing configuration including two Extract processes, three Data Pumps, and five Replicats:

[Figure: Oracle GoldenGate 11g load balancing data flow]

Considerations for using parallel process groups

To maintain data integrity, ensure that tables with referential constraints between one another are included in the same parallel process group. It is also worth considering disabling referential constraints on the target database schema, allowing child records to be populated before their parents and thus increasing throughput. GoldenGate always commits transactions in the same order as the source, so data integrity is maintained.

Oracle best practice states that no more than three Replicat processes should read the same remote trail file. To avoid contention on trail files, pair each Replicat with its own trail files and Extract process. Also, remember that it is easier to tune an Extract process than a Replicat process, so concentrate on the source before moving your focus to the target.

Splitting large tables into row ranges across process groups

What if you have some large tables with a high data change rate within a source schema and you cannot logically separate them from the remaining tables due to referential constraints? GoldenGate provides a solution to this problem by "splitting" the data within the same schema via the @RANGE function. The @RANGE function can be used in the Data Pump and Replicat configuration to "split" the transaction data across a number of parallel processes.

The Replicat process is typically the source of performance bottlenecks because, in its normal mode of operation, it is a single-threaded process that applies operations one at a time using regular DML. Therefore, to leverage parallelism and enhance throughput, the more Replicats the better (dependent on the number of CPUs and the memory available on the target system).

The RANGE function

The @RANGE function computes a hash value from the columns specified in its input; if no columns are specified, it uses the table's primary key. GoldenGate adjusts the hash to distribute rows as evenly as possible across the number of ranges specified. The concept is comparable to hash partitioning in Oracle tables as a means of dividing data.
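The behaviour can be mimicked in a few lines. GoldenGate's internal hash is not published, so zlib.crc32 stands in for it here; the point is only that hashing the key gives a deterministic, roughly even split:

```python
import zlib

# Emulation of @RANGE-style splitting: hash the key columns, then map
# the hash onto one of `total_ranges` buckets (numbered from 1).
# zlib.crc32 is a stand-in; GoldenGate's actual hash is internal.
def range_bucket(key_values, total_ranges):
    h = zlib.crc32("|".join(str(v) for v in key_values).encode())
    return h % total_ranges + 1

# The same primary key always hashes to the same bucket, so every
# operation on a given row is applied by the same Replicat.
assert range_bucket([1001], 2) == range_bucket([1001], 2)
print(sorted({range_bucket([pk], 2) for pk in range(1, 1001)}))  # [1, 2]
```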

With any division of data during replication, integrity is paramount and will have an effect on performance. Therefore, tables having a relationship with other tables in the source schema must be included in the same configuration. If all your source schema tables are related, you must include all of them!

Adding Replicats with @RANGE function

The @RANGE function accepts two numeric arguments, separated by a comma:

  1. Range: The number assigned to a process group, where the first group is 1, the second is 2, and so on, up to the total number of ranges.
  2. Total number of ranges: The total number of process groups you wish to divide using the @RANGE function.

The following example includes three related tables in the source schema and walks through the complete configuration from start to finish.

For this example, we have an existing Replicat process on the target machine (dbserver2) named ROLAP01 that includes the following three tables:

  • ORDERS
  • ORDER_ITEMS
  • PRODUCTS

We are going to divide the rows of the tables across two Replicat groups. The source database schema name is SRC and target schema TGT. The following steps add a new Replicat named ROLAP02 with the relevant configuration and adjusts Replicat ROLAP01 parameters to suit.

Note that before making any changes, stop the existing Replicat processes and record their trail file log sequence number (SEQNO) and Relative Byte Address (RBA). This information tells the new Replicat process the point from which to start.
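If you script this step, the SEQNO and RBA can be scraped from the info output. The helper below is hypothetical (the exact report layout can vary between GoldenGate versions) and is matched against the output shown in step 7:

```python
import re

# Hypothetical helper: extract the trail SEQNO and RBA from the text
# printed by "info replicat <name>, detail", for later use with
# "alter replicat ..., extseqno N" and "alter replicat ..., extrba M".
def parse_checkpoint(info_output):
    seqno = int(re.search(r"Log Read Checkpoint\s+File \S*?(\d+)\s*$",
                          info_output, re.M).group(1))
    rba = int(re.search(r"RBA (\d+)", info_output).group(1))
    return seqno, rba

sample = ("Log Read Checkpoint File ./dirdat/tb000279\n"
          "2010-04-08 12:27:00.001016 RBA 43750979")
print(parse_checkpoint(sample))  # (279, 43750979)
```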

  1. First check if the existing Replicat process is running:

    GGSCI (dbserver2) 1> info all

    Program Status Group Lag Time Since Chkpt
    MANAGER RUNNING
    REPLICAT RUNNING ROLAP01 00:00:00 00:00:02

  2. Stop the existing Replicat process:
    GGSCI (dbserver2) 2> stop REPLICAT ROLAP01
    Sending STOP request to REPLICAT ROLAP01...
    Request processed.
  3. Add the new Replicat process, using the existing trail file.
    GGSCI (dbserver2) 3> add REPLICAT ROLAP02, exttrail ./dirdat/tb
    REPLICAT added.
  4. Now add the configuration by creating a new parameter file for ROLAP02.
    GGSCI (dbserver2) 4> edit params ROLAP02

    --
    -- Example Replicator parameter file to apply changes
    -- to target tables
    --
    REPLICAT ROLAP02
    SOURCEDEFS ./dirdef/mydefs.def
    SETENV (ORACLE_SID=OLAP)
    USERID ggs_admin, PASSWORD ggs_admin
    DISCARDFILE ./dirrpt/rolap02.dsc, PURGE
    ALLOWDUPTARGETMAP
    CHECKPOINTSECS 30
    GROUPTRANSOPS 2000

    MAP SRC.ORDERS, TARGET TGT.ORDERS, FILTER (@RANGE (1,2));
    MAP SRC.ORDER_ITEMS, TARGET TGT.ORDER_ITEMS, FILTER (@RANGE (1,2));
    MAP SRC.PRODUCTS, TARGET TGT.PRODUCTS, FILTER (@RANGE (1,2));
  5. Now edit the configuration of the existing Replicat process, and add the @RANGE function to the FILTER clause of the MAP statement. Note the inclusion of the GROUPTRANSOPS parameter to enhance performance by increasing the number of operations allowed in a Replicat transaction.
    GGSCI (dbserver2) 5> edit params ROLAP01

    --
    -- Example Replicator parameter file to apply changes
    -- to target tables
    --
    REPLICAT ROLAP01
    SOURCEDEFS ./dirdef/mydefs.def
    SETENV (ORACLE_SID=OLAP)
    USERID ggs_admin, PASSWORD ggs_admin
    DISCARDFILE ./dirrpt/rolap01.dsc, PURGE
    ALLOWDUPTARGETMAP
    CHECKPOINTSECS 30
    GROUPTRANSOPS 2000
    MAP SRC.ORDERS, TARGET TGT.ORDERS, FILTER (@RANGE (2,2));
    MAP SRC.ORDER_ITEMS, TARGET TGT.ORDER_ITEMS, FILTER (@RANGE (2,2));
    MAP SRC.PRODUCTS, TARGET TGT.PRODUCTS, FILTER (@RANGE (2,2));
  6. Check that both the Replicat processes exist.

    GGSCI (dbserver2) 6> info all
    Program Status Group Lag Time Since Chkpt
    MANAGER RUNNING
    REPLICAT STOPPED ROLAP01 00:00:00 00:10:35
    REPLICAT STOPPED ROLAP02 00:00:00 00:12:25

  7. Before starting both Replicat processes, obtain the log Sequence Number (SEQNO) and Relative Byte Address (RBA) from the original trail file.

    GGSCI (dbserver2) 7> info REPLICAT ROLAP01, detail
    REPLICAT ROLAP01 Last Started 2010-04-01 15:35 Status STOPPED
    Checkpoint Lag 00:00:00 (updated 00:12:43 ago)
    Log Read Checkpoint File ./dirdat/tb000279 <- SEQNO
    2010-04-08 12:27:00.001016 RBA 43750979 <- RBA
    Extract Source Begin End
    ./dirdat/tb000279 2010-04-01 12:47 2010-04-08 12:27
    ./dirdat/tb000257 2010-04-01 04:30 2010-04-01 12:47
    ./dirdat/tb000255 2010-03-30 13:50 2010-04-01 04:30
    ./dirdat/tb000206 2010-03-30 13:50 First Record
    ./dirdat/tb000206 2010-03-30 04:30 2010-03-30 13:50
    ./dirdat/tb000184 2010-03-30 04:30 First Record
    ./dirdat/tb000184 2010-03-30 00:00 2010-03-30 04:30
    ./dirdat/tb000000 *Initialized* 2010-03-30 00:00
    ./dirdat/tb000000 *Initialized* First Record

  8. Adjust the new Replicat process ROLAP02 to adopt these values, so that the process knows where to start from on startup.
    GGSCI (dbserver2) 8> alter replicat ROLAP02, extseqno 279
    REPLICAT altered.

    GGSCI (dbserver2) 9> alter replicat ROLAP02, extrba 43750979
    REPLICAT altered.

    Failure to complete this step will result in duplicate data or ORA-00001 errors against the target schema, because GoldenGate will attempt to replicate the data from the beginning of the initial trail file (./dirdat/tb000000) if it still exists; otherwise, the process will abend.

  9. Start both Replicat processes. Note the use of the wildcard (*).
    GGSCI (dbserver2) 10> start replicat ROLAP*

    Sending START request to MANAGER ...
    REPLICAT ROLAP01 starting

    Sending START request to MANAGER ...
    REPLICAT ROLAP02 starting
  10. Check if both Replicat processes are running.

    GGSCI (dbserver2) 11> info all
    Program Status Group Lag Time Since Chkpt
    MANAGER RUNNING
    REPLICAT RUNNING ROLAP01 00:00:00 00:00:22
    REPLICAT RUNNING ROLAP02 00:00:00 00:00:14

  11. Check the detail of the new Replicat processes.

    GGSCI (dbserver2) 12> info REPLICAT ROLAP02, detail
    REPLICAT ROLAP02 Last Started 2010-04-08 14:18 Status RUNNING
    Checkpoint Lag 00:00:00 (updated 00:00:06 ago)
    Log Read Checkpoint File ./dirdat/tb000279
    First Record RBA 43750979
    Extract Source Begin End
    ./dirdat/tb000279 * Initialized * First Record
    ./dirdat/tb000279 * Initialized * First Record
    ./dirdat/tb000279 * Initialized * 2010-04-08 12:26
    ./dirdat/tb000279 * Initialized * First Record

  12. Generate a report for the new Replicat process ROLAP02.
    GGSCI (dbserver2) 13> send REPLICAT ROLAP02, report

    Sending REPORT request to REPLICAT ROLAP02 ...
    Request processed.
  13. Now view the report to confirm that the new Replicat process has started from the specified start point (RBA 43750979 and SEQNO 279). The following is an extract from the report:
    GGSCI (dbserver2) 14> view report ROLAP02
    2010-04-08 14:20:18 GGS INFO 379 Positioning with begin
    time: Apr 08, 2010 14:18:19 PM, starting record time: Apr 08, 2010
    14:17:25 PM at extseqno 279, extrba 43750979.

 


Configuring multiple parallel process groups

Taking the parallel concept one step further, we can not only add parallel process groups at the target, but also at the source. The diagram at the start of this article gives an overview configuration, showing multiple parallel processes from the Data Pumps on the source, feeding data to dedicated Replicat processes, where the data is "split" again into more parallel threads.

Based on the following diagram, we will configure multiple process groups from scratch for one Extract process to enhance transaction throughput and reduce lag times:

[Figure: Oracle GoldenGate 11g multiple parallel process group configuration]

In the following configuration example, the source database name is OLTP and its source schema name is SRC. The target database name is OLAP and its target schema name is TGT.

Source system configuration

The following section specifies an example configuration for parallel data processing that will be used to leverage the performance of the GoldenGate data capture and routing process.

Parallel process parameter files

  1. Place the following EXTRACT1.prm file in the dirprm sub-directory of the GoldenGate home on the source system:
    -- EXTRACT1.prm
    --
    -- Change Capture parameter file to capture
    -- SRC table changes
    --
    EXTRACT EXTRACT1
    SETENV (ORACLE_SID=oltp)
    USERID ggs_admin, PASSWORD ggs_admin
    EXTTRAIL ./dirdat/sa
    TRANLOGOPTIONS ASMUSER SYS@ASM, ASMPASSWORD Welcome
    TABLE SRC.CREDITCARD_ACCOUNTS;
    TABLE SRC.CREDITCARD_PAYMENTS;
    TABLE SRC.CREDITCARD_PAYMENTS_HISTORY;
    TABLE SRC.CREDITCARD_PAYMENTS_STATUS;
  2. Place the following DATAPMP1.prm and DATAPMP2.prm files in the dirprm sub-directory of the GoldenGate home on the source system:
    -- DATAPMP1.prm
    --
    -- Data Pump parameter file to read the local
    -- trail sa for SRC table changes and write to
    -- remote trail ta
    --
    EXTRACT DATAPMP1
    SETENV (ORACLE_SID=oltp)
    USERID ggs_admin, PASSWORD ggs_admin
    RMTHOST dbserver2, MGRPORT 7810
    RMTTRAIL ./dirdat/ta
    TABLE SRC.CREDITCARD_ACCOUNTS, FILTER (@RANGE (1,2));
    TABLE SRC.CREDITCARD_PAYMENTS, FILTER (@RANGE (1,2));
    TABLE SRC.CREDITCARD_PAYMENTS_HISTORY, FILTER (@RANGE (1,2));
    TABLE SRC.CREDITCARD_PAYMENTS_STATUS, FILTER (@RANGE (1,2));
    -- DATAPMP2.prm
    --
    -- Data Pump parameter file to read the local
    -- trail sa for SRC table changes and write to
    -- remote trail tb
    --
    EXTRACT DATAPMP2
    SETENV (ORACLE_SID=oltp)
    USERID ggs_admin, PASSWORD ggs_admin
    RMTHOST dbserver2, MGRPORT 7810
    RMTTRAIL ./dirdat/tb
    TABLE SRC.CREDITCARD_ACCOUNTS, FILTER (@RANGE (2,2));
    TABLE SRC.CREDITCARD_PAYMENTS, FILTER (@RANGE (2,2));
    TABLE SRC.CREDITCARD_PAYMENTS_HISTORY, FILTER (@RANGE (2,2));
    TABLE SRC.CREDITCARD_PAYMENTS_STATUS, FILTER (@RANGE (2,2));

Now that the Extract process parameter files have been created in the dirprm subdirectory, we can create the associated Extract process groups.

Creating extract parallel process groups

  1. Use the following Obey file to prepare and configure your source system for multiple Data Pump process groups:
    -- config_source.oby
    -- Database Authentication Connection
    DBLOGIN USERID ggs_admin@oltp, PASSWORD ggs_admin
    -- Turning on Data Capture Changes on all Tables
    ADD TRANDATA SRC.CREDITCARD_ACCOUNTS
    ADD TRANDATA SRC.CREDITCARD_PAYMENTS
    ADD TRANDATA SRC.CREDITCARD_PAYMENTS_HISTORY
    ADD TRANDATA SRC.CREDITCARD_PAYMENTS_STATUS
    -- Verify that supplemental log has been switched on
    INFO TRANDATA SRC.CREDITCARD_ACCOUNTS
    INFO TRANDATA SRC.CREDITCARD_PAYMENTS
    INFO TRANDATA SRC.CREDITCARD_PAYMENTS_HISTORY
    INFO TRANDATA SRC.CREDITCARD_PAYMENTS_STATUS
    -- Adding the extract group for the capture
    ADD EXTRACT extract1, TRANLOG, BEGIN NOW
    -- Defining the local trail files for capture
    ADD EXTTRAIL ./dirdat/sa, EXTRACT extract1, MEGABYTES 500
    -- Check status of all running processes
    INFO ALL
    -- Adding the extract group for the pump
    ADD EXTRACT datapmp1, EXTTRAILSOURCE ./dirdat/sa
    -- Defining the remote trail files for pump
    ADD RMTTRAIL ./dirdat/ta, EXTRACT datapmp1, MEGABYTES 500
    -- Adding the extract group for the pump
    ADD EXTRACT datapmp2, EXTTRAILSOURCE ./dirdat/sa
    -- Defining the remote trail files for pump
    ADD RMTTRAIL ./dirdat/tb, EXTRACT datapmp2, MEGABYTES 500
    -- Start extract and data pump processes
    START EXTRACT *
    -- Check status of all running processes
    INFO ALL
  2. Place the config_source.oby file in the dirprm sub-directory of the GoldenGate home on the source system.
  3. To execute the Obey file, call it from the GGSCI command line.
    GGSCI (dbserver1) 1> obey ./dirprm/config_source.oby
  4. Check if the Extract processes are running.

    GGSCI (dbserver1) 93> info all
    Program Status Group Lag Time Since Chkpt
    MANAGER RUNNING
    EXTRACT RUNNING DATAPMP1 00:00:00 00:00:03
    EXTRACT RUNNING DATAPMP2 00:00:00 00:00:02
    EXTRACT RUNNING EXTRACT1 00:00:00 00:00:03

That concludes the source system configuration. From the example output, we can see that the Extract processes have been started and are running. Let's now configure the target system.

Target system configuration

The following section specifies an example configuration for parallel data processing that will be used to leverage the performance of the GoldenGate data delivery process.

Parallel process parameter files

  1. Place the following REPLCAT1.prm, REPLCAT2.prm, REPLCAT3.prm and REPLCAT4.prm files in the dirprm sub-directory of the GoldenGate home on the target system:
    -- REPLCAT1.prm
    --
    -- Replicator parameter file to read remote trail ta
    -- and apply changes to TGT tables
    --
    REPLICAT REPLCAT1
    SOURCEDEFS ./dirdef/oltp.def
    SETENV (ORACLE_SID=olap)
    USERID ggs_admin, PASSWORD ggs_admin
    DISCARDFILE ./dirrpt/replcat1.dsc, PURGE
    ALLOWDUPTARGETMAP
    CHECKPOINTSECS 30
    GROUPTRANSOPS 2000
    BATCHSQL
    MAP SRC.CREDITCARD_ACCOUNTS, TARGET TGT.CREDITCARD_ACCOUNTS, FILTER (@RANGE (1,2));
    MAP SRC.CREDITCARD_PAYMENTS, TARGET TGT.CREDITCARD_PAYMENTS, FILTER (@RANGE (1,2));
    MAP SRC.CREDITCARD_PAYMENTS_HISTORY, TARGET TGT.CREDITCARD_PAYMENTS_HISTORY, FILTER (@RANGE (1,2));
    MAP SRC.CREDITCARD_PAYMENTS_STATUS, TARGET TGT.CREDITCARD_PAYMENTS_STATUS, FILTER (@RANGE (1,2));

    -- REPLCAT2.prm
    --
    -- Replicator parameter file to read remote trail ta
    -- and apply changes to TGT tables
    --
    REPLICAT REPLCAT2
    SOURCEDEFS ./dirdef/oltp.def
    SETENV (ORACLE_SID=olap)
    USERID ggs_admin, PASSWORD ggs_admin
    DISCARDFILE ./dirrpt/replcat2.dsc, PURGE
    ALLOWDUPTARGETMAP
    CHECKPOINTSECS 30
    GROUPTRANSOPS 2000
    BATCHSQL
    MAP SRC.CREDITCARD_ACCOUNTS, TARGET TGT.CREDITCARD_ACCOUNTS, FILTER (@RANGE (2,2));
    MAP SRC.CREDITCARD_PAYMENTS, TARGET TGT.CREDITCARD_PAYMENTS, FILTER (@RANGE (2,2));
    MAP SRC.CREDITCARD_PAYMENTS_HISTORY, TARGET TGT.CREDITCARD_PAYMENTS_HISTORY, FILTER (@RANGE (2,2));
    MAP SRC.CREDITCARD_PAYMENTS_STATUS, TARGET TGT.CREDITCARD_PAYMENTS_STATUS, FILTER (@RANGE (2,2));

    -- REPLCAT3.prm
    --
    -- Replicator parameter file to read remote trail tb
    -- and apply changes to TGT tables
    --
    REPLICAT REPLCAT3
    SOURCEDEFS ./dirdef/oltp.def
    SETENV (ORACLE_SID=olap)
    USERID ggs_admin, PASSWORD ggs_admin
    DISCARDFILE ./dirrpt/replcat3.dsc, PURGE
    ALLOWDUPTARGETMAP
    CHECKPOINTSECS 30
    GROUPTRANSOPS 2000
    BATCHSQL
    MAP SRC.CREDITCARD_ACCOUNTS, TARGET TGT.CREDITCARD_ACCOUNTS, FILTER (@RANGE (1,2));
    MAP SRC.CREDITCARD_PAYMENTS, TARGET TGT.CREDITCARD_PAYMENTS, FILTER (@RANGE (1,2));
    MAP SRC.CREDITCARD_PAYMENTS_HISTORY, TARGET TGT.CREDITCARD_PAYMENTS_HISTORY, FILTER (@RANGE (1,2));
    MAP SRC.CREDITCARD_PAYMENTS_STATUS, TARGET TGT.CREDITCARD_PAYMENTS_STATUS, FILTER (@RANGE (1,2));

    -- REPLCAT4.prm
    --
    -- Replicator parameter file to read remote trail tb
    -- and apply changes to TGT tables
    --
    REPLICAT REPLCAT4
    SOURCEDEFS ./dirdef/oltp.def
    SETENV (ORACLE_SID=olap)
    USERID ggs_admin, PASSWORD ggs_admin
    DISCARDFILE ./dirrpt/replcat4.dsc, PURGE
    ALLOWDUPTARGETMAP
    CHECKPOINTSECS 30
    GROUPTRANSOPS 2000
    BATCHSQL
    MAP SRC.CREDITCARD_ACCOUNTS, TARGET TGT.CREDITCARD_ACCOUNTS, FILTER (@RANGE (2,2));
    MAP SRC.CREDITCARD_PAYMENTS, TARGET TGT.CREDITCARD_PAYMENTS, FILTER (@RANGE (2,2));
    MAP SRC.CREDITCARD_PAYMENTS_HISTORY, TARGET TGT.CREDITCARD_PAYMENTS_HISTORY, FILTER (@RANGE (2,2));
    MAP SRC.CREDITCARD_PAYMENTS_STATUS, TARGET TGT.CREDITCARD_PAYMENTS_STATUS, FILTER (@RANGE (2,2));

Now that we have created the Replicat parameter files and placed them in the dirprm sub-directory, we can create the associated Replicat process groups.

Creating Replicat parallel process groups

  1. Use the following Obey file to prepare and configure your target system for multiple Replicat process groups:
    -- config_target.oby

    -- Login to Database
    dblogin userid ggs_admin@olap, password ggs_admin

    -- Adds Checkpoint Table
    add checkpointtable GGS_ADMIN.GGSCHKPT

    -- Adding the replicat group for the delivery
    ADD REPLICAT replcat1, EXTTRAIL ./dirdat/ta, CHECKPOINTTABLE GGS_ADMIN.GGSCHKPT

    -- Adding the replicat group for the delivery
    ADD REPLICAT replcat2, EXTTRAIL ./dirdat/ta, CHECKPOINTTABLE GGS_ADMIN.GGSCHKPT

    -- Adding the replicat group for the delivery
    ADD REPLICAT replcat3, EXTTRAIL ./dirdat/tb, CHECKPOINTTABLE GGS_ADMIN.GGSCHKPT

    -- Adding the replicat group for the delivery
    ADD REPLICAT replcat4, EXTTRAIL ./dirdat/tb, CHECKPOINTTABLE GGS_ADMIN.GGSCHKPT

    -- Starting the replicat groups
    START REPLICAT *

    -- Check status of all running processes
    INFO ALL
  2. Place the config_target.oby file in the dirprm sub-directory of the GoldenGate home on the target system.
  3. To execute the Obey file, call it from the GGSCI command line.
    GGSCI (dbserver2) 1> obey ./dirprm/config_target.oby
  4. Check if the Replicat processes are running.

    GGSCI (dbserver2) 119> info all
    Program Status Group Lag Time Since Chkpt
    MANAGER RUNNING
    REPLICAT RUNNING REPLCAT1 00:00:00 00:00:06
    REPLICAT RUNNING REPLCAT2 00:00:00 00:00:06
    REPLICAT RUNNING REPLCAT3 00:00:00 00:00:06
    REPLICAT RUNNING REPLCAT4 00:00:00 00:00:06

That concludes the target system configuration. From the example output, we can see that the Replicat processes have been started and are running. You can now enjoy high performance data replication from the OLTP to OLAP databases.

Improving Replicat throughput

Replicat performance can be further improved by altering the way GoldenGate commits the transaction on the target database. By default, Oracle will wait for a commit to succeed before allowing the session to continue. However, this synchronous behavior can cause unnecessary delays when the workload is high.

To alleviate this bottleneck, we can configure our Replicat processes to commit asynchronously at session level by including the following SQLEXEC statement in each parameter file:

SQLEXEC "alter session set commit_wait = 'NOWAIT'";

Note that specifying NOWAIT opens a small window of vulnerability in the following cases:

  • The database instance crashes, losing redo that was buffered but not yet written to the online redo logs
  • A file I/O problem prevents the log writer from writing buffered redo to disk

Don't be alarmed: in both cases, GoldenGate will automatically "replay" the uncommitted transactions following database instance crash recovery, driven by the information stored in the Checkpoint table.

Tuning the network

Another key area to focus on is the network. A poorly performing network will cause high latencies and possible disconnections. One easy way to determine whether your GoldenGate implementation is suffering from network delays is to check that the processes' write checkpoints are increasing at the same rate. For example, if your primary Extract process is checkpointing frequently compared to the Data Pump process, this indicates that the Data Pump cannot write to the remote trail quickly enough. So what can be done about it?
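One way to quantify this is to sample the Extract's write position and the Data Pump's read position and watch the gap between them. The helper and figures below are illustrative; in practice, the positions would come from GGSCI checkpoint output:

```python
# Illustrative backlog check: bytes the Data Pump still has to ship.
# A gap that grows between samples suggests the pump (or the network)
# cannot keep up with the primary Extract. Figures are made up.
def pump_backlog(extract_write_rba, pump_read_rba):
    return extract_write_rba - pump_read_rba

sample_1 = pump_backlog(52_000_000, 51_500_000)  # 500,000 bytes behind
sample_2 = pump_backlog(60_000_000, 53_000_000)  # 7,000,000 bytes behind
print(sample_2 > sample_1)  # True -> the backlog is growing
```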

We can look at increasing data throughput by adjusting OS TCP socket buffers, TCP packet sizes, transmission queue lengths, and so on. Let's take a look at some of the common network tuning parameters and tools.

Linux TCP tuning

Like most operating systems, Linux ships with default maximum kernel settings for networking that are far too small for high throughput. For example, Linux kernel 2.6 has a maximum TCP buffer size of 256 KB for both send and receive. During high throughput, the buffer overflows and packets are dropped; TCP then retransmits them, incurring a performance overhead.
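The buffer a link actually needs can be estimated from the bandwidth-delay product (bandwidth multiplied by round-trip time), a standard TCP sizing rule:

```python
# Bandwidth-delay product: the TCP buffer needed to keep a link full.
def bdp_bytes(bandwidth_mbit_per_s, rtt_ms):
    bytes_per_sec = bandwidth_mbit_per_s * 1_000_000 / 8
    return int(bytes_per_sec * rtt_ms / 1000)

# A 1 Gbit/s link with a 50 ms round trip needs ~6.25 MB of buffer,
# far above a 256 KB default.
print(bdp_bytes(1000, 50))  # 6250000
```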

Oracle recommends setting the following kernel parameters to at least 4 MB; in fact, the Oracle 11g Universal Installer (OUI) pre-checks these values before allowing the installation to continue. These adjustments must be made on both the source and the target database servers.

net.core.rmem_max=4194304
net.core.wmem_max=4194304

For Oracle GoldenGate, however, these settings may need to be increased further. To apply new values, edit the system control configuration file as the root user, then invoke the sysctl -p command to update the running kernel, as shown:

[root@dbserver1 ~]# vi /etc/sysctl.conf
net.core.rmem_max=8388608
net.core.wmem_max=8388608

[root@dbserver1 ~]# sysctl -p

Setting the transmit queue length to 5000 can also be beneficial for Gigabit Ethernet Network Interface Controllers (NICs). For networks with more than a 50 ms round-trip time, a value of 5000-10000 is recommended. To increase the txqueuelen parameter, run the following command as the root user, where eth2 is the name of your NIC device:

[root@dbserver1 ~]# ifconfig eth2 txqueuelen 5000

It is also worth experimenting with NIC flow control and TCP Segmentation Offload (TSO) if performance is still an issue. Large Segment Offload (LSO) is a technique for increasing the outbound throughput of high-bandwidth network connections by reducing CPU overhead. It is supported by most modern NICs.

To check the current status of the NIC settings, execute the following commands as root.

[root@dbserver1 ~]# ethtool -a eth2
Pause parameters for eth2:
Autonegotiate: on
RX: on
TX: on
[root@dbserver1 ~]# ethtool -k eth2
Offload parameters for eth2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

To enable TCP Segmentation Offload, execute the following command as root. Note that ethtool settings do not survive a reboot; to make the change persistent, reapply the command from a startup script (on Red Hat-based systems, for example, via ETHTOOL_OPTS in the interface configuration file).

[root@dbserver1 ~]# ethtool -K eth2 tso on
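Rather than reading the `ethtool -k` listing by eye, the disabled offloads can be filtered out with a one-liner. The sample input below is the output shown earlier; the parsing itself is a generic sketch (against a live system you would pipe `ethtool -k eth2` directly):

```shell
#!/bin/bash
# Report any offload features that ethtool lists as "off".
# Hard-coded sample taken from the ethtool -k output shown above.
sample_output="rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off"

# Split each line on ": " and print the feature name when disabled.
echo "$sample_output" | awk -F': ' '$2 == "off" { print $1 }'
```

For the sample above this prints the four disabled features, flagging scatter-gather and the segmentation offloads as candidates for tuning.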

Typically, the network between the source and target database servers is tuned first, while the local TNS connection to the database or ASM instance is often overlooked. When using ASM as your storage solution, for example, an Extract process can achieve up to three times the throughput by using a Bequeath connection rather than TCP.

Configuring a Bequeath connection

As the Extract process runs locally on the database server, we can exploit the Bequeath connection and avoid the database Listener altogether. TCP connections are managed by the Listener, whereas BEQ connections access the redo logs directly, reading up to 1 MB from ASM per read operation. When TNS over TCP is used instead, the TCP layer "chops up" the data into packets, incurring an additional performance overhead.

The following steps describe how to configure a Bequeath connection to your ASM instance that will be used by the GoldenGate Extract process.

  1. Using a text editor, add a TNS entry, similar to the following example, for the ASM instance, on your source database server:

    vi $ORACLE_HOME/network/admin/tnsnames.ora

    ASM =
      (DESCRIPTION =
        (ADDRESS =
          (PROTOCOL = BEQ)
          (PROGRAM = /u01/app/oracle/product/11.1.0/asm/bin/oracle)
          (ARGV0 = oracle+ASM1)
          (ARGS = '(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
          (ENVS = 'ORACLE_HOME=/u01/app/oracle/product/11.1.0/asm,ORACLE_SID=+ASM1')
        )
        (CONNECT_DATA =
          (SERVICE_NAME = +ASM)
          (INSTANCE_NAME = +ASM1)
        )
      )

  2. Open a terminal session on the server as the oracle user and log on to the ASM instance using SQL*Plus with the ASM TNS alias:
    [oracle@dbserver1 ~]$ sqlplus sys/password@ASM as sysasm
  3. Run the following query to identify the OS process ID:

    SQL> select substr(s.sid,1,3) sid, substr(s.serial#,1,5) ser,
           substr(osuser,1,8) osuser, spid ospid,
           substr(status,1,3) stat, substr(command,1,3) com,
           substr(schemaname,1,10) schema,
           substr(type,1,3) typ
         from v$process p, v$sesstat t, v$sess_io i, v$session s
         where i.sid = s.sid and p.addr = s.paddr(+)
           and s.sid = t.sid and t.statistic# = 12
           and s.program like '%sqlplus%';

    SID  SER   OSUSER   OSPID STAT COM SCHEMA     TYP
    ---- ----- -------- ----- ---- --- ---------- ---
    483  24067 oracle   25322 ACT  3   SYS        USE

  4. Open a new terminal session to the same server and grep for the OS PID identified by the above query. The output confirms we are using the Bequeath connection to the +ASM1 instance.
    [oracle@dbserver1 ~]$ ps -ef | grep 25322
    oracle 25322 1 0 08:24 ? 00:00:00 oracle+ASM1
    (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))
  5. Ensure you have the following parameter configured in your Extract parameter file:
    TRANLOGOPTIONS ASMUSER SYS@ASM, ASMPASSWORD Password
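Step 4 above can be scripted as a quick check: if the session's process arguments show LOCAL=YES and PROTOCOL=BEQ, the Listener is being bypassed. The `ps` line below is hard-coded from the example output above so the logic can be demonstrated without a live ASM instance:

```shell
#!/bin/bash
# Sample line as produced by: ps -ef | grep <ospid>
ps_line="oracle 25322 1 0 08:24 ? 00:00:00 oracle+ASM1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))"

# A Bequeath session carries both markers in its process arguments.
if echo "$ps_line" | grep -q "LOCAL=YES" && echo "$ps_line" | grep -q "PROTOCOL=BEQ"; then
  echo "Bequeath connection confirmed"
else
  echo "WARNING: session is not using BEQ"
fi
```

On a live system, substitute the real OS PID from the query in step 3 and pipe the `ps` output through the same grep tests.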

Summary

A computer system is only as good as its performance. We have all experienced a slow user interface and given up waiting for a response, returning later when the system is less busy. Although Oracle GoldenGate has a reputation for being fast and efficient, we have learnt that it may still require extensive tuning, depending on the requirements and the data volumes involved. From configuring parallel processing to tuning Linux kernel parameters, this article has described the approach to adopt when implementing GoldenGate on Oracle 11g. We have also discussed the performance enhancements available in the latest release of GoldenGate, along with the tips and tricks associated with them.


About the Author :


John P Jeffries

Originally from a development background, John P Jeffries has worked for Oracle Corporation since the Siebel acquisition in 2005. His time at Siebel was spent developing ETL solutions for Data Warehouses, later moving to Oracle Consulting and then Oracle Advanced Customer Services as a Senior Principal Consultant, earning the internal title of "The UK's Data Replication Expert". Now a freelance consultant, he can be found onsite in many of the world's most respected financial institutions in London, consulting on Oracle GoldenGate, Streams, and Active Data Guard. With over 15 years of Oracle experience, and an OCP since Oracle 8i, he has extensive knowledge of Oracle databases, including Enterprise Linux and RAC, coupled with the ability to design and build high-performance distributed database systems. He has trained internal and external clients in Data Warehousing and Data Replication techniques, and continues to share his knowledge and experience through his own Website: http://www.oracle11ggotchas.com.

