Home Data Oracle GoldenGate 11g Implementer's guide

Oracle GoldenGate 11g Implementer's guide

By John P Jeffries , John P Jeffries
books-svg-icon Book
eBook $36.99 $24.99
Print $60.99
Subscription $15.99 $10 p/m for three months
$10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
BUY NOW $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
eBook $36.99 $24.99
Print $60.99
Subscription $15.99 $10 p/m for three months
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
  1. Free Chapter
    Getting Started
About this book
Data replication is an important part of any database system that is growing due to today's demand for real-time reporting and regulatory requirements. GoldenGate has recently become Oracle's strategic real-time data replication solution. Until now, very little has been written about how to implement GoldenGate in a production enterprise environment where performance, scalability, and data integrity are paramount. Your days of dismay over the lack of documentation over Oracle GoldenGate are over. Welcome to Oracle GoldenGate 11g Implementer's guide – a comprehensive practical book, which will deliver answers to your questions in a clear, concise style, allowing you to progress effectively in a timeline-driven environment. Based on the author's own experience, this long awaited GoldenGate administration book has all that is required to install, design, configure, and tune data replication solutions suited to every environment. Be the first to master GoldenGate's power and flexibility by reading this unique hands-on implementation companion. Systems need to send data from one system to another in a timely manner to satisfy the ever-increasing need for speed. Regardless of whether you are a novice or an expert – or someone in between – this book will guide you through all the steps necessary to build a high-performance GoldenGate solution on Oracle11gR1. Expert users can dive into key topic areas such as performance tuning or troubleshooting, while novice users can step through the early installation and configuration chapters, later progressing to the advanced chapters. This book is more than an implementation guide. It offers detailed real-life examples, encouraging additional thought and discussion by going beyond the manual. With Oracle GoldenGate 11g Implementer's guide in hand, you'll be designing, installing, and configuring high-performance solutions using GoldenGate in less time than you can say "replicate"
Publication date:
February 2011
Publisher
Packt
Pages
280
ISBN
9781849682008

 

Chapter 1. Getting Started

The objective of this chapter is to get you started using Oracle GoldenGate 10.4. We will discuss the history and evolution of GoldenGate Software, its success in the market and ultimate Oracle acquisition. You will become accustomed with the concepts of data replication and how GoldenGate provides enterprise-wide solutions to address the business requirements.

Although an introduction, this chapter is designed to inspire thought by drilling into the key components, processes, and considerations required to build and implement a GoldenGate solution.

In this chapter, we will discuss the following points surrounding GoldenGate:

  • The evolution of GoldenGate software

  • The technology and architecture

  • The solutions offered by GoldenGate

  • The architecture and topology of GoldenGate, plus design considerations

  • The supported platform and database versions

Let's begin by learning what GoldenGate is and what you can expect from this book.

 

What is GoldenGate?


Oracle GoldenGate is Oracle's strategic solution for real time data integration. GoldenGate software enables mission critical systems to have continuous availability and access to real-time data. It offers a fast and robust solution for replicating transactional data between operational and analytical systems.

Oracle GoldenGate captures, filters, routes, verifies, transforms, and delivers transactional data in real-time, across Oracle and heterogeneous environments with very low impact and preserved transaction integrity. The transaction data management provides read consistency, maintaining referential integrity between source and target systems.

This book aims to illustrate through example, providing the reader with solid information and tips for implementing GoldenGate software in a production environment.

In this book, we will not be making direct comparisons between Oracle GoldenGate and Oracle Streams. At the time of writing, Oracle is leveraging the advantages of GoldenGate by enhancing the product whilst continuing to fully support Streams.

As a competitor to Oracle GoldenGate, data replication products and solutions exist from other software companies and vendors. These are mainly storage replication solutions that provide fast point in time data restoration. The following is a list of the most common solutions available today:

  • EMC SRDF and EMC RecoverPoint

  • IBM PPRC and Global Mirror (known together as IBM Copy Services)

  • Hitachi TrueCopy

  • Hewlett-Packard Continuous Access (HP CA)

  • Symantec Veritas Volume Replicator (VVR)

  • DataCore SANsymphony and SANmelody

  • FalconStor Replication and Mirroring

  • Compellent Remote Instant Replay

Data replication techniques have improved enormously over the past 10 years and have always been a requirement in nearly every IT project in every industry. Whether for Disaster Recovery (DR), High Availability (HA), Business Intelligence (BI), or even regulatory reasons, the requirements and expected performance have also increased, making the implementation of efficient and scalable data replication solutions a welcome challenge.

 

Oracle GoldenGate evolution


GoldenGate Software Inc was founded in 1995. Originating in San Francisco, the company was named after the famous Golden Gate Bridge by its founders, Eric Fish and Todd Davidson. The tried and tested product that emerged quickly became very popular within the financial industry. Originally designed for the fault tolerant Tandem computers, the resilient and fast data replication solution was in demand. The banks initially used GoldenGate software in their ATM networks for sending transactional data from high street machines to mainframe central computers. The data integrity and guaranteed zero data loss is obviously paramount and plays a key factor. The key architectural properties of the product are as follows:

  • Data is sent in "real time" with sub-second speed.

  • Supports heterogeneous environments across different database and hardware types. "Transaction aware" —maintaining its read-consistent and referential integrity between source and target systems.

  • High performance with low impact; able to move large volumes of data very efficiently while maintaining very low lag times and latency.

  • Flexible modular architecture.

  • Reliable and extremely resilient to failure and data loss. No single point of failure or dependencies, and easy to recover.

Oracle Corporation acquired GoldenGate Software in September 2009. Today there are more than 500 customers around the world using GoldenGate technology for over 4000 solutions, realizing over $100 million in revenue for Oracle.

 

Oracle GoldenGate solutions


Oracle GoldenGate provides five data replication solutions:

  1. High Availability

    • Live Standby for an immediate fail-over solution that can later re-synchronize with your primary source.

    • Active-Active solutions for continuous availability and transaction load distribution between two or more active systems.

  2. Zero-Downtime Upgrades and Migrations

    • Eliminates downtime for upgrades and migrations.

  3. Live Reporting

    • Feeding a reporting database so as not to burden the source production systems with BI users or tools.

  4. Operational Business Intelligence (BI)

    • Real-time data feeds to operational data stores or data warehouses, directly or via Extract Transform and Load (ETL) tools.

  5. Transactional Data Integration

    • Real-time data feeds to messaging systems for business activity monitoring, business process monitoring, and complex event processing.

    • Uses event-driven architecture and service-oriented architecture (SOA).

The following diagram shows the basic architecture for the various solutions available from GoldenGate software:

We have discovered there are many solutions where GoldenGate can be applied. Now we can dive into how GoldenGate works, the individual processes, and the data flow that is adopted for all.

 

Oracle GoldenGate technology overview


Let's take a look at GoldenGate's fundamental building blocks; the Capture process, Trail files, Data pump, Server collector, and Apply processes. In fact, the order in which the processes are listed depicts the sequence of events for GoldenGate data replication across distributed systems. A Manager process runs on both the source and the target systems that "oversee" the processing and transmission of data.

All the individual processes are modular and can be easily decoupled or combined to provide the best solution to meet the business requirements. It is normal practice to configure multiple Capture and Apply processes to balance the load and enhance performance. You can read more about this in Chapter 9, Performance Tuning.

Filtering and transformation of the data can be done at either the source by the Capture or at the target by the Apply processes. This is achieved through parameter files, and is explained in detail in Chapter 3, Configuring Oracle GoldenGate.

The capture process (Extract)

Oracle GoldenGate's capture process, known as Extract, obtains the necessary data from the databases' transaction logs. For Oracle, these are the online redo logs that contain all the data changes made in the database. GoldenGate does not require access to the source database and only extracts the committed transactions from the online redo logs. It can however, read archived redo logs to extract the data from long running transactions, but more about that later in the book.

The Extract process will regularly checkpoint its read and write position, typically to a file. The checkpoint data insures GoldenGate can recover its processes without data loss in the case of failure.

The Extract process can have one the following statuses:

  • STOPPED

  • STARTING

  • RUNNING

  • ABENDED

The ABENDED status stems back to the Tandem computer, where processes either stop (end normally) or abend (end abnormally). Abend is short for abnormal end.

Trail files

To replicate transactional data efficiently from one database to another, Oracle GoldenGate converts the captured data into a Canonical Format which is written to trail files, both on the source and the target system. The provision of source and target trail files in the GoldenGates architecture eliminates any single point of failure and ensures data integrity is maintained. A dedicated checkpoint process keeps track of the data being written to the trails on both the source and target for fault tolerance.

It is possible to configure GoldenGate not to use trail files on the source system and write data directly from the database's redo logs to the target server data collector. In this case, the Extract process sends data in large blocks across a TCP/IP network to the target system. However, this configuration is not recommended due to the possibility of data loss occurring during unplanned system or network outages. Best practice states, the use of local trail files would provide a history of transactions and support the recovery of data for retransmission via a Data Pump.

Data Pump

When using trail files on the source system, known as a local trail, GoldenGate requires an additional Extract process called Data Pump that sends data in large blocks across a TCP/IP network to the target system. As previously sated, this is best practice and should be adopted for all Extract configurations.

Server Collector

The Server Collector process runs on the target system and accepts data from the source (Extract/Data Pump). Its job is to reassemble the data and write it to a GoldenGate trail file, known as a remote trail.

The Apply process (Replicat)

The Apply process, known in GoldenGate as Replicat, is the final step in the data delivery. It reads the trail file and applies it to the target database in the form of DML (deletes, updates and inserts) or DDL*. (database structural changes). This can be concurrent with the data capture or performed later.

The Replicat process will regularly checkpoint its read and write position, typically to a file. The checkpoint data ensures that GoldenGate can recover its processes without data loss in the case of failure.

The Replicat process can have one of the following statuses:

  • STOPPED

  • STARTING

  • RUNNING

  • ABENDED

* DDL is only supported in unidirectional configurations and non-heterogeneous (Oracle to Oracle) environments.

The Manager process

The Manager process runs on both source and target systems. Its job is to control activities such as starting, monitoring, and restarting processes; allocating data storage; and reporting errors and events. The Manager process must exist in any GoldenGate implementation. However, there can be only one Manager process per Changed Data Capture configuration on the source and target.

The Manager process can have either of the following statuses:

  • STOPPED

  • RUNNING

GGSCI

In addition to the processes previously described, Oracle GoldenGate 10.4 ships with its own command line interface known as GoldenGate Software Command Interface (GGSCI). This tool provides the administrator with a comprehensive set of commands to create, configure, and monitor all GoldenGate processes. You will become very familiar with GGSCI as you continue through this book.

Oracle GoldenGate 10.4 is command-line driven. However, there is a product called Oracle GoldenGate Director that provides a GUI for configuration and management of your GoldenGate environment.

Process data flow

The following diagram illustrates the GoldenGate processes and their dependencies. The arrows largely depict replicated data flow (committed transactions), apart from checkpoint data and configuration data. The Extract and Replicat processes periodically checkpoint to a file for persistence. The parameter file provides the configuration data. As described in the previous paragraphs, two options exist for sending data from source to target; these are shown as broken arrows:

Having discovered all the processes required for GoldenGate to replicate data, let's now dive a little deeper into the architecture and configurations.

 

Oracle GoldenGate architecture


So what makes GoldenGate different from other data replication products? The quick answer is the architecture. GoldenGate can achieve heterogeneous and homogeneous-real-time transactional Change Data Capture and integration by decoupling itself from the database architecture. This in itself provides a performance boost as well as flexibility through its modular components.

A number of system architecture solutions are offered for data replication and synchronization:

  • One-to-one (source to target)

  • One-to-many (one source to many targets)

  • Many to one (hub and spoke)

  • Cascading

  • Bi-directional (active active)

  • Bi-directional (active passive)

No one configuration is better than another. The one you choose is largely dependent on your business requirements.

One-to-One

By far the simplest and most common configuration is the "source to target". Here we are performing real-time or batch change data replication between two sites in a unidirectional fashion. This could be, for example, between a primary and standby site for Disaster Recovery (DR) or an OLTP to data warehouse for Business Intelligence (BI) and OLAP.

One-to-One architecture provides a data replication solution that offers the following key benefits:

  • Live reporting

  • Fastest possible recovery and switchover (when the target is synchronized with the source)

  • Backup site that can be used for reporting

  • Supports DDL replication

Due to its simplicity, the One-to-One architecture is referred to many times in this book to effectively demonstrate:

  • Process configuration

  • Data transformation

  • Troubleshooting techniques

  • Performance tuning tips and tricks

One-to-Many

Another popular GoldenGate configuration is the One-to-Many architecture. This architecture lends itself perfectly to provide two solutions. One data replication feed for reporting and one for backup and DR. The following example helps to illustrate the method.

One-to-Many architecture provides a data replication solution that offers the following key benefits:

  • Dedicated site for Live reporting.

  • Dedicated site for backup data from source database.

  • Fastest possible recovery and switchover, when using a dedicated backup site. It minimizes logical data corruption as the backup database is separate from the read-write OLAP database.

The One-to-Many architecture is very flexible, given that it provides two solutions in one—a reporting and a standby database, both of which can have different table structures.

Many-to-One

The Many-to-One configuration comes into play for peripheral sites updating a central computer system representing a hub and spokes on a wheel. This scenario is common in all industries, from retail outlets taking customer orders to high street bank branches processing customer transactions. Ultimately, the data needs to make it to the central database ASAP and cannot become lost or corrupted. GoldenGate's architecture lends itself perfectly to this scenario, as seen in the next example. Here we have three spoke sites sending data to the central hub site.

One important point to mention here is Conflict Handling. In a "hub and spoke" configuration, with concurrent updates taking place, data conflicts are highly likely to occur. Should the same database table row or field be updated by more than one source, on the target the conflict must be handled by GoldenGate to allow either one of the transactions to succeed or to fail all.

Another "hub and spoke" solution includes the One-to-Many configuration. A typical example being the company head office sending data to its branches. Here, conflict handling is less of an issue.

Cascading

The cascading architecture offers data replication at n sites, originating from a single source. As the data flows from the originating source database, parts or all of it are "dropped off" at each site in a cascading fashion until the final target is populated. In the following example, we have one source (Site A) and three targets (Sites B, C and D). Intermediate Sites B and C have both source and target trails, whereas Site A has only a source and Site D only a target trail.

What data to replicate is configured by using Filters in the GoldenGate parameter files at each target site, making the Cascade architecture one of the most powerful, yet complex configurations. Users at each site input data that can also be replicated to the next site.

Bi-directional (Active-Active)

The following is an example of an active-active configuration, where Site A sends changed data to Site B and vice versa. Again, Conflict Handling is an important consideration. A conflict is likely to occur in a bi-directional environment, where the same row or field is being updated at both sites. When the change is replicated, a conflict occurs. This needs to be resolved by GoldenGate based on the business rules, that is, should data from Site B overwrite Site A, or should both transactions fail?

Bi-directional (active-active) architecture provides a data replication solution that offers the following key benefits:

  • High availability

  • Transaction load distribution

  • Performance scalability

Another key element to include in your configuration is Loop Detection. We do not want data changes going round in a loop, where Site A updates Site B, then Site B updates Site A, and so on.

Do not be put off by the Bi-directional architecture. When configured correctly, this architecture offers the most appropriate solution for global companies and organizations, allowing users in two centers, both sides of the globe to share the same system and data.

The active-active configuration is very different from the active-passive, which we discuss in the following section.

Bi-directional (Active-Passive)

The following is an example of an active-passive configuration, sometimes called "Live Standby", where Site A sends changed data to Site B only. You'll notice that the path from Site B to Site A is "grayed-out", suggesting that the data replication path can be re-enabled at short notice. This means that the GoldenGate processes exist and are configured, but have not been started.

Bi-directional (active-passive) architecture provides a data replication solution that offers the following key benefits:

  • Both sites have database open read-write

  • Fastest possible recovery and switchover

  • Reverse direction data replication ready

The active-passive configuration lends itself to being a DR solution, supporting a backup site should processes fail on the production site.

 

Supported platforms and databases


As this book is Oracle centric, below is a list of certified platforms and Oracle databases that officially support GoldenGate 10.4. The full comprehensive list for all certified platforms and databases is available at the My Oracle Support Website: https://support.oracle.com. (formerly, Metalink).

Follow the steps below to obtain the official Oracle Certification Matrix:

  1. Log on to your My Oracle Support account and click on the Knowledge tab.

  2. In the Search Product box near the top left of the page, type Oracle GoldenGate and click on the magnifying glass icon.

  3. In the next search box type "Certification Matrix" and click on the magnifying glass icon to display the document ID 9762871.

Although GoldenGate supports earlier versions of Oracle, the following table lists the platforms supported by GoldenGate for Oracle 11g:

Supported platforms for Oracle Database 11g

DB

Version

Architecture

OS

Version

Platform

Oracle

11.1

64

AIX

5.3

IBM PowerPC

Oracle

11.1

32

AIX

5.3

IBM PowerPC

Oracle

11.1

64

AIX

6.1

IBM PowerPC

Oracle

11.1 / 11.2

64

HP-UX

11.23

HP Intel IA64

Oracle

11.1 / 11.2

64

HP-UX

11.31

HP Intel IA64

Oracle

11.1 / 11.2

64

RedHat AS

4

AMD/Intel x64

Oracle

11.1 / 11.2

32

RedHat AS

4

Intel x86

Oracle

11.1 / 11.2

64

RedHat AS

5

AMD/Intel x64

Oracle

11.1 / 11.2

64

Solaris

10

Sun SPARC

Oracle

11.1 / 11.2

64

Windows

2003

AMD/Intel x64

Tip

When downloading the GoldenGate software from Oracle Websites, ensure you choose the correct GoldenGate version, supported platform, architecture (32 or 64 bit) and database type and version.

For example, once unzipped and extracted, the following tar file installs GoldenGate 10.4.0.19 for Oracle 11g on Redhat or Oracle Enterprise Linux 32bit:

ggs_redhatAS50_x86_ora11g_32bit_v10.4.0.19_002.tar

The following is a list of certified non-Oracle databases that support Oracle GoldenGate 10.4:

Supported non-Oracle databases

Database

DB Version

IBM DB2 UDB

8.1

IBM DB2 UDB

8.2

IBM DB2 UDB

9.1 / 9.5

Microsoft SQL Server

2000

Microsoft SQL Server

2005

Microsoft SQL Server

2008 Delivery

MySQL

4.1

MySQL

5

Sybase ASE

12.5.4

Sybase ASE

15

Teradata

V2R5

Teradata

V2R6

SQL/MX

2.3

SQL/MP

N/A

TimesTen

7.0.5

Enscribe

N/A

Oracle GoldenGate is ideal for heterogeneous environments by replicating and integrating data across differing vendor systems. Log-based Change Data Capture (CDC) is supported for nearly all major database vendors. GoldenGate can also integrate with JMS-based messaging systems to enable event driven architecture (EDA) and to support service oriented architecture (SOA). Further, integration support includes Extract Transformation and Load (ETL) products for OLAP and Data Warehouse implementations.

 

Oracle Goldengate topology


The Oracle GoldenGate topology is a representation of the databases in a GoldenGate environment, the GoldenGate components configured on each server, and the flow of data between these components.

The flow of data in separate trails is read, written, validated and check-pointed at each stage. GoldenGate is written in C and because it is native to the operating system, can run extremely fast. The sending, receiving, and validation have very little impact on the overall machine performance. Should performance become an issue due to the sheer volumes of data being replicated, you may consider configuring parallel Extract and/or Replicat processes.

Process topology

The following sections describe the process topology. Firstly discussing the rules that you must adhere to when implementing GoldenGate, followed by the order in which the processes must execute for end to end data replication.

The rules

When using parallel Extract and/or Replicat processes, ensure you keep related DDL and DML together in the same process group to ensure data integrity. The topology rules for configuring the processes are as follows:

  • All objects that are relational to an object are processed by the same group as the parent object

  • All DDL and DML for any given database object are processed by the same Extract group and by the same Replicat group

Should a referential constraint exist between tables, the child table with the foreign key must be included in the same Extract and Replicat group as the parent table having the primary key.

If an Extract group writes to multiple trails that are read by different Replicat groups, the Extract process sends all of the DDL to all of the trails. It is therefore necessary to configure each Replicat group to filter the DDL accordingly.

Position

The following tables show the position of each link in the process topology for the two fundamental configuration types:

Change Data Capture and Delivery using a Data Pump

Start Component

End Component

Position

Extract Process

Local Trail File

1

Local Trail File

Data Pump

2

Data Pump

Server Collector

3

Server Collector

Remote Trail File

4

Remote Trail File

Replicat Process

5

Change Data Capture and Delivery without using a Data Pump

Start Component

End Component

Position

Extract Process

Server Collector

1

Server Collector

Remote Trail File

2

Remote Trail File

Replicat Process

3

Statistics

In terms of performance monitoring, the GGSCI tool provides real-time statistics as well as comprehensive reports for each process configured in the GoldenGate topology. In addition to reporting on demand, it is also possible to schedule reports to be run. This can be particularly useful when performance tuning a process for a given load and period.

The INFO ALL command provides a comprehensive overview of process status and lag, whereas the STATS option gives more detail on the number of operations. Both commands offer real-time reporting. This is demonstrated in the following screen shots:

From the screenshot you can see that the STATS command provides daily and hourly cumulative statistics for a given process, including the overall total and the latest real-time figures.

 

Design considerations


The first thing to consider and probably one of the most important steps in any IT project is the design. If you get this wrong, your system will neither perform nor be scalable, and ultimately the project will fail. The next project may be to address all the design issues and start again from scratch! Not ideal.

So how do you design our GoldenGate implementation? Where do you start? What is important in the design? There are obviously lots of questions, so let's try and answer them.

Choosing a solution

You have already seen the different solutions GoldenGate has to offer at the beginning of this chapter. You need to choose the most appropriate architecture based on the business requirements. To do this it is necessary to first understand the requirements of the system and what the system has to achieve. These requirements' are both functional and non-functional. Examples of non-functional requirements are performance and scalability.

To address the functional requirements you need to know:

  • The overall system architecture and all of its components and interfaces. Ask yourself the question "what data do we need to replicate and to where?"

For the non-functional requirements, you need to know:

  • The maximum latency that is acceptable. Again, ask yourself the question "how far behind the source can the target system(s) be?"

These are all important factors when considering a design. In the earlier section "Oracle GoldenGate Topology" in this chapter, we mentioned the use of parallel Extract and Replicate processes to increase data throughput. The number of parallel trails is largely dependent on the hardware footprint. How many CPU cores do I have? How much memory is available? Etc.

Network

Other areas to consider are the Network and Database Schema design. Starting with the Network, this is fundamental to a data replication solution. If you have a slow network, you will not be able to replicate high volumes of data in real-time. Furthermore, should your network be unreliable, you need to consider the cost of retransmission or transmitting a backlog of trail files. Redundant networks are very important too and can help to alleviate this problem. If you can avoid the network outage altogether by routing data over a backup network, it will save a number of problems.

Database schema

Database Schema design is another important consideration. Imagine a schema where every table is related to nearly every other table, and the cascading referential constraints are so complex, that it would be impossible to logically separate groups of related tables for data extract. GoldenGate does provide a solution to this problem by using the @RANGE function. However, this is not ideal. Apart from the complex configuration, GoldenGate has to spend more CPU processing the configuration filters and artificially "splitting" the data into a pre-defined number of trails. A good schema design would be to ensure that logical separation exists between table groups, allowing a simple, effective configuration that performs well. The number of table groups being directly proportional to the number of Extract processes configured.

 

What to Replicate?


Another key decision in any GoldenGate implementation is what data to replicate. There is little point replicating data that doesn't need to be, as this will cause unnecessary additional overhead. Furthermore, if you decide that you need to replicate everything, GoldenGate may not necessarily provide the best solution. Other products such as Oracle 11g Active Data Guard may be more appropriate. The forthcoming paragraphs talk not only about what to replicate but also how to replicate, plus important functional and design considerations.

Object mapping and data selection

The power of GoldenGate comes into its own when you select what data you wish to replicate, by using its inbuilt tools and functions. You may even wish to transform the data before it hits the target. There are numerous options at your disposal, but choosing the right combination is paramount.

The configuration of GoldenGate includes mapping of source objects to target objects. Given the enormity of parameters and functions available, it is easy to over complicate your GoldenGate Extract or Replicat process configuration through redundant operations. Try to keep your configuration as simple as possible, choosing the right parameter, option, or function for the job. Although it is possible to string these together to achieve a powerful solution, this may cause significant additional processing and performance will suffer as a result.

GoldenGate provides the ability to select or filter out data based on a variety of levels and conditions. Typical data mapping and selection parameters are as follows:

  • TABLE/MAP

    • Specifies the source and target objects to replicate. TABLE is used in Extract and MAP in Replicat parameter files.

  • WHERE

    • Similar to the SQL WHERE clause, the WHERE option included with a TABLE or MAP parameter enables basic data filtering.

  • FILTER

    • Provides complex data filtering. The FILTER option can be used with a TABLE or MAP parameter.

  • COLS/COLSEXCEPT

    • The COLS and COLSEXCEPT option allows columns to be mapped or excluded when included with a TABLE or MAP parameter.

Before GoldenGate can extract data from the databases' online redo logs, the relevant data needs to be written to its log files. A number of pre-requisites exist to ensure the changed data can be replicated:

  • Enable supplemental logging.

  • Setting at database level overrides any NOLOGGING operations and ensures all changed data is written to the redo logs.

  • Forces the logging of the full before and after image for updates.

  • Ensure each source table has a primary key.

  • GoldenGate requires a primary key to uniquely identify a row.

  • If the primary key does not exist on the source table, GoldenGate will create its own unique identifier by concatenating all the table columns together. This can be grossly inefficient given the volume of data that needs to be extracted from the redo logs. Ideally, only the primary key plus the changed data (before and after images in the case of an update statement) are required.

It is also advisable to have a primary key defined on your target table(s) to ensure fast lookup when the Replicat recreates and applies the DML statements against the target database. This is particularly important for UPDATE and DELETE operations.

Initial Load

Initial Load is the process of instantiating the objects on the source database, synchronizing the target database objects with the source and providing the starting point for data replication. The process enables "change synchronization" which keeps track of ongoing transactional changes while the load is being applied. This allows users to continue to change data on the source during the Initial Load process.

The Initial Load can be successfully conducted using the following:

  • A database load utility such as import / export or data pump.

  • An Extract process to write data to files in ASCII format. Replicat then applies the files to the target tables.

  • An Extract process to write data to files in ASCII format. SQL*Loader (direct load) can be used to load the data into the target tables.

  • An Extract process that communicates directly with the Replicat process across TCP/IP without using a Collector process or files.

CSN co-ordination

An Oracle database uses the System Change Number (SCN) to keep track of transactions. For every commit, a new SCN is assigned. The data changes including primary key and SCN are written to the databases' online redo logs. Oracle requires these logs for crash recovery, which allows the committed transactions to be recovered (uncommitted transactions are rolled back). GoldenGate leverages this mechanism by reading the online redo logs, extracting the data and storing the SCN as a series of bytes. The Replicat process replays the data in SCN order when applying data changes on the target database. The GoldenGate manuals refer to the SCN as a CSN (Commit Sequence Number).

Trail file format

GoldenGate's Trail files are in Canonical Format. Backed by checkpoint files for persistence, they store the changed data in a hierarchical form including metadata definitions. The GoldenGate software includes a comprehensive utility named Logdump that has a number of commands to search and view the internal file format.

 

Summary


This chapter has provided the foundation for the rest of the book. It covers the key components of GoldenGate including processes, data flow, architecture, topology, configuration, plus performance and design considerations.

We learn that good design reaches far beyond GoldenGate's architecture into the database schema, allowing us to create an efficient and scalable data replication model. We also discussed the importance of Conflict Handling in certain configurations, plus network speed and resilience.

We have now gained a good understanding of what GoldenGate has to offer and are keen to learn more. Some of the available solutions have been discussed inspiring thought for real life implementations. This chapter has also touched upon inter-process dependencies, trail file format and reporting statistics. The subsequent chapters dive a little deeper, giving tangible examples for building enterprise-wide production like environments.

The next chapter starts at the beginning of the GoldenGate implementation, the installation. This includes preparing the environment as well as downloading and unpacking the software.

About the Authors
  • John P Jeffries

    John P Jeffries has worked in many countries around the world, consulting on the Oracle technology. He is interested in different cultures and enjoys teaching others. At present, John lives and works in Singapore and is employed as a subject matter expert at a global bank, delivering enterprise-wide infrastructure design solutions, particularly in the database private cloud and data replication space. Since he wrote his first book, Oracle GoldenGate 11g Implementer's Guide, he has been heavily focused on leveraging Oracle's Fusion Middleware products (including GoldenGate) to address performance, data migration, and data delivery issues. With over 15 years of experience in Oracle, John has worked for a number of global organizations, such as BT, Siebel Systems, Dell, Thomson Reuters, and Oracle Corporation, to name a few. At Oracle, he worked in advanced customer services on key accounts as a senior principal consultant and became the UK data replication expert, often publishing articles on his own website (http://oracle11ggotchas.com/). John is an Oracle Certified Professional and a GoldenGate specialist. He gives presentations at ODTUG conferences and continues to share his real-life hands-on experience through his work. Known for his ability to provide robust, effective solutions and workarounds, John delivers his knowledge, tips, and tricks in his new book, Oracle GoldenGate 12c Implementer's Guide.

    Browse publications by this author
  • John P Jeffries

    John P Jeffries has worked in many countries around the world, consulting on the Oracle technology. He is interested in different cultures and enjoys teaching others. At present, John lives and works in Singapore and is employed as a subject matter expert at a global bank, delivering enterprise-wide infrastructure design solutions, particularly in the database private cloud and data replication space. Since he wrote his first book, Oracle GoldenGate 11g Implementer's Guide, he has been heavily focused on leveraging Oracle's Fusion Middleware products (including GoldenGate) to address performance, data migration, and data delivery issues. With over 15 years of experience in Oracle, John has worked for a number of global organizations, such as BT, Siebel Systems, Dell, Thomson Reuters, and Oracle Corporation, to name a few. At Oracle, he worked in advanced customer services on key accounts as a senior principal consultant and became the UK data replication expert, often publishing articles on his own website (http://oracle11ggotchas.com/). John is an Oracle Certified Professional and a GoldenGate specialist. He gives presentations at ODTUG conferences and continues to share his real-life hands-on experience through his work. Known for his ability to provide robust, effective solutions and workarounds, John delivers his knowledge, tips, and tricks in his new book, Oracle GoldenGate 12c Implementer's Guide.

    Browse publications by this author
Oracle GoldenGate 11g Implementer's guide
Unlock this book and the full library FREE for 7 days
Start now