Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Oracle Data Guard 11gR2 Administration : Beginner's Guide

You're reading from  Oracle Data Guard 11gR2 Administration : Beginner's Guide

Product type Book
Published in Jun 2013
Publisher Packt
ISBN-13 9781849687904
Pages 404 pages
Edition 1st Edition
Languages

Table of Contents (19) Chapters

Oracle Data Guard 11gR2 Administration Beginner's Guide
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Pop Quiz Answers
Getting Started Configuring the Oracle Data Guard Physical Standby Database Configuring Oracle Data Guard Logical Standby Database Oracle Data Guard Broker Data Guard Protection Modes Data Guard Role Transitions Active Data Guard, Snapshot Standby, and Advanced Techniques Integrating Data Guard with the Complete Oracle Environment Data Guard Configuration Patching Common Data Guard Issues Data Guard Best Practices Index

Chapter 10. Common Data Guard Issues

Data Guard administrators need to know methods to resolve some specific issues. These issues may originate from configuration changes, misconfiguration, or user errors. Another important point is the use of diagnostic data to identify these issues. Now we'll cover handling the most common of these Data Guard issues and the methods to access and use diagnostic data.

In this chapter, we will discuss the following topics:

  • Recreating the standby control file

  • Dealing with redo transport authentication problems

  • Dealing with UNNAMED datafiles

  • Closing a gap with RMAN incremental backups

  • Fixing NOLOGGING changes in a standby database

  • Turning on Data Guard tracing

  • Gathering diagnostic data

Let's start with renewing the standby control file of a standby database.

Recreating the standby control file


A standby control file essentially keeps the same information of the primary database with the control file, which is the physical structure of the database. It also contains some specific information about the Data Guard, such as whether an archive log sequence is applied or not. A standby control file is mandatory to mount a physical standby database, and we should consider keeping multiple copies of the standby control file, preferably on different disks, which is known as multiplexing.

In some cases, we may want to renew a standby control file by using a newly created one on the primary database. For example, before a switchover it's a good practice to renew the standby control file in order to guarantee that all of the redo, temp file structure, and historical archived log data are the same. In general, this is a three-step operation:

  1. Create a copy of the standby control file from the primary database.

  2. Transfer this standby control file from the primary...

Time for action – recreating the standby control file


This action shows how to renew the standby control file in a Data Guard environment with OMF.

  1. In the primary database, create a backup of the standby control file with the following RMAN statements:

    $rman target /
    Recovery Manager: Release 11.2.0.1.0 - Production on Wed Dec 19 22:18:05 2012
    Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
    connected to target database: ORCL (DBID=1319333016)
    
    RMAN> BACKUP CURRENT CONTROLFILE FOR STANDBY FORMAT 'standbyctl.bkp';
    Starting backup at 19-DEC-12
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=149 device type=DISK
    channel ORA_DISK_1: starting full datafile backup set
    channel ORA_DISK_1: specifying datafile(s) in backup set
    including standby control file in backup set
    channel ORA_DISK_1: starting piece 1 at 19-DEC-12
    channel ORA_DISK_1: finished piece 1 at 19-DEC-12
    piece handle=/u01/app/oracle2...

Dealing with redo transport authentication problems


By default, the SYS user is used for redo transport in Data Guard configurations. Data Guard communication uses password files in the standby databases to authenticate redo transport sessions. If we change the password of the SYS user in the primary database, redo transport sessions will not be authenticated because the password file in the standby site will be outdated. So redo transport will raise the ORA-01017: invalid username/password or ORA-01031: insufficient privileges error. The primary database alert logfile will include the following lines:

Error 1017 received logging on to the standby
PING[ARC0]: Heartbeat failed to connect to standby 'INDIAPS'. Error is 16191.

It can also include the following lines:

ORA-01031: insufficient privileges
PING[ARC2]: Heartbeat failed to connect to standby 'INDIAPS'. Error is 1031.

Time for action – changing the SYS password in a Data Guard environment


The way to change the SYS password without breaking the redo transport service includes copying the primary database's password file to the standby server after changing the password. The following steps show how this can be done:

  1. Stop redo transport from the primary database to the standby database. We can execute the DEFER command to defer the log destination with the ALTER SYSTEM statement:

    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = 'DEFER';
    
    System altered.
    

    If the Data Guard broker is being used, we can use the following statement:

    DGMGRL> EDIT DATABASE TURKEY_UN SET STATE = 'LOG-TRANSPORT-OFF';
    
  2. Change the SYS user's password in the primary database:

    SQL> ALTER USER SYS IDENTIFIED BY newpassword;
    
    User altered.
    
  3. Copy the primary database's password file to the standby site:

    $ cd $ORACLE_HOME/dbs
    $ scp orapwTURKEY standbyhost:/u01/app/oracle/product/11.2.0/ dbhome_1/dbs/orapwINDIAPS
    
  4. Try logging into the...

Time for action – changing the redo transport user


Follow these steps to change the redo transport user in the Data Guard configuration:

  1. Create a new database, which will be used for redo transport in the primary database. Grant the SYSOPER privileges to this user and ensure that the standby database has applied these changes:

    SQL> CREATE USER DGUSER IDENTIFIED BY SOMEPASSWORD;
    SQL> GRANT SYSOPER to DGUSER;
    

    Tip

    Don't forget that if the password expires periodically for this user, this will pose a problem in Data Guard redo transport. So ensure that the default profile does not include the PASSWORD_LIFE_TIME and PASSWORD_GRACE_TIME settings. If it does, choose another profile for this user.

  2. Stop the redo transport from the primary database to the standby databases. We can execute the DEFER command to defer the log destination with the ALTER SYSTEM statement:

    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = 'DEFER';
    
  3. Change the redo transport user by setting the REDO_TRANSPORT_ USER...

Dealing with UNNAMED datafiles


There are some reasons for a file being created as UNNAMED in the standby database, including insufficient disk space on the standby site, non-privileged directory structure on standby database, or improper parameter settings related to file management.

The STANDBY_FILE_MANAGEMENT parameter enables or disables automatic standby file management in Data Guard. When automatic standby file management is enabled, file additions and deletions in the primary database are replicated to the standby database.

For example, we add a datafile in the primary database when the STANDBY_FILE_MANAGEMENT parameter on the standby database is set to MANUAL. Due to this parameter setting, it will create an UNNAMED file under $ORACLE_HOME/dbs, and this will cause the MRP process to be killed. Errors in the alert log file will be as follows:

Errors in file /u01/app/oracle2/diag/rdbms/india_ps/INDIAPS/trace/INDIAPS_pr00_691.trc:
ORA-01111: name for data file 10 is unknown - rename to...

Time for action – resolving UNNAMED datafile errors


Now we'll see how to resolve an UNNAMED datafile issue in a Data Guard configuration:

  1. Check for the datafile number that needs to be recovered from the standby database:

    SQL> SELECT * FROM V$RECOVER_FILE WHERE ERROR LIKE '%MISSING%';
    
         FILE# ONLINE  ONLINE_ ERROR                   CHANGE# TIME
    ---------- ------- ------- ----------------- ---------- ----------
           10  ONLINE  ONLINE  FILE MISSING                  0
    
  2. Identify datafile 10 in the primary database:

    SQL> SELECT FILE#,NAME FROM V$DATAFILE WHERE FILE#=10;
    
         FILE# NAME
    ---------- -----------------------------------------------
           536 /u01/app/oracle2/datafile/ORCL/users03.dbf
    
  3. Identify the dummy filename created in the standby database:

    SQL> SELECT FILE#,NAME FROM V$DATAFILE WHERE FILE#=10;
    
         FILE# NAME
    ---------- -------------------------------------------------------
           536 /u01/app/oracle2/product/11.2.0/dbhome_1/dbs/UNNAMED00010
    
  4. If the reason for...

Closing a gap with an RMAN incremental backup


When a standby database falls behind the primary database in time because of any interruption in redo transport or apply, the database can be synchronized again by applying the archived logs produced in the no-synchronization period. However, even if one of the necessary archived logfiles is not accessible, there is nothing to do for Data Guard to close the gap.

In such a case, we have to restore these archived logfiles from backups if they exist. If not, the only way to close the gap is by using an RMAN incremental backup taken from the primary database, especially to close the gap in question. We use the BACKUP INCREMENTAL FROM SCN RMAN statement for this special-purpose backup.

Time for action – closing a gap with an RMAN incremental backup


Let's see all the required steps to practice this recovery operation:

  1. In this practice, assume that there are missing archived logs (gap) in the standby database, and we're not able to restore these archived logs. We'll synchronize Data Guard using the RMAN incremental backup. To represent this situation, execute the DEFER command to defer the log destination in the primary database, and execute the following operation that will generate redo in the primary database:

    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = 'DEFER';
    
  2. Now we have a standby database behind the primary database, and we'll use RMAN to reflect the primary database's changes to the standby database. Stop Redo Apply in the standby database:

    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    
  3. Query the current system change number (SCN) of the standby database that will be used as the limit for an incremental backup of the primary database. Run...

Fixing NOLOGGING changes on the standby database


It's possible to limit redo generation for specific operations on Oracle databases, which provide higher performance. These operations include bulk inserts, creation of tables as select operations, and index creations. When we work using the NOLOGGING clause, redo will not include all the changes to data on the related segments. This means if we perform a restore/recovery of the related datafile, or of the whole database after the NOLOGGING operations, it'll not be possible to recover the data created with the NOLOGGING option.

The same problem exists with Data Guard. When the NOLOGGING operation is executed in the primary database, Data Guard is not able to reflect all the data changes in the standby database. In this case, when we activate a standby database or open it in the read-only mode, we'll see the following error messages:

ORA-01578: ORACLE data block corrupted (file # 1, block # 2521)
ORA-01110: data file 1: '/u01/app/oracle2/datafile...

Time for action – fixing NOLOGGING changes on a standby database with incremental datafile backups


As a prerequisite for this exercise, first put the primary database in the no-force logging mode using the ALTER DATABASE NO FORCE LOGGING statement. Then perform some DML operations in the primary database using the NOLOGGING clause so that we can fix the issue in the standby database with the following steps:

  1. Run the following query to identify the datafiles that are affected by NOLOGGING changes:

    SQL> SELECT FILE#, FIRST_NONLOGGED_SCN FROM V$DATAFILE WHERE FIRST_NONLOGGED_SCN > 0;
    
    FILE#      FIRST_NONLOGGED_SCN
    ---------- -------------------
             4            20606544
    
  2. First we need to put the affected datafiles in the OFFLINE state in the standby database. For this purpose, stop Redo Apply in the standby database, execute the ALTER DATABASE DATAFILE ... OFFLINE statement, and start Redo Apply again:

    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    SQL> ALTER DATABASE...

Time for action – fixing NOLOGGING changes in the standby database with incremental database backups


  1. Determine the SCN that we'll use in the RMAN incremental database backup by querying the minimum FIRST_NONLOGGED_SCN column of the V$DATAFILE view in the standby database:

    SQL> SELECT MIN(FIRST_NONLOGGED_SCN) FROM V$DATAFILE WHERE FIRST_NONLOGGED_SCN>0;
    
    MIN(FIRST_NONLOGGED_SCN)
    ------------------------
                    20606544
    
  2. Stop Redo Apply on the standby database:

    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    
  3. Now we'll take an incremental backup of the database using the FROM SCN keyword. The SCN value will be the output of the execution of the query in the first step. Connect to the primary database as the RMAN target and execute the following RMAN BACKUP statement:

    RMAN> BACKUP INCREMENTAL FROM SCN 20606344 DATABASE FORMAT '/data/DB_Inc_%U' TAG 'FOR STANDBY';
    
  4. Copy the backup files from the primary site to the standby site with FTP or SCP:

    scp /data/DB_Inc_...

Turning on Data Guard tracing


When database administrators work on a Data Guard problem or plan an important Data Guard operation such as role transition, they generally prefer to gather comprehensive trace information about the activity of Data Guard-related processes. For this purpose, Oracle offers the LOG_ARCHIVE_TRACE parameter. By setting this parameter to an appropriate value, it's possible to have detailed information about log archiving, redo transport, and Redo Apply activities.

The default value of this initialization parameter is 0, which means the additional tracing feature is off, and Oracle will continue generating its default alert and trace entries related to error conditions. It's possible to change the value of this parameter in the primary and/or standby databases online using the ALTER SYSTEM statement. For example, look at the following statement:

SQL> ALTER SYSTEM SET LOG_ARCHIVE_TRACE=15; 

In the Real Application Cluster database it's possible to set different tracing...

Gathering diagnostic data


We need to access diagnostic data about the Data Guard configuration, especially when there's a problem in the Redo Apply or redo transport services. After the first diagnosis of the problem, it's possible to decide whether to search for detailed information in the primary database or in the standby database. If the issue is about sending redo, it's more likely that the necessary information can be found on the primary site. However, if it's about Redo Apply, it's better to search for detailed information on the standby site.

No matter where we search for diagnostic data, we need to know where to search for the related logfiles and how to query diagnostic data in the database. The most commonly referenced files in a Data Guard issue are the primary and standby alert log files. If the Data Guard broker is used in the configuration, Data Guard Monitor (DMON) logfiles can also be helpful for troubleshooting. If necessary, we can also query Data Guard-related dynamic...

Time for action – monitoring the database alert log using ADRCI


Let's see an example of monitoring the database alert log using the ADRCI utility:

  1. Ensure that the ORACLE_HOME and PATH environment variables are set properly. The PATH environment variable must include the ORACLE_HOME/bin directory.

    export ORACLE_HOME=/u01/app/oracle2/product/11.2.0/dbhome_1
    export PATH=$PATH:$ORACLE_HOME/bin
    
  2. Start the ADRCI command-line tool:

    $ adrci
    
    ADRCI: Release 11.2.0.1.0 - Production on Thu Dec 20 02:06:49 2012
    
    Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
    
    ADR base = "/u01/app/oracle2"
    
  3. We can run the HELP command to get help on the usage of this utility:

    adrci> HELP
    HELP [topic]
       Available Topics:
            CREATE REPORT
            ECHO
            EXIT
            HELP
            HOST
            IPS
            PURGE
            RUN
            SET BASE
            SET BROWSER
            SET CONTROL
            SET ECHO
            SET EDITOR
            SET HOMES | HOME | HOMEPATH
            SET TERMOUT
            SHOW...

Summary


In this chapter, we have covered common Data Guard issues using diagnostic data in a Data Guard environment. As Data Guard administrators, we have to identify the underlying reason of a Data Guard issue using this diagnostic data, and resolve the issue in the correct way. We think that the information and examples that we've seen in this chapter will be helpful for this purpose. The next chapter is about Data Guard best practices.

lock icon The rest of the chapter is locked
You have been reading a chapter from
Oracle Data Guard 11gR2 Administration : Beginner's Guide
Published in: Jun 2013 Publisher: Packt ISBN-13: 9781849687904
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}