My Oracle Blog: 2014

Friday, April 18, 2014

Enable Fast Start Failover using DGMGRL Oracle Data Guard Broker

Enable FSFO and starting the Observer

An Observer can be any system with access to the databases in DG configuration and at least Oracle Client installed.

1. Ensure that SRL's are configured on both the PRIMARY DB and

the STANDBY DB's

2. Ensure that the LogXptMode is set to SYNC

3. Set the FastStartFailoverTarget property of PRIMARY to desired

STANDBY DB.

edit database orcl1 set property

'FastStartFailoverTarget'='orcl2';

4. Upgrade protection Mode to MaxAvailability if necessary.

edit configuration set protection mode as MAxAvailability;

5. Enable Flashback on Primary and Standby DB's

Also set DB_FLASHBACK_RETENTION_TARGET to appropriate value.

This parameter retains the flashback logs such that the database can be

flashed back to the specified interval of time.

6. Start Observer

Connect to standby DB using DGMGRL and issue:

start observer;

7. Enable Fast start Failover

enable FAST_START failover;

This statement creates FSFP process

2014-03-12 02:51:34.451 DMON: Creating process FSFP

2014-03-12 02:51:37.549 FSFP: Process started

2014-03-12 02:51:38.550 DMON: FSFP successfully started

DGMGRL> show configuration

Configuration - dg_config

Protection Mode: MaxAvailability

Databases:

orcl1 - Primary database

orcl2 - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:

SUCCESS

Oracle Data Guard Broker Configuration management

Configuration management

1. Change parameter, database property using DGMGRL

We will change the property 'StandbyFileManagement' for database ORCL2 to AUTO from MANUAL which is default setting for the parameter STANDBY_FILE_MANAGEMENT.

edit database orcl2 set property 'StandbyFileManagement'='AUTO';

DGMGRL> show database verbose orcl2

Database - orcl2

Role: PHYSICAL STANDBY

Intended State: APPLY-ON

Transport Lag: 0 seconds

Apply Lag: 0 seconds

Real Time Query: OFF

Instance(s):

orcl2

Properties:

DGConnectIdentifier = 'orcl2'

ObserverConnectIdentifier = ''

LogXptMode = 'ASYNC'

DelayMins = '0'

Binding = 'OPTIONAL'

MaxFailure = '0'

MaxConnections = '1'

ReopenSecs = '300'

NetTimeout = '30'

RedoCompression = 'DISABLE'

LogShipping = 'ON'

PreferredApplyInstance = ''

ApplyInstanceTimeout = '0'

ApplyParallel = 'AUTO'

StandbyFileManagement = 'AUTO'

ArchiveLagTarget = '0'

LogArchiveMaxProcesses = '4'

LogArchiveMinSucceedDest = '1'

DbFileNameConvert = 'orcl1, orcl2'

LogFileNameConvert = 'orcl1, orcl2'

FastStartFailoverTarget = ''

StatusReport = '(monitor)'

InconsistentProperties = '(monitor)'

InconsistentLogXptProps = '(monitor)'

SendQEntries = '(monitor)'

LogXptStatus = '(monitor)'

RecvQEntries = '(monitor)'

HostName = 'dg3'

SidName = 'orcl2'

StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg3)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=orcl2_DGMGRL)(INSTANCE_NAME=orcl2)(SERVER=DEDICATED)))'

StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'

AlternateLocation = ''

LogArchiveTrace = '0'

LogArchiveFormat = '%t_%s_%r.dbf'

TopWaitEvents = '(monitor)'

Database Status:

SUCCESS

We can change the ‘LogXptMode’ property as below. This will update the LOG_ARCHIVE_DEST_n parameter appropriately.

edit database [orcl1|orcl2] set property logxptmode=SYNC;

2. Change the broker configuration file location

The location of the broker configuration files is determined by the parameters DG_BROKER_CONFIG_FILEn (where n in {1,2}).

Default location is $ORACLE_HOME/dbs

Default names of configuration files are: dr{1|2}<db_unique_name>

CHANGING CONFIG FILE LOCATION:

The location of the configuration file can only be changed when DMON is not running.

· set DG_BROKER_START=FALSE

· alter system set DG_BROKER_CONFIG_FILEn=new loc;

· manually move files at OS level to new location

IF old/new location is ASM, use DBMS_FILE_TRANSFER.COPY_FILE

· start DMON process

3. Change DB STATEs

Turn off redo apply,

edit database orcl2 set STATE='APPLY-OFF';

Activate active DG aka REAL TIME QUERY,

edit database orcl2 set STATE='APPLY-OFF';

SQL> alter database open;

edit database orcl2 set STATE='APPLY-ON';

DGMGRL> show database orcl2

Database - orcl2

Role: PHYSICAL STANDBY

Intended State: APPLY-ON

Transport Lag: 0 seconds

Apply Lag: 0 seconds

Real Time Query: ON

Instance(s):

orcl2

Database Status:

SUCCESS

4. Change protection mode

a) Configure the SRL if necessary

b) Set LogXptMode aptly

edit database orcl1 set property 'LogXptMode'='SYNC';

edit database orcl2 set property 'LogXptMode'='SYNC';

c) now change the mode for the ENTIRE CONFIGURATION

edit configuration set protection mode as MaxAvailability;

Effects:

In the STANDBY alert log:

RFS[6]: Assigned to RFS process 7356

RFS[6]: Identified database type as 'physical standby': Client is LGWR SYNC pid 4753

Primary database is in MAXIMUM AVAILABILITY mode

Changing standby controlfile to MAXIMUM AVAILABILITY mode

Standby controlfile consistent with primary

RFS[6]: Selected log 4 for thread 1 sequence 65 dbid 1224670660 branch 840221191

Recovery of Online Redo Log: Thread 1 Group 4 Seq 65 Reading mem 0

Mem# 0: /u01/app/oracle/oradata/orcl2/srl04.log

Wed Mar 12 01:05:58 2014

RFS[7]: Assigned to RFS process 7366

RFS[7]: Identified database type as 'physical standby': Client is ARCH pid 4817

On the PRIMARY Site

SQL> select dest_id, dest_name, status, affirm, transmit_mode, process from v$archive_dest where status != 'INACTIVE';

DEST_ID DEST_NAME STATUS AFF TRANSMIT_MOD PROCESS

---------- ------------------------------ --------- --- ------------ ----------

1 LOG_ARCHIVE_DEST_1 VALID NO SYNCHRONOUS ARCH

2 LOG_ARCHIVE_DEST_2 VALID YES PARALLELSYNC LGWR

On STANDBY site :

SQL> select dest_id, dest_name, status, affirm, transmit_mode, process from v$archive_dest where status != 'INACTIVE';

DEST_ID DEST_NAME STATUS AFF TRANSMIT_MOD PROCESS

---------- ------------------------------ --------- --- ------------ ----------

1 LOG_ARCHIVE_DEST_1 VALID NO SYNCHRONOUS ARCH

2 LOG_ARCHIVE_DEST_2 VALID NO ASYNCHRONOUS LGWR

32 STANDBY_ARCHIVE_DEST VALID YES SYNCHRONOUS RFS

How to install Data Guard Broker

Installing DG Broker

1. On both PRIMARY db and STANDBY db's set DG_BROKER_START=TRUE

2. Add static service DGMGRL entry on both DB's in the LISTENER.ora file

The static service entry name is of the following format:

<DB_UNIQUE_NAME>_DGMGRL.<DB_DOMAIN>

SID_LIST_LISTENER =

(SID_LIST =

(SID_DESC =

(GLOBAL_DBNAME = ORCL1_DGMGRL)

(ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1)

(SID_NAME = ORCL1)

)

SID_LIST_LISTENER =

(SID_LIST =

(SID_DESC =

(GLOBAL_DBNAME = ORCL2_DGMGRL)

(ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1)

(SID_NAME = ORCL2)

)

3. CREATE CONFIGURATION

create configuration dg_config as

primary database is 'orcl1'

connect identifier is 'orcl1';

This MUST be done from DGMGRL utility while connected to the primary database.

If you try to create configuration while connected to standby database, the following error occurs:

Error: ORA-16584: operation cannot be performed on a standby database

4. Add the standby DB to configuration

ADD database 'orcl2' as

connect identifier is 'orcl2';

adds the standby db.

5. Show configuration

DGMGRL> show configuration

Configuration - dg_config

Protection Mode: MaxPerformance

Databases:

orcl1 - Primary database

orcl2 - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:

DISABLED

6. Enable configuration

DGMGRL> show configuration

From here onwards, the DG BROKER takes over the management of the data guard environment.

This is evident from the following message in the alert log itself.

ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH;

Tue Mar 11 23:20:31 2014

DMON started with pid=27, OS id=5400

Starting Data Guard Broker (DMON)

Tue Mar 11 23:20:42 2014

INSV started with pid=28, OS id=5409

Tue Mar 11 23:33:23 2014

NSV0 started with pid=30, OS id=5606

Tue Mar 11 23:33:27 2014

RSM0 started with pid=31, OS id=5615

ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='orcl2';

ALTER SYSTEM SET log_archive_format='%t_%s_%r.dbf' SCOPE=SPFILE SID='orcl2';

ALTER SYSTEM SET standby_file_management='MANUAL' SCOPE=BOTH SID='*';

ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';

ALTER SYSTEM SET db_file_name_convert='orcl1','orcl2' SCOPE=SPFILE;

ALTER SYSTEM SET log_file_name_convert='orcl1','orcl2' SCOPE=SPFILE;

ALTER SYSTEM SET fal_server='orcl1' SCOPE=BOTH;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE

Attempt to start background Managed Standby Recovery process (orcl2)

Tue Mar 11 23:33:32 2014

MRP0 started with pid=32, OS id=5620

MRP0: Background Managed Standby Recovery process started (orcl2)

started logmerger process

Tue Mar 11 23:33:37 2014

Managed Standby Recovery starting Real Time Apply

Parallel Media Recovery started with 2 slaves

Waiting for all non-current ORLs to be archived...

All non-current ORLs have been archived.

Media Recovery Waiting for thread 1 sequence 58 (in transit)

Recovery of Online Redo Log: Thread 1 Group 6 Seq 58 Reading mem 0

Mem# 0: /u01/app/oracle/oradata/orcl2/srl06.log

Tue Mar 11 23:33:38 2014

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE

Data Guard Broker - Basics

DG BROKER is the management framework for data guard, and allows the DBA’s to automate many of the tasks involved in configuring and monitoring an Oracle Data Guard configuration.

DG Broker provides two interfaces:

1. DGMGRL utility – this is a command line interface to the DG Broker

2. OEM GUI – OEM provides a graphical interface to the DG Broker

Some features such as standby database creation, can be automated and are only available through the GUI interface. If you use the DGMGRL utility and want to create a standby atabase, you must do so manually.

Benefits of DG Broker:

· Disaster protection

· High availability and scalability with RAC

· Automated creation of DG configuration

· Easy creation of additional standby DB’s (through OEM)

· Simplified and centralized management

· Simplified switchover and failover

· Transparent to applications

DG broker handles the following logical entities:

· Configuration of databases

A DG Broker configuration consists of following entities:

- Configuration Object: it is a named collection of database profiles. A database profile in turn is a description of a database object including its current state, status and properties. A configuration object profiles one primary database and its standby databases.

- Database objects: these correspond to the primary and standby databases. Broker uses the database object’s profile to manage and control the state of a single database on the system.

- Instance objects: Broker treats databases as a collection of one or more named instances. Broker automatically discovers instances and associates them with their databases.

· A single database

Data Guard Monitor and DG Configuration files

The configuration, control and monitoring functions of the Broker are implemented by server-side software and configuration files.

This software is known as Data Guard Monitor.

When the Broker is started, the DMON (data guard monitor) background process runs for every database instance that is managed by the broker. DMON process interacts with the local instance and the DMON processes of all other instances in the broker configuration to perform the requested operation. It is also responsible for monitoring the health of broker configuration and for ensuring that every database has a consistent description of the configuration.

This persistent description of configuration is stored in the binary configuration file. A copy of this file is maintained by the DMON process for each database that belongs to the broker configuration. Configuration file describes the states and properties of the databases in configuration.

Using SPFile is a must when using DG Broker to ensure that the Broker can update the values of parameters in both the database (parameter file) and the configuration file.

Broker Processes

DMON – Data Guard Monitor process - This is the main process, responsible for all broker actions, as well as maintaining the configuration files. This process can be enabled or disabled using DG_BROKER_START parameter.

RSM – Broker Resource Manager – This process is responsible for handling SQL commands used by broker that need to be executed on one of the databases in the configuration. These commands need to be run as a result of change in configuration made through the DGMGRL, or the configuration commands run by broker during database startup.

NSVn – Data Guard Net Services – These processes are responsible for making contact with remote databases and sending across any work items to remote databases.

DRCn – Data Guard Network Receiver – The network receiver processes establish connection from source database NSVn process. This is similar to the LNS and RFS mechanism for redo transport.

Thursday, April 17, 2014

Oracle wait events

There are various events that might induce waiting time for a database operation.

These events are known are wait events.

These wait events are classified into various wait classes based on the type of operation they affect.

The wait events can be observed using the following views:

v$SESSION_EVENT

v$SESSION_WAIT

v$SYSTEM_EVENT

v$EVENT_NAME

Following are descriptions of some important wait events:

db sequential file reads

This wait event occurs when Oracle process wants to read a block that is not currently in the Buffer cache, and it is waiting to read the block into SGA from disk.

A sequential read is generally a single-block read. Sequential read reads data into contiguous memory in SGA.

Causes:

- use of unselective index

- fragmented index

- High I/O on particular disk

- bad application design

Action:

Faster data block identification, faster disk to memory transfer, caching of data in memory will decrease the occurrence and intensity of this wait event.

Attention needs to be paid to following aspects:

- check that right index is being used to access the table

- check the column order of index with the WHERE clause in SQL

- use partitioning to reduce the number of blocks visited

- statistics should be up to date

- relocate HOT datafiles

- consider using KEEP pool to cache frequently accessed indexes/table

- check the execution plans to ascertain the access paths, join methods

db file scattered reads

This wait happens when a session is waiting for multi-block IO to complete.

A scattered read reads multiple blocks and scatters them into different buffers in SGA.

Typically happens during Full Table Scans and Index Fast Full Scans.

The number of consecutive blocks read at one time is determined by parameter,

DB_FILE_MULTIBLOCK_READ_COUNT.

Causes:

- Full Table Scans

- Index Fast Full Scans

Actions:

- If an application has been running fine for some time, and there has been no code

change, and this wait event starts to show up, check if one or more indexes have

been dropped

- optimize multiblock read count by setting appropriate

DB_FILE_MULTIBLOCK_READ_COUNT

- Consider caching frequently accessed indexes/tables in KEEP pool

- use partitioning to reduce the number of blocks visited

- optimize SQL with the goal to reduce the number of physical and logical reads

- check if an Index range scan or unique scan can be done instead of FTS or IFFS

- make sure statistics are up to date

buffer busy waits

This wait happens when a sessions wants to access a block in Buffer cache but cant because the block is busy.

A buffer must be PINNED before it can be read/modified. Only one process can pin a buffer at one time.

This event represents read/read, read/write and write/write contention.

The reason code is present in the P3 column of v$SESSION_WAIT.

Causes:

- another session is reading the block into the buffer

- another session holds the block in an incompatible mode to our request

It can get intensified if

- the block size is large, thus block contains a high number of rows

- lot of sessions are asking for the same blocks

Actions:

Main method to reduce the buffer busy waits is to reduce the total IO on system

Data Blocks

- eliminate hot blocks from application (probably by providing redundancy)

- check for repeatedly scanned, unselective indexes

- try rebuilding object with higher PCTFREE so as to reduce the

number of rows per block

- use automatic segment space management

Segment Header

- increase the number of FREELISTs

- use automatic segment space management

Undo Blocks

- increase number of rollback segments

- use automatic undo management

log file parallel write

This wait is the LGWR wait that occurs when the OS writes contents of the redo buffer to the online log files.

LGWR does a group write for multiple commits/rollbacks, hence, parallel write.

Causes:

- large amount of redo being generated

- slow IO subsystem

Actions:

- reduce amount of redo being generated

- don't leave tablespaces in HOT backup mode longer than necessary

- use faster disks for redo log files

- ensure that disks holding the archived logs and those holding online redo logs are

separate to reduce LGWR, ARCn contention

- consider using NOLOGGING option

log file sync

This wait is logged against the oracle foreground process that is waiting for COMMIT/ROLLBACK to complete

Actions:

- reduce number of commits by batching transactions, so that there are fewer

COMMIT operations

free buffer waits

This wait happens when we are waiting for free buffer in the buffer cache but there are none as there are too many dirty buffers

Causes:

- either buffer cache is too small or DBWn is too slow in flushing bocks to disk

- DBWn is unable to keep up to the write requests

- Checkpoints are happening too fast. This in turn could be due to

- high database activity

- small Online Redo Log file size

- Large sorts and FTS's are filling the buffer cache faster than

DBWn is able to write to disk

- if number of blocks that need to be written to disk is greater than

number of blocks that DBWn can write in one batch

Actions:

- reduce checkpoint frequency, increase size of redo logfiles

- check the V$DB_CACHE_ADVICE to optimally size the DB buffer

- ensure hotspots don't exists by spreading datafiles over disks and disk controllers

- pre-sorting, reorganization of data can help

enqueue waits

This wait events indicates a wait for a lock that in held by other session/s in a mode incompatible to the requested mode.

There are over 60 types of enqueue waits. Can be viewed through V$ENQUEUE_STAT.

Major among them are:

TX - transaction lock

This indicates contention for row level lock. It is related to buffer busy waits

Causes:

- when a transaction tries to update/delete rows that are currently locked by

another session

- usually an application issue (bad application design)

Actions:

- Since this is a blocking session wait, the action is to find the blocking session/s

and take appropriate action

TM - DML enqueue lock

This is related to transaction management.

Causes:

- unindexed foreign key columns

Actions:

- index the foreign key columns

LATCH:

Latches are lightweight mechanisms that are used to serialize access to memory structures.

It is not possible to queue for latch - a process either gets the latch or does not.

Various types of latches can be seen through v$LATCH view.

Latches apply only to memory structures in SGA. They do not apply to database objects.

The SGA has many latches and they exist to protect various memory structures from potential corruption by concurrent access.

Memory structures protected by latches include things like buffer cache, java pool and library cache.

A latch is requested by process in 2 modes:

Immediate mode - A process asks for latch, if it fails to acquire, the control is returned to the process

Willing to wait mode - A process asks for a latch,

if it fails to acquire,

it keeps requesting it for specified number of

times (spinning),

if it still fails to acquire,

it sleeps, then wakes up after certain specified

time and begins the whole process again.

cache buffer chain latch

This latch is acquired when searching for data blocks.

Buffer cache is a chain of blocks and each chain is protected by a latch while it is being scanned.

Causes:

- when multiple sessions repeatedly access one or more blocks that are protected by

the same child 'cache buffer chain' latch

- sql statements with high buffer gets (logical reads) per execution

Actions:

- reduce logical IO rates by tuning and minimizing IO requirements of involved SQL

- reducing number of rows per block

- reducing block size

cache buffer LRU chain latch

This latch is acquired to introduce a new block into the buffer cache based on LRU block replacement policy.

Causes:

- intense buffer cache activity caused by inefficient SQL

- SQL scanning large unselective indexes, performing FTS

Actions:

- implement multiple buffer pools

- increase buffer cache size

Direct path reads and direct path writes wait events are related to operations that are performed in PGA like sorting, group by operation, hash join.

direct path reads

Direct path reads are used by oracle when reading directly into PGA bypassing SGA.

Direct path reads are common in DSS or during heavy batch activity.

Causes:

- sorting operations, direct path reads usually occur in connection with

temporary segments

- sql statements that require sort (order by, group by, union, distinct, rollup)

- join methods that require sort (hash join, sort merge join)

Actions:

- check the sql to see if the IO can be reduced

- set PGA_AGGREGATE_TARGET to appropriate value as advised by

V$PGA_TARGET_ADVICE

- wherever possible, use Hash Join instead of Sort Merge join, and Nested Loops

instead of Hash Join

- make sure optimizer selects right driving table

direct path writes

These are waits that are associated with direct write operations that write data into user's PGA to data files or to temporary segments

Causes:

- direct load operations ( like CTAS)

- parallel DML operations

- Sort IO (when a sort doesn't fit into PGA)

Actions:

- ensure OS asynchronous IO is configured correctly

- ensure no disks are IO bound

latch free waits

This wait indicates that the process is waiting for a latch that is currently busy.

When a process is spinning, the spinning is recorded as CPU time, and when it goes to sleep, it is recorded as 'latch free wait' event.

library cache latch

This latch prevents the cached sql statements and object definitions staored in the library cache within the shared pool.

This latch must be acquired to add a statement to the library cache.

This latch is to ensure that application is reusing as much SQL statement representation as possible.

Causes:

- application is making heavy use of literal sql

Actions:

- consider increasing shared pool size, based on the shared pool advisor

(v$SHARED_POOL_ADVICE)

- use bind variables wherever possible

shared pool latch

This latch is used to protect critical operations when allocating and freeing memory in shared pool.

Causes:

- the shared pool latch waits and the library cache latch waits are generally due to intense hard parsing

Actions:

- avoid hard parses where possible

- eliminate literal SQL, use bind variables

- consider increasing shared pool size, based on the shared pool advisor

(v$SHARED_POOL_ADVICE)

- try setting CURSOR_SHARING to FORCE if hard parses persist

row cache objects latch

This latch comes into play when processes are attempting to access the cached data dictionary values.

Generally, this latch is not a source of contention, but if there is significant wait on this latch, SHARED_POOL_SIZE needs to be corrected.

Tuning library cache tunes the row cache indirectly.