rman

Using current_scn for RMAN incremental ? Think again..

I am bloging about experience we had using rman incremental backup for syncing standby database.

We had a standby database which lagged by considerable time and we decided to use rman incremental database to make it current. We have done this in past and used normal procedure of taking current_scn from V$database on standby and took rman incremental backup from this scn on primary database.

When we did the recovery it finished successfully but on starting MRP  it still required old archive log.
We tried manual recovery and still it was failing and asking for same old archive. This was strange and I ran following queries to find out the SCN

SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
 -----------
 10641805448174

Checking the minimum checkpoint_change# for V$datafile reported following SCN which was less then current_SCN.

SQL> select min(checkpoint_change#) from V$datafile;
 MIN(CHECKPOINT_CHANGE#)
 ------------------------------
 10641804610367

At this time I decided to check first_change# for the archive required by standby

select first_change# from GV$ARCHIVED_LOG where THREAD#=2 and SEQUENCE#=699;
 FIRST_CHANGE#
 ------------------------------
 10618920961649

This value was way less and was different then the initial current_scn which we used to take rman incremental backup. Checking on My Oracle Support , found note 836986.1 which recommended running following query

SQL> select min(fhscn) from x$kcvfh;
MIN(FHSCN)
 ----------------
 10618920961649

This matches with the scn from V$archived_log. Note recommends to take lower value of x$kcvfh and V$database.
So we took backup again with this SCN from primary. Since we have RAC database, we used channels on both instance

run
 {
 allocate channel c1 device type disk format '/VOL301/rmanbackup/stby_PROD_%U' connect 'sys/password@PROD1';
 allocate channel c2 device type disk format '/VOL301/rmanbackup/stby_PROD_%U' connect 'sys/password@PROD1';
 allocate channel c3 device type disk format '/VOL301/rmanbackup/stby_PROD_%U' connect 'sys/password@PROD2';
 allocate channel c4 device type disk format '/VOL301/rmanbackup/stby_PROD_%U' connect 'sys/password@PROD2';
 backup incremental from scn 10618920961649 database format '/VOL301/rmanbackup/stby_PROD_%U' tag 'Tag_INCR_PROD';
 }

We copied the backups and also restored the controlfile and cataloged the backups. Restoring now did the trick

run
 {
 allocate channel c1 device type disk ;
 allocate channel c2 device type disk ;
 allocate channel c3 device type disk ;
 allocate channel c4 device type disk ;
 recover database noredo;
 }

I think x$kcvfh should also be part of documentation as it wil help people check both scn before proceeding. Anyways this seems to be some special case as we have used this rman incremental approach lot of times and faced this first time.
If anyone has clue on what caused this, then I would be really interested to know it.

Physical Corruption: ORA-1578 part 3

Recently we encountered a block corruption issue in 10g database on Linux x86 64 bit and using ASM for storing database files.
Saurabh had earlier written articles on ora-1578.

http://askdba.org/weblog/2008/04/physical-corruption-ora-1578-part-1/
http://askdba.org/weblog/2008/05/physical-corruption-ora-1578-part-2/

It was first time I was working on a corruption issue, so I followed above notes and found them quite useful. Good thing about ora-1578 error message is that it gives the problematic block and datafile number

ORA-01578: ORACLE data block corrupted (file # 699, block # 188654)
ORA-01110: data file 699:
'+DG_IDX/james/datafile/scott_emp_detail.381.705692005'

Querying dba_extents revealed that it was a Fact table and one partition was giving error.Since we had rman backup for this database we directly went for recovery using blockrecover command

Recovery Manager: Release 10.2.0.4.0 - Production on Sat Aug 7 01:48:51 2010

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: james (DBID=1812247572)

RMAN> run {blockrecover datafile 699 block 188654;}

Starting blockrecover at 07-AUG-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=2231 instance=james4 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=2190 instance=james4 devtype=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: sid=2142 instance=james4 devtype=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: sid=2186 instance=james4 devtype=DISK
allocated channel: ORA_DISK_5
channel ORA_DISK_5: sid=2170 instance=james4 devtype=DISK
allocated channel: ORA_DISK_6
channel ORA_DISK_6: sid=2203 instance=james4 devtype=DISK
allocated channel: ORA_DISK_7
channel ORA_DISK_7: sid=2240 instance=james4 devtype=DISK
allocated channel: ORA_DISK_8
channel ORA_DISK_8: sid=2195 instance=james4 devtype=DISK

channel ORA_DISK_2: restoring block(s)
channel ORA_DISK_2: specifying block(s) to restore from backup set
restoring blocks of datafile 00699
channel ORA_DISK_2: reading from backup piece /RMNGD_RMAN1/backup/james/29lk4ift_1_1

user interrupt received
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of blockrecover command at 08/07/2010 02:52:27
RMAN-03099: job cancelled at user request

We had to cancel the block recovery as even after 1 hour it could not locate the block from backup piece. Checking backup piece found that this backup piece was of 184G

ls -lh /RMNGD_RMAN1/backup/james/29lk4ift_1_1
-rw-r-----  1 oracle dba 184G Jul 31 18:17 /RMNGD_RMAN1/backup/james/29lk4ift_1_1

We had stopped the data load for this table but delaying further was not possible as it was impacting application. Finally we  decided to use dbms_repair table to allow DML to mark the block as corrupt .
This table was FACT table which could have been loaded again (lucky for us 🙂 ). Before marking it corrupt, we took a dump of block

ALTER SYSTEM DUMP DATAFILE 699 block 188654;
BEGIN
DBMS_REPAIR.ADMIN_TABLES (
     TABLE_NAME => 'REPAIR_TABLE',
     TABLE_TYPE => dbms_repair.repair_table,
     ACTION     => dbms_repair.create_action,
     TABLESPACE => 'USERS');
END;
/

SET SERVEROUTPUT ON
DECLARE num_corrupt INT;
BEGIN
num_corrupt := 0;
DBMS_REPAIR.CHECK_OBJECT (
     SCHEMA_NAME => 'SCOTT',
     OBJECT_NAME => 'EMP_DETAIL_FACT',
     PARTITION_NAME => 'SYS_P1140013',
     REPAIR_TABLE_NAME => 'REPAIR_TABLE',
     CORRUPT_COUNT =>  num_corrupt);
DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR (num_corrupt));
END;
/

SELECT OBJECT_NAME, BLOCK_ID, CORRUPT_TYPE, MARKED_CORRUPT,CORRUPT_DESCRIPTION, REPAIR_DESCRIPTION FROM REPAIR_TABLE;

  SET SERVEROUTPUT ON
  DECLARE num_fix INT;
  BEGIN
  num_fix := 0;
  DBMS_REPAIR.FIX_CORRUPT_BLOCKS (
      SCHEMA_NAME => 'SCOTT',
      OBJECT_NAME => 'EMP_DETAIL_FACT',
     PARTITION_NAME => 'SYS_P1140013',
      OBJECT_TYPE => dbms_repair.table_object,
      REPAIR_TABLE_NAME => 'REPAIR_TABLE',
      FIX_COUNT=> num_fix);
  DBMS_OUTPUT.PUT_LINE('num fix: ' || TO_CHAR(num_fix));
  END;
  /

  execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('SCOTT','EMP_DETAIL_FACT')

Refer 10g doc to get details on DBMS_REPAIR.

This allowed the application to load data again. Later we worked with application team and created a new segment and renamed the old one to *_bkp and then dropped it.
Next day we got a alert informing errors with our rman backup

ORA-19566: exceeded limit of 0 corrupt blocks for file +DG_IDX/james/datafile/scott_emp_detail.381.705692005
continuing other job steps, job failed will not be re-run
channel ORA_DISK_8: finished piece 1 at AUG-10-2010 01:56:45
piece handle=/RMNGD_RMAN3/backup/james/i2lkuv0q_1_1 tag=james_INC_20100810 comment=NONE
channel ORA_DISK_8: backup set complete, elapsed time: 00:26:43
channel ORA_DISK_1: finished piece 1 at AUG-10-2010 02:13:01
piece handle=/RMNGD_RMAN1/backup/james/hrlkuv0i_1_1 tag=james_INC_20100810 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:43:07
channel ORA_DISK_6: finished piece 1 at AUG-10-2010 02:21:26
piece handle=/RMNGD_RMAN2/backup/james/i0lkuv0n_1_1 tag=james_INC_20100810 comment=NONE
channel ORA_DISK_6: backup set complete, elapsed time: 00:51:27
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================

Backup job failed again as the table was still lying in recyclebin.

03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name=\'EMP_DETAIL_FACT_BKP\';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name=\'EMP_DETAIL_FACT_BKP\';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==03:16:47 SQL> select * from dba_recyclebin where original_name='EMP_DETAIL_FACT_BKP';

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12                                                                                                                                            NO        NO               4465594     4465594   4465594
      1088

SCOTT                       BIN$jXvLFhsgCOXgQA1M6jspCQ==$0 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896 EMP_DETAIL_FACT_BKP
DROP                            Table Partition
scott_emp_detail                                                                                                      2009-11-09:18:41:14
2010-08-10:09:40:39                                                    9.9476E+12
                                                                                                                                               NO        NO               4465594     4465594   4465594
     12896

So we decided to purge the recyclebin and resized datafile to ensure that we get rid of corrupt block. As a result maximum block for datafile was 117765, whereas we had error for block 188654.

03:19:07 SQL> conn scott

Enter password: 

Connected.

03:19:20 SQL> purge recyclebin;

03:23:39 SQL> alter database datafile '+DG_IDX/james/datafile/scott_emp_detail.381.705692005' resize 3800m;

Database altered.

  1* select max(block_id) from dba_extents where file_id=699

03:24:25 SQL> /

MAX(BLOCK_ID)

-------------

       117765

We also ran rman backup validate and dbverify to ensure that there was no corruption

RMAN> backup validate datafile 699;

Starting backup at 11-AUG-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=2171 instance=james4 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=2157 instance=james4 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00699 name=+DG_IDX/james/datafile/scott_emp_detail.381.705692005
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:35
Finished backup at 11-AUG-10

Since this is ASM, dbv needs you to specify system password

dbv file=+DG_IDX/james/datafile/scott_emp_detail.381.705692005 userid=system/oracle123 blocksize=32768

DBVERIFY - Verification starting : FILE = +DG_IDX/james/datafile/scott_emp_detail.381.705692005
DBVERIFY - Verification complete

Total Pages Examined         : 118400
Total Pages Processed (Data) : 110979
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 7421
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 0
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 0 (0.0)

All this exercise made us re-look at rman backup. We realized that we need to reduce the size of RMAN backup piece as having smaller piece will help us in faster recovery of data blocks or datafile loss.
We had two options

1) Specify value of maxpiecesize so as to limit size of backup piece.
2) Specify filesperset parameter to limit the number of files in backup piece.

Both things meant we had more backup pieces but definitely it would result in improved recovery times. We modified the backup script to include filesperset parameter

BACKUP AS COMPRESSED BACKUPSET INCREMENTAL LEVEL 1 DATABASE FILESPERSET=1

In our case we could reload the data so we dropped the old table. But in case you can’t do it and you have rman backups failing because of corrupt block issue, you can use following syntax in rman run clause to skip ‘n’ corrupt blocks

set maxcorrupt for datafile 699 to 1;