Recently we came across a issue where our DB crashed with ORA-00338 error .
Errors in file /oracle/diag/rdbms/orcl11g/orc11g/trace/orc11g_lgwr_24118.trc:
ORA-00338: log 2 of thread 1 is more recent than control file
ORA-00312: online log 2 thread 1: '/oracle/oradata/orcl11g/redo02.log'
LGWR (ospid: 24118): terminating the instance due to error 338
DB couldn’t be restarted as it gave same errors while opening. For multiplexed redo log files, it reports error for both log files.
Error Description: (Reference :http://psoug.org/oraerror/ORA-00338.htm)
Log string of thread string is more recent than control file
Error Cause:
The control file change sequence number in the log file is greater than the number in the control file. This implies that the wrong control file is being used. Note that repeatedly causing this error can make it stop happening without correcting the real problem. Every attempt to open the database will advance the control file change sequence number until it is great enough.
Action:
Use the current control file or do backup control file recovery to make the control file current. Be sure to follow all restrictions on doing a backup control file recovery.
Above explanation suggests that there is problem with controlfile. Normally these errors are seen when doing incomplete recovery. To troubleshoot it ,we took dumps of redo log files (On recommendation of Oracle support)
Note: Below logs are from test system which was used to reproduce the issue
SQL> alter system dump logfile '/oracle/oradata/orcl11g/redo01.log' validate;
System altered.
SQL> alter system dump logfile '/oracle/oradata/orcl11g/redo02.log' validate;
ERROR at line 1:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/oracle/oradata/orcl11g/redo02.log'
SQL> alter system dump logfile '/oracle/oradata/orcl11g/redo03.log' validate;
*
ERROR at line 1:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/oracle/oradata/orcl11g/redo03.log'
redo02.log and redo03.log dump failed with errors that it does not contain redo. Since redo01.log dump was successful, we looked at trace.
DUMP OF REDO FROM FILE '/oracle/oradata/orcl11g/redo01.log'
Opcodes *.*
RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
Times: creation thru eternity
VALIDATE ONLY
FILE HEADER:
Compatibility Vsn = 186647552=0xb200400
Db ID=970369526=0x39d6a9f6, Db Name='TESTDB'
Activation ID=2650290266=0x9df8385a
Control Seq=5124=0x1404, File size=102400=0x19000
File Number=1, Blksiz=512, File Type=2 LOG
descrip:"Thread 0001, Seq# 0000000001, SCN 0x00000016f528-0xffffffffffff"
thread: 1 nab: 0xffffffff seq: 0x00000001 hws: 0x3 eot: 1 dis: 0
resetlogs count: 0x33acc28a scn: 0x0000.0016f528 (1504552)
prev resetlogs count: 0x3377bd37 scn: 0x0000.000e2006 (925702)
Low scn: 0x0000.0016f528 (1504552) 12/22/2014 06:13:30
Next scn: 0xffff.ffffffff 01/01/1988 00:00:00
Enabled scn: 0x0000.0016f528 (1504552) 12/22/2014 06:13:30
Thread closed scn: 0x0000.0016f528 (1504552) 12/22/2014 06:13:30
Disk cksum: 0xcec6 Calc cksum: 0xcec6
Terminal recovery stop scn: 0x0000.00000000
Terminal recovery 01/01/1988 00:00:00
Most recent redo scn: 0x0000.00000000
Largest LWN: 0 blocks
End-of-redo stream : No
Unprotected mode
Miscellaneous flags: 0x800000
Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000
Zero blocks: 0
Format ID is 2
redo log key is 1679de3ad36cdd2684143daaa1635b8
redo log key flag is 5
Enabled redo threads: 1
END OF REDO DUMP
----- Redo read statistics for thread 1 -----
If you look at dumpfile it says that sequence is set to 1 and DB name=”TESTDB’ .Our instance name is orc11g and last sequence# was more than 1 (can also be confirmed from v$log). This indicated that our redo logs were overwritten by some other process. DB name in redo log hinted that it was done by reporting clone refresh process. Problem was that redo log volume was cross mounted on a reporting clone which overwrote the redo logs during the refresh process .
As a fix, new volume was provisioned for the clone redo logs and volume export was revoked. Since current redo log also got overwritten, we had to restore last hot backup and perform incomplete recovery till the last ETL start time (This was a datawarehouse db). In OLTP this would have caused data loss.
This issue is easily reproducible. If you try to clone a instance by using same name for redo log and open the database, you will receive the error in your source database. DB files are protected by DBWR so we are protected from risk of other Database opening datafiles and get ORA-01157
ORA-01157: cannot identify/lock data file 3 - see DBWR trace file
ORA-01110: data file 3: '/oracle/oradata/orcl11g/undotbs01.dbf'
Recent Comments