
Issues with CLUSTER_DATABASE parameter

Yesterday, I faced an interesting scenario while upgrading a 2-node RAC database. I tried setting CLUSTER_DATABASE=FALSE in the spfile from Node 1, but it still showed the value as TRUE after I restarted the database. The database startup entries in the alert log showed the same. If I made the same setting on Node 2 and started the database from Node 2, it started in shared mode. I was using a shared spfile (on OCFS) for both systems.

CLUSTER_DATABASE is a Real Application Clusters parameter that specifies whether or not Real Application Clusters is enabled. It is mostly set to FALSE to start the database in exclusive mode for operations that need to update the dictionary, e.g. upgrading the database, enabling archivelog mode, and changing the database character set.

I had ignored this error a few times, but today I wanted to find the cause and resolve it. To diagnose further, I set CLUSTER_DATABASE=FALSE from Node 1 with the following command:

<span style="font-size: small; font-family: courier new,courier;">ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=SPFILE;
</span>

Then I used srvctl to start the database

<span style="font-size: small; font-family: courier new,courier;">[oracle@blrraclnx1 bdump]$ srvctl start database -d orcl
</span>

Node 1 Alert log had CLUSTER_DATABASE=TRUE

<span style="font-size: small; font-family: courier new,courier;">cluster_database         = TRUE
cluster_database_instances= 2
db_create_file_dest      = +DATA
db_recovery_file_dest    = +FRA
db_recovery_file_dest_size= 4294967296
thread                   = 1
instance_number          = 1
</span>

But on checking Alert Log from Node 2, I found CLUSTER_DATABASE=FALSE

<span style="font-size: small; font-family: courier new,courier;">cluster_database         = FALSE
cluster_database_instances= 1
db_create_file_dest      = +DATA
db_recovery_file_dest    = +FRA
db_recovery_file_dest_size= 4294967296
thread                   = 2
instance_number          = 2
</span>

Moreover, instance 2 had crashed with the following errors:

  <span style="font-size: small; font-family: courier new,courier;">Tue Sep  2 21:10:14 2008
lmon registered with NM - instance id 2 (internal mem no 1)
 Warning: cluster_database_instances (1) is &lt;= my node id (1)
    This instance wants to mount exclusive when instance 0 has mounted shared.  Exiting!
Tue Sep  2 21:10:15 2008
USER: terminating instance due to error 29707
Instance terminated by USER, pid = 2807
</span>

The error says that instance 0 (Node 1) has already mounted the database in shared mode, while this instance (Node 2) wants to mount it in exclusive mode.

To check the values, I decided to query the V$SPPARAMETER view. Until now I had been using the show parameter cluster_database command to check them.

<span style="font-size: small; font-family: courier new,courier;">SQL&gt; SELECT  SID,NAME,VALUE,DISPLAY_VALUE FROM V$SPPARAMETER WHERE NAME='cluster_database';

SID        NAME                           VALUE           DISPLAY_VALUE
---------- ------------------------------ --------------- ---------------
orcl1      cluster_database               TRUE            FALSE
*          cluster_database               FALSE           FALSE
</span>

This was strange! I had never set a specific value for Node 1. Anyway, to resolve it, I used the reset command to clear the orcl1 entry from the spfile:

<span style="font-size: small; font-family: courier new,courier;">SQL&gt; alter system reset cluster_database sid='orcl1';

System altered.
</span>

Then I re-checked the V$SPPARAMETER view

<span style="font-size: small; font-family: courier new,courier;">SQL&gt; SELECT SID,NAME,VALUE FROM V$SPPARAMETER WHERE NAME='cluster_database';

SID        NAME                VALUE
-------- ------------------ --------
*        cluster_database      FALSE
</span>

Restarting the database after that allowed it to start in exclusive mode. The issue was resolved, but the question remained: “Why was the CLUSTER_DATABASE value different for Node 1?”
I remember that, as part of the setup, I had created Node 1 first and added Node 2 later. Some steps could have been missed there, or the parameter could have been set explicitly with the sid='orcl1' option. I really had no clue why it was like that. If anyone has experienced this, do let me know.
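
The behaviour seen above is consistent with how an spfile is read: an entry stored for a specific SID takes precedence over the '*' entry for that instance. The following is only a minimal Python sketch of that lookup, not Oracle's implementation. The function name effective_value and the row layout are assumptions made for illustration; the rows are copied from the V$SPPARAMETER output in this post.

```python
from typing import Optional

# Rows as (SID, NAME, VALUE), mirroring the V$SPPARAMETER output shown earlier.
SPFILE_ROWS = [
    ("orcl1", "cluster_database", "TRUE"),
    ("*", "cluster_database", "FALSE"),
]


def effective_value(rows, instance_sid: str, name: str) -> Optional[str]:
    """Return the spfile value an instance would pick up for a parameter.

    A SID-specific entry wins over the '*' entry; None means no entry at all.
    """
    wildcard = None
    for sid, param, value in rows:
        if param != name:
            continue
        if sid == instance_sid:
            return value      # instance-specific entry takes precedence
        if sid == "*":
            wildcard = value  # remember the value that applies to all instances
    return wildcard


for sid in ("orcl1", "orcl2"):
    print(sid, effective_value(SPFILE_ROWS, sid, "cluster_database"))
```

With the rows above, orcl1 resolves to TRUE and orcl2 to FALSE, which matches what the two alert logs reported. Once the orcl1 row is removed, both instances fall back to the '*' entry.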

UNKNOWN State Of RAC Resources

While checking the status of the database resources, ASM was shown as UNKNOWN on one node of a two-node RAC.

$ crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....11.inst application    ONLINE    ONLINE    rac1
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....DC.lsnr application    ONLINE    ONLINE    rac1
ora....idc.gsd application    ONLINE    ONLINE    rac1
ora....idc.ons application    ONLINE    ONLINE    rac1
ora....idc.vip application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    UNKNOWN    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora....dc2.gsd application    ONLINE    ONLINE    rac2
ora....dc2.ons application    ONLINE    ONLINE    rac2
ora....dc2.vip application    ONLINE    ONLINE    rac2

The following error was raised while trying to start the +ASM2 instance with SRVCTL:

$ srvctl start asm -n rac2

PRKS-1009 : Failed to start ASM instance "+ASM2" on node "rac2",
[CRS-0223: Resource 'ora.rac2.ASM2.asm' has placement error.]

Trying to start the same resource with crs_start gave:

$ crs_start -f ora.rac2.ASM2.asm

CRS-1028: Dependency analysis failed because of:
'Resource in UNKNOWN state: ora.rac2.ASM2.asm'
CRS-0223: Resource 'ora.rac2.ASM2.asm' has placement error

There are two ways to get out of this UNKNOWN state of resources:
1. Start the resource from sqlplus
2. Use crs_stop -f to clear the state of database resources.

$ export ORACLE_SID=+ASM2
$ sqlplus "/ as sysdba"
SQL> startup
Diskgroup mounted

The startup should go fine and the +ASM2 instance will be started.

$ crs_stop -f ora.rac2.ASM2.asm

This will clear the UNKNOWN state and set the resource to OFFLINE.

Now start the resource:

$ srvctl start asm -n rac2

After this, check the status:

$ crs_stat -t
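
Scanning the crs_stat -t listing by eye gets tedious on larger clusters. The check can be sketched in Python as below, assuming the whitespace-separated five-column layout shown in this post. The function name mismatched_resources is hypothetical, and the sample text is a shortened copy of the listing above.

```python
SAMPLE = """\
Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    UNKNOWN    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
"""


def mismatched_resources(crs_stat_output):
    """Return (name, target, state, host) for resources whose State differs from Target."""
    problems = []
    for line in crs_stat_output.splitlines():
        fields = line.split()
        # Skip the header, the dashed rule, and anything that is not a resource row.
        if len(fields) < 4 or not fields[0].startswith("ora."):
            continue
        name, _type, target, state = fields[:4]
        host = fields[4] if len(fields) > 4 else ""  # host may be absent for offline rows
        if state != target:
            problems.append((name, target, state, host))
    return problems


for row in mismatched_resources(SAMPLE):
    print(row)
```

For the sample this reports only the ora....SM2.asm row, i.e. the resource that was stuck in the UNKNOWN state.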

In the case of a listener resource, starting the listener with srvctl may fail with the following error:

CRS-0215: Could not start resource 'ora.dev-101.LISTENER_DEV-101.lsnr'.

This can be resolved by removing the listener resource and adding it back. Perform the following as the root user:

# crs_unregister ora.dev-101.LISTENER_DEV-101.lsnr
# crs_unregister ora.dev-102.LISTENER_DEV-102.lsnr

Then recreate the listener in silent mode as the oracle user:

$ netca /silent /responsefile $ORACLE_HOME/network/install/netca_typ.rsp /nodeinfo dev-101,dev-102

The above command can fail with an error like the one below:

Exception in thread "main" java.lang.UnsatisfiedLinkError: /home/oracle/product/10.2/jdk/jre/lib/i386/libawt.so: libXp.so.6: cannot open shared object file: No such file or directory
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1586)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1503)
	at java.lang.Runtime.loadLibrary0(Runtime.java:788)
	at java.lang.System.loadLibrary(System.java:834)
	at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:50)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.awt.NativeLibLoader.loadLibraries(NativeLibLoader.java:38)
	at sun.awt.DebugHelper.(DebugHelper.java:29)
	at java.awt.Component.(Component.java:506)

This can be resolved by installing the xorg-x11-deprecated-libs rpm (yum install xorg-x11-deprecated-libs).

Using RDA As RDBMS Pre-Install Check Tool

Many of us have come across RDA (Remote Diagnostic Agent) while working on a ticket with Oracle Support. In case you have not heard about it, I recommend going through Metalink Note:314422.1 – Remote Diagnostic Agent (RDA) 4 – Getting Started.

RDA captures system information such as OS and hardware details (like the number of CPUs and the amount of RAM), the OS error log, and OS monitoring tool output (like vmstat, top, etc.). This can be handy if you do not know the command or the location of the OS logs. Similarly, it collects the database version, database patch inventory, database alert log and trace files.

This can save a lot of time, as you need not remember all the OS commands to capture the information.
Similarly, RDA also collects database performance statistics such as OS statistics (CPU, memory and disk I/O stats) along with top SQL, locking and latch statistics. In the case of 10g, it generates an AWR report (60 mins) and an ADDM report based on the captured snapshots. All this information can be helpful for diagnosing a performance problem.

There is one more use of RDA that not many people are aware of: the RDA Health Check / Validation Engine (HCVE). HCVE can be used to perform pre-install checks for Oracle Database and Oracle Application Server on Unix systems (at the time of writing, this functionality is not available on Windows).

To run it, execute rda.sh -T hcve. For example, I needed to validate whether I could install Oracle 10gR2 on my OEL4 (Linux x86) system.

$ ./rda.sh -T hcve
Processing HCVE tests ...
Available Pre-Installation Rule Sets:
   1. Oracle Database 10g R1 (10.1.0) PreInstall (Linux-x86)
   2. Oracle Database 10g R1 (10.1.0) PreInstall (Linux AMD64)
   3. Oracle Database 10g R1 (10.1.0) PreInstall (IA-64 Linux)
   4. Oracle Database 10g R2 (10.2.0) PreInstall (Linux AMD64)
   5. Oracle Database 10g R2 (10.2.0) PreInstall (IA-64 Linux)
   6. Oracle Database 10g R2 (10.2.0) PreInstall (Linux-x86)
   7. Oracle Database 11g R1 (11.1.0) PreInstall (Linux AMD64)
   8. Oracle Database 11g R1 (11.1.0) PreInstall (Linux-x86)
   9. Oracle Application Server 10g (9.0.4) PreInstall (Linux)
  10. Oracle Application Server 10g R2 (10.1.2) PreInstall (Linux)
  11. Oracle Application Server 10g R3 (10.1.3) PreInstall (Linux AMD64)
  12. Oracle Application Server 10g R3 (10.1.3) PreInstall (IA-64 Linux)
  13. Oracle Application Server 10g R3 (10.1.3) PreInstall (Linux-x86)
  14. Oracle Portal PreInstall (Generic)
Available Post-Installation Rule Sets:
  15. Oracle Portal PostInstall (generic)
  16. RAC 10G DB and OS Best Practices (Linux)
  17. Data Guard PostInstall (Generic)
Enter the HCVE rule set number
Hit 'Return' to accept the default (1)
<strong>&gt; 6</strong>

Enter value for &lt; Planned ORACLE_HOME location or if set &gt;
Hit 'Return' to accept the default ($ORACLE_HOME)
<strong>&gt; /u01/app/oracle</strong>

Test "Oracle Database 10g R2 (10.2.0) PreInstall (Linux-x86)" executed at Wed Aug 27 15:12:18 2008

Test Results
~~~~~~~~~~~~

   ID NAME                 RESULT VALUE
===== ==================== ====== ========================================
   10 OS Certified?        PASSED Adequate
   20 User in /etc/passwd? PASSED userOK
   30 Group in /etc/group? PASSED GroupOK
   40 Input ORACLE_HOME    RECORD /u01/app/oracle
   50 ORACLE_HOME Valid?   PASSED OHexists
   60 O_H Permissions OK?  PASSED CorrectPerms
   70 Umask Set to 022?    PASSED UmaskOK
   80 LDLIBRARYPATH Unset? FAILED IsSet
  100 Other O_Hs in PATH?  FAILED OratabEntryInPath
  110 oraInventory Permiss PASSED oraInventoryOK
  120 /tmp Adequate?       PASSED TempSpaceOK
  130 Swap (in MB)         RECORD 1051
  140 RAM (in MB)          FAILED 1001
  150 Swap OK?             FAILED InsufficientSwap
  160 Disk Space OK?       PASSED DiskSpaceOK
  170 Kernel Parameters OK PASSED KernelOK
  180 Got ld,nm,ar,make?   PASSED ld_nm_ar_make_found
  190 ulimits OK?          FAILED StackTooSmall MaxLockMemTooSmall
  200 EL4 RPMs OK?         PASSED EL4rpmsOK
  204 RHEL3 RPMs OK?       PASSED NotRedHat
  205 RHEL4 RPMs OK?       PASSED NotRedHat
  209 SUSE SLES9 RPMs OK?  PASSED NotSuSE
  212 Patch 3006854 Instal PASSED NotRHEL3
  214 ip_local_port_range  PASSED ip_local_port_rangeOK
  220 Tainted Kernel?      PASSED NotVerifiable
  230 Other OUI Up?        PASSED NoOtherOUI
Result file: /home/oracle/rda/output/RDA_HCVE_A201DB10R2_lnx_res.htm
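
When the report is long, it helps to pull out only the rules that failed. The following is a rough Python sketch that does this for the text layout shown above. It assumes every result row starts with a numeric ID and contains one of the keywords PASSED, FAILED or RECORD; the function name failed_rules is hypothetical, and the sample is a shortened copy of the report in this post.

```python
import re

SAMPLE = """\
   ID NAME                 RESULT VALUE
===== ==================== ====== ========================================
   10 OS Certified?        PASSED Adequate
   80 LDLIBRARYPATH Unset? FAILED IsSet
  130 Swap (in MB)         RECORD 1051
  140 RAM (in MB)          FAILED 1001
  190 ulimits OK?          FAILED StackTooSmall MaxLockMemTooSmall
"""

# ID, rule name, result keyword, optional value.
ROW = re.compile(r"^\s*(\d+)\s+(.*?)\s+(PASSED|FAILED|RECORD)\b\s*(.*)$")


def failed_rules(report):
    """Return (id, name, value) for every row whose result is FAILED."""
    failures = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match and match.group(3) == "FAILED":
            failures.append((int(match.group(1)), match.group(2), match.group(4).strip()))
    return failures


for rule in failed_rules(SAMPLE):
    print(rule)
```

A regular expression is used instead of fixed column offsets so that small alignment differences in copied output do not break the parsing.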

I also tried the option “RAC 10G DB and OS Best Practices (Linux)”, which is one of the post-installation rule sets, but for some reason some of the checks failed.

Enter the HCVE rule set number
Hit 'Return' to accept the default (1)
&gt; 16

Enter the password for 'SYSTEM':
Please re-enter it to confirm:

Test "RAC 10G DB and OS Best Practices (Linux)" executed at Wed Aug 27 17:26:33 2008

Test Results
~~~~~~~~~~~~

   ID NAME                 RESULT VALUE
===== ==================== ====== ========================================
   10 ORA_CRS_HOME         RECORD /u01/app/crs
  100 Database Name        RECORD orcl
  102 Database Version     RECORD 10.2.0.4.0
  104 Interconnect Network RECORD
  106 DB Block Size        RECORD 8192
  108 DB File Multiblock R RECORD 16
  120 Max Commit Propagati PASSED 0
  130 SYS.AUDSES$ Cache Si PASSED 10000
  132 SYS.IDGEN1$ Cache Si FAILED 20
<strong>  140 Parallel Execution M FAILED 2148</strong>
  150 Min Parallel Servers RECORD 1
  152 Min Parallel Servers FAILED 0
  200 $ORA_CRS_HOME Define PASSED Found
  210 Remote Access        PASSED All loaded
<strong>  220 _USR_ORA_DEBUG / CRS FAILED blrraclnx1:? blrraclnx2:?
  230 _USR_ORA_DEBUG / ORA FAILED blrraclnx1:? blrraclnx2:?</strong>
  240 rmem_max             PASSED OK
  250 UDP Buffer Size      PASSED OK
  260 wmem_max             PASSED OK
  270 rmem_default         PASSED OK
  280 wmem_default         PASSED OK
  290 Sysrq Magic Keys     PASSED OK
  300 Oracle Executable Li PASSED linked
<strong>  310 hangcheck-timer      FAILED blrraclnx1:Unknown blrraclnx2:Unknown
  320 aio-max-size Setting FAILED blrraclnx1:Unknown blrraclnx2:Unknown</strong>
  330 Memory (32-bit)      PASSED OK
<strong>  340 Swap (32-bit)        FAILED [blrraclnx1:]Swap&lt;2RAM [blrraclnx2:]S..&gt;</strong>
  350 Swap (64-bit)        PASSED OK
  360 Patch List           PASSED Complete
Result file: /home/oracle/rda/output/RDA_HCVE_P400RAC_lnx_res.htm

Details of the prescribed values can be found at

https://metalink.oracle.com/metalink/plsql/docs/HCVE_P400RAC_lnx.htm

For example, to fix SYS.IDGEN1$ Cache Size, we need to set the cache size for the sequence SYS.IDGEN1$ to a value greater than or equal to 10,000.

SQL> alter sequence SYS.IDGEN1$ cache 10200;

Sequence altered.

Now we see that the SYS.IDGEN1$ check passes:

<strong>132 SYS.IDGEN1$ Cache Si PASSED 10200</strong>

Refer to Note:250262.1 – RDA 4 – Health Check / Validation Engine Guide for more information on HCVE.

Checking Database Feature Usage Stats

Today I came across the view DBA_FEATURE_USAGE_STATISTICS (10g), which lets us know whether a particular database feature has been used so far. If it has, the view also tells us when it was first used and when it was last used. This can be helpful for checking whether anyone is using a database feature that is not licensed.

I have used the filter DETECTED_USAGES > 0 to list the features that have been used in this database.

<span style="font-size: x-small; font-family: helvetica;"><span style="font-size: small;">SQL&gt;  SELECT NAME,DETECTED_USAGES AS "USAGE",CURRENTLY_USED,FIRST_USAGE_DATE,LAST_USAGE_DATE
  2  FROM DBA_FEATURE_USAGE_STATISTICS WHERE DETECTED_USAGES &gt;0 order by 1;

NAME                                               USAGE CURRE FIRST_USA LAST_USAG
--------------------------------------------- ---------- ----- --------- ---------
Automatic SQL Execution Memory                         2 TRUE  14-AUG-08 21-AUG-08
Automatic SQL Execution Memory                         1 TRUE  13-AUG-08 13-AUG-08
Automatic Segment Space Management (system)            2 TRUE  14-AUG-08 21-AUG-08
Automatic Segment Space Management (system)            1 TRUE  13-AUG-08 13-AUG-08
Automatic Segment Space Management (user)              1 TRUE  13-AUG-08 13-AUG-08
Automatic Storage Manager                              2 TRUE  14-AUG-08 21-AUG-08
Automatic Storage Manager                              1 TRUE  13-AUG-08 13-AUG-08
Automatic Undo Management                              2 TRUE  14-AUG-08 21-AUG-08
Automatic Undo Management                              1 TRUE  13-AUG-08 13-AUG-08
Character Set                                          2 TRUE  14-AUG-08 21-AUG-08
Character Set                                          1 TRUE  13-AUG-08 13-AUG-08
Dynamic SGA                                            1 TRUE  13-AUG-08 13-AUG-08
Internode Parallel Execution                           2 TRUE  14-AUG-08 21-AUG-08
Locally Managed Tablespaces (system)                   1 TRUE  13-AUG-08 13-AUG-08
Locally Managed Tablespaces (system)                   2 TRUE  14-AUG-08 21-AUG-08
Locally Managed Tablespaces (user)                     1 TRUE  13-AUG-08 13-AUG-08
Locally Managed Tablespaces (user)                     2 TRUE  14-AUG-08 21-AUG-08
Parallel SQL Query Execution                           2 TRUE  14-AUG-08 21-AUG-08
Partitioning (system)                                  1 TRUE  13-AUG-08 13-AUG-08
Partitioning (system)                                  2 TRUE  14-AUG-08 21-AUG-08
Protection Mode - Maximum Performance                  1 TRUE  13-AUG-08 13-AUG-08
Protection Mode - Maximum Performance                  2 TRUE  14-AUG-08 21-AUG-08
Real Application Clusters (RAC)                        2 TRUE  14-AUG-08 21-AUG-08
Real Application Clusters (RAC)                        1 TRUE  13-AUG-08 13-AUG-08
Recovery Area                                          1 TRUE  13-AUG-08 13-AUG-08
Recovery Area                                          2 TRUE  14-AUG-08 21-AUG-08
Segment Advisor                                        2 TRUE  14-AUG-08 21-AUG-08
Server Parameter File                                  2 TRUE  14-AUG-08 21-AUG-08
Server Parameter File                                  1 TRUE  13-AUG-08 13-AUG-08
Streams (system)                                       2 TRUE  14-AUG-08 21-AUG-08
Streams (system)                                       1 TRUE  13-AUG-08 13-AUG-08
Streams (user)                                         1 TRUE  13-AUG-08 13-AUG-08
Streams (user)                                         2 TRUE  14-AUG-08 21-AUG-08
Virtual Private Database (VPD)                         2 TRUE  14-AUG-08 21-AUG-08
Virtual Private Database (VPD)                         1 TRUE  13-AUG-08 13-AUG-08
XDB                                                    2 TRUE  14-AUG-08 21-AUG-08

36 rows selected.

</span></span>

So be careful when you use any licensed feature (unless you have already bought it) such as Partitioning, AWR or Database Replay, as this auditing is enabled by default 🙂
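
A listing like the one above can be compared against a watchlist of features you want to keep an eye on. Below is a small Python sketch of that comparison. The rows are a subset of the query output in this post; the WATCHLIST contents and the function name flagged_features are assumptions for illustration only, so build the real list from your own licence agreement.

```python
# Rows as (NAME, DETECTED_USAGES), taken from the query output shown earlier.
ROWS = [
    ("Automatic Undo Management", 2),
    ("Partitioning (system)", 1),
    ("Partitioning (system)", 2),
    ("Real Application Clusters (RAC)", 2),
    ("Real Application Clusters (RAC)", 1),
    ("Server Parameter File", 2),
]

# Hypothetical watchlist of feature-name prefixes; adjust to your own licences.
WATCHLIST = ("Partitioning", "Real Application Clusters")


def flagged_features(rows, watchlist):
    """Sum detected usages per feature name for names starting with a watchlist prefix."""
    totals = {}
    for name, usages in rows:
        if usages > 0 and name.startswith(tuple(watchlist)):
            totals[name] = totals.get(name, 0) + usages
    return dict(sorted(totals.items()))


print(flagged_features(ROWS, WATCHLIST))
```

Usages are summed per name because the view can return more than one row for the same feature, as the output above shows.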

Upgrading Oracle RAC Database -10g

Continuing my experiments with our 2-node 10g RAC test system, I upgraded Oracle Clusterware and the Oracle RAC database from 10.2.0.1 to 10.2.0.4. In this post I have tried to document the steps for upgrading Oracle Clusterware (rolling upgrade) and the RAC database. If you spot any mistakes, please let me know.

The first step is to download the 10.2.0.4 patch set from Metalink. In our case, we downloaded Patch 6810189 (10g Release 2 (10.2.0.4) Patch Set 3 for Linux x86). You can follow the patch README for detailed steps.

We will be doing a rolling upgrade of Oracle Clusterware, i.e. we will bring only one node down for patching at a time while the other node remains available and accepts database connections. Before you start, back up the following so that you can restore them if the upgrade fails:

a) Full OS backup (as some binaries are present in /etc, etc.)

b) Full database backup (cold or hot backup)

c) Backup of the OCR and voting disk

Let’s begin.

1) Shut down DBConsole and iSQL*Plus

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">$ emctl stop dbconsole
$ isqlplusctl stop
</span>

2) Stop the associated service on the node

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl stop service -d orcl -s orcl_taf -i orcl1</span>

3) Shut down the database instance and the ASM instance on the node (if present)

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$  srvctl stop instance -d orcl -i orcl1
</span>

To stop ASM, use the following command:

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl stop asm -n blrraclnx1
</span>

4) The next step is to stop the nodeapps services on the node

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl stop nodeapps -n blrraclnx1</span>

Before proceeding to install the Oracle Clusterware patch, let’s confirm that the services have been stopped:

HA Resource                                   Target     State
-----------                                   ------     -----
<strong>ora.blrraclnx1.ASM1.asm                       OFFLINE    OFFLINE
ora.blrraclnx1.LISTENER1_BLRRACLNX1.lsnr      OFFLINE    OFFLINE
ora.blrraclnx1.gsd                            OFFLINE    OFFLINE
ora.blrraclnx1.ons                            OFFLINE    OFFLINE
ora.blrraclnx1.vip                            OFFLINE    OFFLINE</strong>
ora.blrraclnx2.ASM2.asm                       ONLINE     ONLINE on blrraclnx2
ora.blrraclnx2.LISTENER1_BLRRACLNX2.lsnr      ONLINE     ONLINE on blrraclnx2
ora.blrraclnx2.gsd                            ONLINE     ONLINE on blrraclnx2
ora.blrraclnx2.ons                            ONLINE     ONLINE on blrraclnx2
ora.blrraclnx2.vip                            ONLINE     ONLINE on blrraclnx2
ora.orcl.db                                   ONLINE     ONLINE on blrraclnx2
<strong>ora.orcl.orcl1.inst                           OFFLINE    OFFLINE</strong>
ora.orcl.orcl2.inst                           ONLINE     ONLINE on blrraclnx2
ora.orcl.orcl_taf.cs                          ONLINE     ONLINE on blrraclnx2
<strong>ora.orcl.orcl_taf.orcl1.srv                   OFFLINE    OFFLINE</strong>
ora.orcl.orcl_taf.orcl2.srv                   ONLINE     ONLINE on blrraclnx2

5) Set the DISPLAY variable and execute runInstaller from the patch directory

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 Disk1]$ ./runInstaller
</span>

This will open the OUI screen. Select the Oracle Clusterware home for patching. The screenshot below shows this.

crs10204patch

This will automatically select all the nodes available in the cluster and propagate the patch binaries to the other node.

10204patch2


6) On the Summary screen, click Install. OUI will now prompt you to run the following two scripts as root, which will upgrade Oracle Clusterware:

<span style="font-size: small; font-family: arial,helvetica,sans-serif;"># $ORA_CRS_HOME/bin/crsctl stop crs
# $ORA_CRS_HOME/install/root102.sh
</span>

Now we need to repeat steps 1-4 and step 6 on Node 2. Step 5 is not required, as the binaries have already been copied over to Node 2.
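
Since the same stop sequence (steps 1-4) is repeated on each node, it can help to generate the commands per node rather than retype them. The Python sketch below only builds and prints the command strings used in this post and does not execute anything. The function name node_stop_commands is hypothetical, and the node and instance names are the ones from this test system.

```python
def node_stop_commands(db, service, instance, node):
    """Return the per-node stop commands from steps 1-4, in order."""
    return [
        "emctl stop dbconsole",
        "isqlplusctl stop",
        f"srvctl stop service -d {db} -s {service} -i {instance}",
        f"srvctl stop instance -d {db} -i {instance}",
        f"srvctl stop asm -n {node}",
        f"srvctl stop nodeapps -n {node}",
    ]


# (node, instance) pairs for the two-node cluster described in this post.
NODES = [("blrraclnx1", "orcl1"), ("blrraclnx2", "orcl2")]

for node, instance in NODES:
    print(f"# on {node}")
    for command in node_stop_commands("orcl", "orcl_taf", instance, node):
        print(command)
```

Printing the commands first and running them by hand keeps the rolling upgrade under manual control, one node at a time.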

RAC database patching cannot be done in a rolling fashion and requires the database to be shut down.

1) Shut down DBConsole and iSQL*Plus

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">$ emctl stop dbconsole
$ isqlplusctl stop

</span>

2) Stop the services associated with the database

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl stop service -d orcl </span>

3) Shut down the database and the ASM instances (if present)

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$  srvctl stop database -d orcl
</span>

To stop ASM, use the following commands for both nodes:

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl stop asm -n blrraclnx1
[oracle@blrraclnx1 ~]$ srvctl stop asm -n blrraclnx2</span>

4) The next step is to stop the listener on both nodes

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl stop listener -n blrraclnx1 -l LISTENER1_BLRRACLNX1
[oracle@blrraclnx1 ~]$ srvctl stop listener -n blrraclnx2 -l LISTENER1_BLRRACLNX2
</span>

5) Set the DISPLAY variable and execute runInstaller from the patch directory

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 Disk1]$ ./runInstaller
</span>

This will open the OUI screen. Select the database home for patching.


6) On the Summary screen, click Install. When prompted, run the $ORACLE_HOME/root.sh script as the root user on both nodes. Once this completes, we need to perform the post-installation steps.

7) Start the listener and the ASM instance on both nodes

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ srvctl start listener -n blrraclnx1 -l LISTENER1_BLRRACLNX1
[oracle@blrraclnx1 ~]$ srvctl start listener -n blrraclnx2 -l LISTENER1_BLRRACLNX2
[oracle@blrraclnx1 ~]$ srvctl start asm -n blrraclnx1
[oracle@blrraclnx1 ~]$ srvctl start asm -n blrraclnx2</span>

8) For an Oracle RAC installation, we need to set CLUSTER_DATABASE=FALSE before upgrading

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">[oracle@blrraclnx1 ~]$ sqlplus "/ as sysdba"
SQL&gt;startup nomount
SQL&gt; alter system set cluster_database=false scope=spfile;

System altered.
SQL&gt;shutdown immediate;
SQL&gt;startup upgrade
SQL&gt;spool 10204patch.log
SQL&gt;@?/rdbms/admin/catupgrd.sql
SQL&gt;spool off</span>

The log file needs to be reviewed for any errors. catupgrd.sql took 42 minutes on my system. If the CLUSTER_DATABASE parameter is not set to FALSE, you will get the following error while starting the database in upgrade mode:

ORA-39701: database must be mounted EXCLUSIVE for UPGRADE or DOWNGRADE

We now need to restart the database and run utlrp.sql.

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">SQL&gt; SHUTDOWN IMMEDIATE
SQL&gt; STARTUP
SQL&gt; @?/rdbms/admin/utlrp.sql</span>

Confirm that the database has been upgraded successfully by querying DBA_REGISTRY:

select comp_name,version,status from dba_registry;

Now set the CLUSTER_DATABASE parameter to TRUE and start the database:

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">SQL&gt;alter system set cluster_database=true scope=spfile;
SQL&gt;Shutdown immediate;
[oracle@blrraclnx1 ~]$ srvctl start database -d orcl
[oracle@blrraclnx1 ~]$ srvctl start service -d orcl</span>

To upgrade DBConsole, run the following command:

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">emca -upgrade db -cluster
</span>

This completes the upgrade process.

Verification of CRS Integrity Was Unsuccessful

While going through the routine checks from Grid Control, I found a critical alert stating “clusterware integrity check failed”. Clicking on the message showed that there was a problem with some metric collections in the RAC environment.

To check node reachability, the following command was run:

$ $CRS_HOME/bin/cluvfy comp nodecon -n all

This checks the internode connectivity for all nodes in the cluster. It came back with the following message:

$ $CRS_HOME/bin/cluvfy comp nodecon -n all
Verifying node connectivity
Verification of node connectivity was unsuccessful on all the nodes.

The CRS component check was unsuccessful as well:

$ $CRS_HOME/bin/cluvfy comp crs -n all

It came out with the following message:

$ $CRS_HOME/bin/cluvfy comp crs -n all
Verifying CRS integrity
Verification of CRS integrity was unsuccessful on all the nodes.

The obvious next step was to check the CRS status:

$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
$ crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....11.inst application    ONLINE    ONLINE    rac1
ora....12.inst application    ONLINE    ONLINE    rac2
ora....vice.cs application    ONLINE    ONLINE    rac2
ora....l1.srv  application    ONLINE    ONLINE    rac1
ora....l1.srv  application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....DC.lsnr application    ONLINE    ONLINE    rac1
ora....idc.gsd application    ONLINE    ONLINE    rac1
ora....idc.ons application    ONLINE    ONLINE    rac1
ora....idc.vip application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora....dc2.gsd application    ONLINE    ONLINE    rac2
ora....dc2.ons application    ONLINE    ONLINE    rac2
ora....dc2.vip application    ONLINE    ONLINE    rac2
$ $CRS_HOME/bin/olsnodes
rac1
rac2

This confirmed that the CRS install was valid, but the question now was why the Cluster Verification Utility (CVU) was failing.

To find the reason, I enabled CVU tracing:

$ export SRVM_TRACE=true

This sets the environment variable SRVM_TRACE to true, and CVU will then generate a trace file under $CRS_HOME/cv/log with a name like “cvutrace.log.X”.

After setting this and running $CRS_HOME/bin/cluvfy comp crs -n all again, a trace file named cvutrace.log.0 was generated.

It contained a message like:

<strong>"ksh: CVU_10.2.0.2_dba/exectask.sh: cannot execute"</strong>

Now it is clear that Oracle is not able to execute exectask.sh. Checking the permissions and ownership of exectask.sh:

$ cd $CRS_HOME/cv/remenv
$ ls -ltr
-rw-r--r--  1 oracle dba    184 Jan  9  2008 exectask.sh
-rw-r--r--  1 oracle dba 268386 Jan  9  2008 exectask

The permissions of these two files had been changed. After changing the permissions back to 755, CVU showed correct results.

$ chmod 755 exectask*

How the permissions of these files got changed has still not been discovered.
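
To catch this kind of drift earlier, the execute bits can be checked periodically. The following Python sketch uses only the standard library and assumes a POSIX file system. The function names are hypothetical, and the demo runs against a scratch directory rather than a real $CRS_HOME/cv/remenv.

```python
import os
import stat
import tempfile


def missing_execute_bits(path):
    """Return True if owner, group or other lacks execute permission on path."""
    mode = os.stat(path).st_mode
    required = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return (mode & required) != required


def non_executable(directory, names):
    """Return the names under directory that are missing an execute bit."""
    return [n for n in names if missing_execute_bits(os.path.join(directory, n))]


# Demo on a scratch directory that mimics the two files seen above.
with tempfile.TemporaryDirectory() as scratch:
    for name in ("exectask", "exectask.sh"):
        path = os.path.join(scratch, name)
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(path, 0o644)  # same mode as in the ls -ltr listing
    print(non_executable(scratch, ["exectask", "exectask.sh"]))
    os.chmod(os.path.join(scratch, "exectask.sh"), 0o755)  # the fix applied above
    print(non_executable(scratch, ["exectask", "exectask.sh"]))
```

Pointing non_executable at the real directory with the file names from the listing would report any file that has lost its 755 permissions.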