While going through routine checks in Grid Control, I found a critical alert stating "clusterware integrity check failed"; clicking the message revealed a problem with some metric collections in the RAC environment.
To check node reachability, the following command was run:
$ $CRS_HOME/bin/cluvfy comp nodecon -n all
This checks the internode connectivity for all nodes in the cluster. It returned the following message:
$ $CRS_HOME/bin/cluvfy comp nodecon -n all
Verifying node connectivity
Verification of node connectivity was unsuccessful on all the nodes.
Even the CRS component check was unsuccessful:
$ $CRS_HOME/bin/cluvfy comp crs -n all
It returned the following message:
$ $CRS_HOME/bin/cluvfy comp crs -n all
Verifying CRS integrity
Verification of CRS integrity was unsuccessful on all the nodes.
After this, the obvious next step was to check the CRS status:
$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....11.inst application    ONLINE    ONLINE    rac1
ora....12.inst application    ONLINE    ONLINE    rac2
ora....vice.cs application    ONLINE    ONLINE    rac2
ora....l1.srv  application    ONLINE    ONLINE    rac1
ora....l1.srv  application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....DC.lsnr application    ONLINE    ONLINE    rac1
ora....idc.gsd application    ONLINE    ONLINE    rac1
ora....idc.ons application    ONLINE    ONLINE    rac1
ora....idc.vip application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora....dc2.gsd application    ONLINE    ONLINE    rac2
ora....dc2.ons application    ONLINE    ONLINE    rac2
ora....dc2.vip application    ONLINE    ONLINE    rac2
$ $CRS_HOME/bin/olsnodes
rac1
rac2
This confirmed that the CRS installation was valid, so the question became: why was the Cluster Verification Utility (CVU) failing?
To find the reason, I enabled CVU tracing:
$ export SRVM_TRACE=true
With the environment variable SRVM_TRACE set to true, CVU writes a trace file under $CRS_HOME/cv/log with a name like "cvutrace.log.X".
After setting this and running $CRS_HOME/bin/cluvfy comp crs -n all again, a trace file named cvutrace.log.0 was generated.
It contained a message like:
"ksh: CVU_10.2.0.2_dba/exectask.sh: cannot execute"
It was now clear that Oracle was unable to execute exectask.sh, so I checked the permissions and ownership of the file:
$ cd $CRS_HOME/cv/remenv
$ ls -ltr
-rw-r--r-- 1 oracle dba    184 Jan  9  2008 exectask.sh
-rw-r--r-- 1 oracle dba 268386 Jan  9  2008 exectask
The permissions of these two files had been changed: the execute bit was missing. After setting them back to 755, CVU showed the correct results:
$ chmod 755 exectask*
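If other files under the remote-execution directory have lost their execute bit as well, they can be found and fixed in one pass. A sketch, assuming the standard $CRS_HOME/cv/remenv location:

```shell
# List regular files under cv/remenv that are missing the owner
# execute bit, and restore 755 on each (assumes $CRS_HOME is set).
find "$CRS_HOME/cv/remenv" -type f ! -perm -u+x -print -exec chmod 755 {} \;
```

Running it with -print but without -exec first is a safe way to preview what would be changed.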
How the permissions of these files got changed has still not been discovered.
In our case, the permissions of files “exectask” and “exectask.sh” in directory “$ORA_CRS_HOME/cv/remenv” got changed after applying patch bundle 2. I believe there’s a Metalink note on this as well.
Besides that, I've since found that ssh banners will also cause "cluvfy" to return an unsuccessful verification status; the CVU trace log shows the ssh banner message(s) as errors.
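One quick way to check for this on each node is to look for an active Banner directive in the sshd configuration. A sketch; the config path is the common default and may differ per platform (note that "Banner none" explicitly disables the banner):

```shell
#!/bin/sh
# Warn if sshd is configured to print a login banner, which can make
# cluvfy report an unsuccessful verification (the banner text shows up
# as errors in the CVU trace). Usage: check_banner.sh [sshd_config path]
config="${1:-/etc/ssh/sshd_config}"
if awk '$1 == "Banner" && $2 != "none" { found = 1 } END { exit !found }' "$config"; then
    echo "ssh banner configured in $config -- may break cluvfy"
else
    echo "no active ssh banner in $config"
fi
```

The awk test treats commented lines (where the first field starts with "#") and "Banner none" as harmless, and flags only an uncommented banner path.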
Adrian,
Thanks for your comment.
As far as I know, the patch bundle has not been applied at our site, and the change of permissions is still a mystery for us.
We have to wait for next occurrence of this issue.
Cheers!!!
Saurabh Sood
Hey Saurabh,
Useful stuff!
Regards,
Theetha
Thanks Theetha!!!
Regards,
Saurabh