cluvfy

11gR2: CVU needs SCAN Listeners’ Manual Startup During Grid-Infrastructure Configuration

During configuration phase of Grid Infrastructure for cluster, CVU failed while performing post-checks.
Following message is displaced in the Installation Log file:

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">Checking Single Client Access Name (SCAN)...</span></span>

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">WARNING:
<strong>PRVF-5056</strong> : Scan Listener "LISTENER" not running</span></span>

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">Checking name resolution setup for "scan-test.abc.com"...</span></span>

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">Verification of SCAN VIP and Listener setup failed</span></span>

The LISTENER from Grid Infrastructure home is running fine:

$ ps -eaf | grep tns

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">/u01/app/grid_11.2/GridInfra/bin/tnslsnr LISTENER -inherit</span></span>

Same error is observerd by manually runnig CVU as:

$ cluvfy comp scan

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">Verifying scan
Checking Single Client Access Name (SCAN)...</span></span>

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">WARNING:
<strong>PRVF-5056 </strong>: Scan Listener "LISTENER" not running</span></span>

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">Checking name resolution setup for "scan-test.abc.com"...
Verification of SCAN VIP and Listener setup failed
Verification of scan was unsuccessful on all the specified nodes.</span></span>

Checking the status of SCAN using SRVCTL gives correct results:

$ srvctl config scan

SCAN name: scan-test.abc.com, Network: 1/165.101.124.0/255.255.254.0/eth0
<span style="font-size: small;">SCAN VIP name: scan1, IP: /scan-test.abc.com/165.101.125.204
SCAN VIP name: scan2, IP: /scan-test.abc.com/165.101.125.235
SCAN VIP name: scan3, IP: /scan-test.abc.com/165.101.125.180</span>

Which shows that the SCAN is configured properly.

When I tried to start the SCAN Listener manually as:

$ srvctl start scan_listener

It showed the following results:

node1:
====

<span style="font-size: small;">oracle 379 1 0 07:53 ? 00:00:00 /u01/app/grid_11.2/GridInfra/bin/tnslsnr LISTENER_SCAN1 -inherit
oracle 401 1 0 07:53 ? 00:00:00 /u01/app/grid_11.2/GridInfra/bin/tnslsnr LISTENER_SCAN3 -inherit</span>

node2:
====

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">oracle 11726 1 0 07:53 ? 00:00:00 /u01/app/grid_11.2/GridInfra/bin/tnslsnr LISTENER_SCAN2 -inherit</span></span>

$ srvctl config scan_listener

<span style="font-size: small;">SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN2 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN3 exists. Port: TCP:1521</span>

After this, I clicked at the retry button for the CVU post checks and this time it succeeded.
Though it worked fine after this, but still not sure why I have to manually start the SCAN-Listeners !!!!!

Cheers!!!!

Cluvfy Reports CRS not Installed on Nodes

Before I start my post, I would like to wish all our readers a “Happy New Year”.

While reviewing a 10gR2 RAC configuration I faced following errors on invoking cluvfy utility

$ ./cluvfy stage -post crsinst -n prod01,prod02 -verbose

Performing post-checks for cluster services setup 

Checking node reachability...

Check: Node reachability from node "prod01-fe"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  prod01                              yes
  prod02                              yes
Result: Node reachability check passed from node "prod01-fe".

Checking user equivalence...

Check: User equivalence for user "oracle"
  Node Name                             Comment
  ------------------------------------  ------------------------
  prod02                              passed
  prod01                              passed
Result: User equivalence check passed for user "oracle".

ERROR:
CRS is not installed on any of the nodes.
Verification cannot proceed.

Post-check for cluster services setup was unsuccessful on all the nodes.

Cluvfy has reported that Clusterware has not been installed on the server. But this was strange and unexpected as one RAC Database was already running on these nodes and crs_stat reported the status of all CRS resources. ocrcheck also did not report any problems with the OCR. To dig further I checked Cluvfy logs located under $ORA_CRS_HOME/cv/log. It had recorded following errors

[main] [15:52:41:800] [OUIData.readInventoryData:393]  ==== CRS home added: Oracle home properties:
Name     : OraCRS
Type     : CRS-HOME
Location : /app/oracle/product/10.2/crs
Node list: [prod01-fe, prod02-fe]
; Thu Dec 24 15:52:41 GMT+08:00 2009
[main] [15:52:41:800] [OUIData.readInventoryData:401]  ==== ORACLE home added: Oracle home properties:
Name     : OraHome
Type     : ORACLE-HOME
Location : /app/oracle/product/10.2/db_1
Node list: [prod01, prod02]
; Thu Dec 24 15:52:41 GMT+08:00 2009
[main] [15:52:41:800] [VerificationUtil.isCRSInstalled:1262]  CRS wasn't found installed  on node: prod02; Thu Dec 24 15:52:41
 GMT+08:00 2009
[main] [15:52:41:800] [VerificationUtil.isCRSInstalled:1262]  CRS wasn't found installed  on node: prod01; Thu Dec 24 15:52:41
 GMT+08:00 2009

You can notice that CRS_HOME has recorded Node list as [prod01-fe, prod02-fe] and for DB_HOME it is [prod01, prod02]. Actual hostname for the nodes are prod01 and prod02. prod01-fe and prod02-fe were aliases for the two nodes.

Cluvfy uses Central Oracle Inventory to obtain ORACLE_HOME information. Checking the $ORACLE_BASE/oraInventory/ContentsXML/inventory.xml confirmed that the nodes corresponding to CRS_HOME were stored as prod01-fe and prod02-fe ( I have changed the Angled brackets to Round brackets as html considers them as tags. Don’t worry your inventory file is not corrupted 🙂 )

(INVENTORY)
(VERSION_INFO)
   (SAVED_WITH)10.2.0.1.0(/SAVED_WITH)
   (MINIMUM_VER)2.1.0.6.0(/MINIMUM_VER)
(/VERSION_INFO)
(HOME_LIST)
(HOME NAME="OraCRS" LOC="/app/oracle/product/10.2/crs" TYPE="O" IDX="1" CRS="true")
   (NODE_LIST)
      (NODE NAME="prod01-fe"/)
      (NODE NAME="prod02-fe"/)
   (/NODE_LIST)
(/HOME)
(HOME NAME="OraHome" LOC="/app/oracle/product/10.2/db_1" TYPE="O" IDX="2")
   (NODE_LIST)
      (NODE NAME="prod01"/)
      (NODE NAME="prod02"/)
   (/NODE_LIST)
(/HOME)

There are few reported issues on metalink but they relate to CRS=”true” not being present for CRS_HOME entry. There is one more reported issue listed here by Surachart which was caused by incorrect permissions on /etc/oraInst.loc file.

To correct this problem, we had to run runInstaller -updateNodelist command to update the correct nodes in file. Even though some metalink notes recommend to change this file manually,  I would recommend using runInstaller command for updating inventory.
Before executing the command, I ran olsnodes command to confirm that OCR stored prod01,prod02 in its repository.

runInstaller -updateNodeList -silent "CLUSTER_NODES={prod01,prod02}" ORACLE_HOME="/app/oracle/product/10.2/crs" ORACLE_HOME_NAME="OraCRS" LOCAL_NODE="prod01" CRS=true
Starting Oracle Universal Installer...

No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
The inventory pointer is located at /var/opt/oracle/oraInst.loc
The inventory is located at /app/oracle/oraInventory
'UpdateNodeList' was successful.

Now the cluvfy worked fine. I do not have any reasoning for why the incorrect nodes were recorded in Oracle Inventory, but above solution should take care of this.

Verification of CRS Integrity Was Unsuccessful

While going through the routine checks from Grid Control, I found a critical alert stating “clusterware integrity check failed” and by clicking on this message it says that there is problem with some metric collections on RAC environment.

To check the node reachability status following query was run:

$ $CRS_HOME/bin/cluvfy comp nodecon -n all

This will check the internode connectivity for all nodes in the cluster. It came out with following message:

$ $CRS_HOME/bin/cluvfy comp nodecon -n all
Verifying node connectivity
Verification of node connectivity was unsuccessful on all the nodes.

Even the CRS component check was unsuccessful:

$ $CRS_HOME/bin/cluvfy comp crs -n all

It came out with the following message:

$ $CRS_HOME/bin/cluvfy comp crs -n all
Verifying CRS integrity
Verification of CRS integrity was unsuccessful on all the nodes.

After this it was quite obvious to check the CRS status:

$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
$crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....11.inst application    ONLINE    ONLINE    rac1
ora....12.inst application    ONLINE    ONLINE    rac2
ora....vice.cs application    ONLINE    ONLINE    rac2
ora....l1.srv application    ONLINE    ONLINE    rac1
ora....l1.srv application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....DC.lsnr application    ONLINE    ONLINE    rac1
ora....idc.gsd application    ONLINE    ONLINE    rac1
ora....idc.ons application    ONLINE    ONLINE    rac1
ora....idc.vip application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora....dc2.gsd application    ONLINE    ONLINE    rac2
ora....dc2.ons application    ONLINE    ONLINE    rac2
ora....dc2.vip application    ONLINE    ONLINE    rac2
$$CRS_HOME/bin/olsnodes
rac1
rac2

This confirmed that the CRS install is valid, but the question now is why the cluster verification utility (CVU) was failing?

To find the reason I enabled the tracing of CVU as:

$export SRVM_TRACE=true

It will set the environment variable SRVM_TRACE to true and tracing of CVU will generate a trace file under $CRS_HOME/cv/log with name like “cvutrace.log.X”

After setting this and again running $CRS_HOME/bin/cluvfy comp crs -n all trace file with name cvutrace.log.0 was generated.

And a message in cvutrace.log like

<strong>"ksh: CVU_10.2.0.2_dba/exectask.sh: cannot execute"</strong>

Now its is clear that oracle is not able to execute exectask.sh and cheking the permission and ownership of exectask.sh:

$CRS_HOME/cv/remenv
ls -ltr
-rw-r--r--  1 oracle dba    184 Jan  9  2008 exectask.sh
-rw-r--r--  1 oracle dba 268386 Jan  9  2008 exectask

The permission of these two files was changed. After changing the permission back to 755 CUV was showing correct results.

$chmod 755 exectask*

It is still not discovered how the permission of these files got changed.