CRS-4640 Error on Starting 11gR2 Clusterware

I was working on an issue where Clusterware would not come up because the private interface was down. The following error in ocssd.log pointed to the private interconnect: node 2 (testrac2) still had a disk heartbeat but no network heartbeat.

2011-08-31 15:03:38.051: [ CSSD][1090451776]clssnmvDHBValidateNCopy: node 2, testrac2, has a disk HB, but no network HB, DHB has rcfg 205815745, wrtcnt, 4418998, LATS 4634324, lastSeqNo 4418997, uniqueness 1314797539, timestamp 1314803017/4632384
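A quick way to spot this condition is to scan ocssd.log for the "has a disk HB, but no network HB" message and list the affected nodes. The following is a minimal sketch; the function name `nodes_missing_network_hb` is hypothetical, and the log path must be supplied by the caller since the post does not state it.

```shell
#!/bin/sh
# Hypothetical helper: print "<node number> <node name>" for every node that
# ocssd.log reports as having a disk heartbeat but no network heartbeat.
# $1 is the path to ocssd.log (not given in the post, so passed in).
nodes_missing_network_hb() {
    log_file=$1
    # Keep only the heartbeat complaints, then extract node number and name.
    grep 'has a disk HB, but no network HB' "$log_file" |
        sed -n 's/.*clssnmvDHBValidateNCopy: node \([0-9][0-9]*\), \([^,]*\), has a disk HB.*/\1 \2/p' |
        sort -u
}
```

Running `nodes_missing_network_hb /path/to/ocssd.log` prints one line per affected node. An empty result only means this particular message was not logged.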

Checking the CRS status showed that OHASD was up and running, but CRS, CSSD and EVMD were not.

[root@testrac1 cssd]# /oragrid/product/11.2/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
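The message codes can be turned into a short per-daemon summary, which is handy when checking several nodes. This is only a sketch: the function name `summarize_crs_check` is hypothetical, and it recognizes only the seven message codes that appear in this post; any other line is ignored.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl check crs` output on stdin and print one
# "<daemon> <state>" pair per recognized message code.
summarize_crs_check() {
    while IFS= read -r line; do
        case $line in
            CRS-4638:*) echo "ohasd online" ;;
            CRS-4537:*) echo "crsd online" ;;
            CRS-4535:*) echo "crsd unreachable" ;;
            CRS-4529:*) echo "cssd online" ;;
            CRS-4530:*) echo "cssd unreachable" ;;
            CRS-4533:*) echo "evmd online" ;;
            CRS-4534:*) echo "evmd unreachable" ;;
        esac
    done
}
```

Usage would be along the lines of `/oragrid/product/11.2/bin/crsctl check crs | summarize_crs_check`.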

After fixing the interface issue, we tried to start CRS with the 'crsctl start crs' command, and it failed with the following errors:

[root@testrac1 cssd]# /oragrid/product/11.2/bin/crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

CRS-4640 is reported because OHASD is already running. In 11.2, OHASD is supposed to start the other dependent processes, but here it was up while they were not.

The 'crsctl stop crs' command failed as well:

[root@testrac1 cssd]# /oragrid/product/11.2/bin/crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.

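Since OHASD was already running, I tried 'crsctl start cluster' (this command requires OHASD to be up), and it brought the remaining daemons up. The run still ends with CRS-4000, apparently because of the CRS-5702 "already running" messages, but the CRS-2676 lines show each resource starting successfully: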

[root@testrac1 cssd]# /oragrid/product/11.2/bin/crsctl start cluster
CRS-2672: Attempting to start 'ora.cssd' on 'testrac1'
CRS-2676: Start of 'ora.cssd' on 'testrac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'testrac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'testrac1'
CRS-2676: Start of 'ora.ctssd' on 'testrac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'testrac1'
CRS-2672: Attempting to start 'ora.evmd' on 'testrac1'
CRS-2676: Start of 'ora.crsd' on 'testrac1' succeeded
CRS-5702: Resource 'ora.crsd' is already running on 'testrac1'
CRS-2676: Start of 'ora.evmd' on 'testrac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'testrac1' succeeded
CRS-5702: Resource 'ora.cluster_interconnect.haip' is already running on 'testrac1'
CRS-4000: Command Start failed, or completed with errors.
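Because the closing CRS-4000 can make this run look like a failure, it may help to list only the resources that reported a successful start. The sketch below filters the CRS-2676 lines from saved command output; the function name `started_resources` is hypothetical, and the pattern is based solely on the message format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl start cluster` output on stdin and print
# "<resource> <node>" for every CRS-2676 "Start of ... succeeded" line.
started_resources() {
    sed -n "s/^CRS-2676: Start of '\([^']*\)' on '\([^']*\)' succeeded\$/\1 \2/p"
}
```

For the output above this would list ora.cssd, ora.ctssd, ora.crsd, ora.evmd and ora.cluster_interconnect.haip, each on testrac1.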

[root@testrac1 ~]# /oragrid/product/11.2/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Ideally, 'crsctl start crs' should be used to start the Clusterware components. If they fail to come up because of some issue (e.g. an inaccessible voting disk or an interface problem) and you end up with OHASD running on its own, you can fix the underlying issue and then use 'crsctl start cluster' to start the remaining Clusterware processes. I believe 'crsctl stop crs -f' can also be used, though I didn't try it for this issue.
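The decision described above can be sketched as a small helper that looks at `crsctl check crs` output and prints which start command applies. It only prints a suggestion and never runs anything. The function name `suggest_start_command` is hypothetical, and the logic relies only on the message codes seen in this post; fixing the underlying problem first is still up to you.

```shell
#!/bin/sh
# Hypothetical helper: $1 holds the text printed by `crsctl check crs`.
# Prints the start command suggested by this post, or "none" when every
# daemon already reports online.
suggest_start_command() {
    check_output=$1
    case $check_output in
        *CRS-4638:*) ;;                        # OHASD reports online
        *) echo "crsctl start crs"; return ;;  # OHASD not reported online
    esac
    case $check_output in
        # OHASD is up, but CRSD, CSSD or EVMD cannot be reached.
        *CRS-4535:*|*CRS-4530:*|*CRS-4534:*) echo "crsctl start cluster" ;;
        *) echo "none" ;;
    esac
}
```

A call might look like `suggest_start_command "$(/oragrid/product/11.2/bin/crsctl check crs)"`.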

Amit Bansal

