Adding a reinstalled/reimaged node back to an 11gR2 Cluster

A node may crash due to OS or hardware issues and need to be reinstalled or reimaged. In such cases, a normal node addition alone will not help, since the OCR still contains references to the original node. We need to remove those references first and then perform the node addition.

This post documents one such use case.

Assumptions:

– Cluster hostnames: node1, node2
– VIPs: node1-v, node2-v
– Voting disk and OCR are on ASM (ASMLIB is used to manage the shared disks)
– After the OS reinstall, user equivalence has been set up and all required packages have been installed, along with the ASMLIB setup (a quick check is sketched below)
– The crashed node was node2
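
As a quick sanity check of those prerequisites, something like the following can be run (a minimal sketch; the oracleasm binary path and disk names depend on the installation).

From node1, as the oracle user (should return without a password prompt):

	ssh node2 hostname

As root on node2 (should list the shared ASMLIB disks):

	/usr/sbin/oracleasm listdisks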

STEPS
———-
1. Clear the OCR entries for the reimaged host. As root on a surviving node, execute:

# crsctl delete node -n node2

To verify the success of the above step, execute "olsnodes" on a surviving node; the reimaged host should not show up in the list.
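
For example, from node1 (illustrative output; node2 should be absent):

	/u01/grid/11.2/bin/olsnodes -n
	node1   1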

2. Remove the VIP information for the reimaged host from the OCR

Execute the following on an existing node:
	/u01/grid/11.2/bin/srvctl remove vip -i node2-v -f
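
To confirm the removal, the VIP configuration can be queried again (a sketch; srvctl should now report that no VIP exists for node2):

	/u01/grid/11.2/bin/srvctl config vip -n node2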

3. Clear the inventory entries for the reimaged host from the GI and DB homes.

From the surviving node, execute:

/u01/grid/11.2/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/grid/11.2 "CLUSTER_NODES=node1" CRS=TRUE -silent -local

Do the same for the Database home as well:

/u01/oracle/product/11.2/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/oracle/product/11.2 CLUSTER_NODES=node1 -silent -local
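
To verify, the central inventory can be checked for leftover node2 entries. The oraInventory location below is an assumption; the actual path is recorded in /etc/oraInst.loc:

	grep -i node2 /u01/oraInventory/ContentsXML/inventory.xml

No output means the inventory no longer references the reimaged host.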

4. Now comes the actual node addition. Run the Cluster Verification Utility:

./cluvfy stage -pre nodeadd -n node2 -verbose

If possible, redirect the output of the above command to a file so that it can be reviewed and any reported issues rectified.
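
For example (the output file name is arbitrary):

	./cluvfy stage -pre nodeadd -n node2 -verbose > /tmp/cluvfy_nodeadd.out 2>&1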

For this case, since the OCR and Voting disk reside on ASM and ASMLIB is in use, the most significant errors were:

ERROR:
PRVF-5449 : Check of Voting Disk location "ORCL:DISK6(ORCL:DISK6)" failed on the following nodes:
node2:No such file or directory

PRVF-5431 : Oracle Cluster Voting Disk configuration check failed

The impact of this error is explained in the subsequent steps.

5. Run “addNode.sh” from existing node.

[oracle@node1] /u01/grid/11.2/oui/bin% ./addNode.sh -silent "CLUSTER_NEW_NODES={node2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={node2-v}"
[oracle@node1] /u01/grid/11.2/oui/bin%

In my case, the above command returned without printing any messages. In fact, addNode.sh did not run at all.

Cause: Since ASMLIB is in use, we had hit the issue discussed in MOS Note 1267569.1. The error seen in step 4 helped in identifying this.

Solution:

Set the following environment variable and run addNode.sh again. (This makes addNode.sh skip its pre-node-addition checks; that is acceptable here since the cluvfy output from step 4 has already been reviewed.)

IGNORE_PREADDNODE_CHECKS=Y
export IGNORE_PREADDNODE_CHECKS

[oracle@node1] /u01/grid/11.2/oui/bin% ./addNode.sh -silent "CLUSTER_NEW_NODES={node2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={node2-v}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 12143 MB

Performing tests to see whether nodes node2 are available
............................................................... 100% Done.

Cluster Node Addition Summary
Global Settings
   Source: /u01/grid/11.2
   New Nodes
Space Requirements
   New Nodes
      node2

Instantiating scripts for add node (Tuesday, December 21, 2010 3:35:16 AM PST)
.                                                                 1% Done.
Instantiation of add node scripts complete

Copying to remote nodes (Tuesday, December 21, 2010 3:35:18 AM PST)
...............................................................................................                                 96% Done.
Home copied to new nodes

Saving inventory on nodes (Tuesday, December 21, 2010 3:37:57 AM PST)
.                                                               100% Done.
Save inventory complete
WARNING:
The following configuration scripts need to be executed as the "root" user in each cluster node.
/u01/grid/11.2/root.sh # On nodes node2
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts in each cluster node

The Cluster Node Addition of /u01/grid/11.2 was successful.
Please check '/tmp/silentInstall.log' for more details.

6. Run root.sh on the reimaged node to bring up the CRS stack.

This completes the Grid Infrastructure setup on the node.
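
To confirm, the stack can be checked on node2 (a sketch using the GI home path from the assumptions above):

	/u01/grid/11.2/bin/crsctl check crs
	/u01/grid/11.2/bin/olsnodes -n

crsctl should report the CRS, CSS and EVM daemons as online, and olsnodes should once again list both node1 and node2.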

7. Run addNode.sh for the DB home (from the existing node):

/u01/oracle/product/11.2/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={node2}"

8. Once the DB home addition is complete, use srvctl to check the status of the registered database and its instances, and add them if required, as sketched below.
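
A minimal sketch, assuming a database named orcl whose second instance orcl2 belongs on node2 (both names are hypothetical):

	/u01/oracle/product/11.2/bin/srvctl config database -d orcl
	/u01/oracle/product/11.2/bin/srvctl status database -d orcl

If the node2 instance is missing from the configuration, add and start it:

	/u01/oracle/product/11.2/bin/srvctl add instance -d orcl -i orcl2 -n node2
	/u01/oracle/product/11.2/bin/srvctl start instance -d orcl -i orcl2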