RAC

Creating ASM devices on AIX

I thought of sharing few tips on creating ASM devices on AIX which I will be helpful to Oracle DBA’s. Suppose SysAdmin gives you list of Serial numbers for LUN instead of device Name

pcmpath query device
DEV#:  33  DEVICE NAME: hdisk33  TYPE: 2107900  ALGORITHM:  Load Balance
SERIAL: 75DM011<span style="color: #ff0000;"><strong>1101</strong></span>
===========================================================================
Path#      Adapter/Path Name          State     Mode     Select     Errors
    0           fscsi0/path0          CLOSE   NORMAL          9          0
    1           fscsi1/path1          CLOSE   NORMAL          8          0

In case there are lot many disks, then it could be a tiring task of running above command and finding each device. You can use below code which will list name of  devices and size (In MB) of disk.

for i in 1000 1100    1018    1118    1030    1130    104C    114C    1068    1168    1080    1180
do
j=`pcmpath query device|grep -p $i"$"|grep DEVICE|awk -F ":" '{print }'|awk '{print }`
k=`bootinfo -s $j`
echo $i $j $k
done

This would return following output

1000 hdisk4 65536
1100 hdisk10 65536
1018 hdisk5 65536
1118 hdisk11 65536
1030 hdisk6 65536
1130 hdisk12 65536
104C hdisk7 65536
114C hdisk13 65536
1068 hdisk8 65536
1168 hdisk14 65536
1080 hdisk9 65536
1180 hdisk15 65536

Now if you need to create new device name, you need to use mknod command and pass on major and minor numbers. Following code can be used to perform same

#export m=0
 # for i in hdisk4  hdisk10 hdisk5  hdisk11 hdisk6  hdisk12 hdisk7  hdisk13 hdisk8  hdisk14 hdisk9  hdisk15
 do
 j=`ls -la /dev/$i |awk '{print }'|awk -F "," '{print }'`
  k=`ls -la /dev/$i |awk '{print }'`
 m=`expr $m + 1` ;echo "mknod /dev/asm_disk"$m "c "$j $k
 done

 mknod /dev/asm_disk1 c 21 4
 mknod /dev/asm_disk2 c 21 12
 mknod /dev/asm_disk3 c 21 13
 mknod /dev/asm_disk4 c 21 15
 mknod /dev/asm_disk5 c 21 5
 mknod /dev/asm_disk6 c 21 6
 mknod /dev/asm_disk7 c 21 8
 mknod /dev/asm_disk8 c 21 7
 mknod /dev/asm_disk9 c 21 14
 mknod /dev/asm_disk10 c 21 10
 mknod /dev/asm_disk11 c 21 9
 mknod /dev/asm_disk12 c 21 11

Now you can change the ownership to oracle:dba and permission to 660. I have 12 disks , so using list of 12 variables. In case you have more disks , then you can add more variables

# for i in 1 2 3 4 5 6 7 8 9 10 11 12
do
chown oracle:dba /dev/asm_disk$i
chmod 660 /dev/asm_disk$i
done

crw-rw----    1 oracle   dba          21, 11 Jan 28 17:10 /dev/asm_disk12
crw-rw----    1 oracle   dba          21,  9 Jan 28 17:10 /dev/asm_disk11
crw-rw----    1 oracle   dba          21, 10 Jan 28 17:10 /dev/asm_disk10
crw-rw----    1 oracle   dba          21, 14 Jan 28 17:04 /dev/asm_disk9
crw-rw----    1 oracle   dba          21,  7 Jan 28 17:04 /dev/asm_disk8
crw-rw----    1 oracle   dba          21,  8 Jan 28 17:04 /dev/asm_disk7
crw-rw----    1 oracle   dba          21,  6 Jan 28 17:04 /dev/asm_disk6
crw-rw----    1 oracle   dba          21,  5 Jan 28 17:04 /dev/asm_disk5
crw-rw----    1 oracle   dba          21, 15 Jan 28 17:04 /dev/asm_disk4
crw-rw----    1 oracle   dba          21, 13 Jan 28 17:04 /dev/asm_disk3
crw-rw----    1 oracle   dba          21, 12 Jan 28 17:04 /dev/asm_disk2
crw-rw----    1 oracle   dba          21,  4 Jan 28 17:04 /dev/asm_disk1

In case you need to use same logic for creating OCR and Voting disks on RAC system, replace /dev/asm with /dev/ocr or /dev/voting . I hope this would save some time and also prevent errors 🙂

In case you have disks in ordered number,say 53 to 62 then you can also use for loop as below.

#bash
bash-3.00#
#export m=0
#for ((i=53;i<=62;i++))
do
 j=`ls -la /dev/hdisk$i |awk '{print }'|awk -F "," '{print }'`
  k=`ls -la /dev/hdisk$i |awk '{print }'`
   m=`expr $m + 1` ;echo "mknod /dev/asm_disk"$m "c "$j $k
 done
#for ((i=1;i<=10;i++))
do
chown oracle:dba /dev/asm_disk$i
chmod 660 /dev/asm_disk$i
done

I would suggest anyone using the scripts to first check in a test environment.

CRS Fails to Start – 10.2.0.1 RAC Install on AIX

I was installing 10.2.0.1 on IBM AIX 5L and while running root.sh from first node (as part of Clusterware installation) got following messages

Now formatting voting device: /dev/voting_disk01
Now formatting voting device: /dev/voting_disk02
Now formatting voting device: /dev/voting_disk03
Format of 3 voting devices complete.
Startup will be queued to init within 30 seconds

I waited for quite some time and found that it was stuck. To check what was status of CSS, I did a grep for CSS and found that it was running /etc/init.cssd startcheck css script. This indicated that Oracle was stuck trying to start CSS. Following errors were recorded in /tmp/crsct.7459

Failure in CSS initialization opening OCR.

Metalink notes suggested checking OCR Disk permission , though in my case they had correct permissions i.e ownership as oracle:dba and permission set to 660. To diagnose further, I checked $ORA_CRS_HOME/log to check for errors. All the logfiles related to CRS,CSS and EVMD are stored in $ORA_CRS_HOME/log/<hostname>.

/oracle/crs_base/app/product/crs10gR2/log>ls -ltr
total 0
drwxrwx---    2 oracle   dba             256 Jan 28 18:46 crs
drwx------    3 root     system          256 Jan 28 18:53 chd0196
drwxr-xr-t    8 root     dba             256 Jan 28 18:53 rac01

Hostname for the server was rac01 and not chd0196. This was a new server and also directories could not be present earlier as it was a fresh installation.  Oracle was picking two hostname which was quite strange. I checked for HACMP filesets  and found that they were present

# lslpp -l |grep -i hacmp
  rsct.basic.hacmp           2.4.9.0  COMMITTED  RSCT Basic Function (HACMP/ES
  rsct.compat.basic.hacmp    2.4.9.0  COMMITTED  RSCT Event Management Basic
                                                 Function (HACMP/ES Support)
  rsct.compat.clients.hacmp  2.4.9.0  COMMITTED  RSCT Event Management Client
                                                 Function (HACMP/ES Support)

10g RAC does not require Vendor clusterware as Oracle provides it own clusterware called “Oracle Clusterware”.We got these packages un-installed and got both server rebooted. After cleaning up RAC installation, we restarted installation . You can use  Metalink Note 239998.1 – 10g RAC: How to Clean Up After a Failed CRS Install for cleanup.  On re-running root.sh installation, installation went fine.

Creating Oracle Extended RAC on Oracle VM

Yesterday, I found one very useful article at OTN “Creating Oracle Extended RAC” on completely virtual environment using Oracle VM. As Virtualization is becoming popular day by day and is very cost effective, one must know how to use this to simulate actual environments. Click  here for details on Oracle Extended RAC on Oracle VM.

10g RAC Tuning :Useful Link

I came across useful link which contains 10g RAC Tuning tips from Joel Goodman (Oracle). It discusses RAC Instance Recovery concepts and key things to look out for while tuning RAC (Real Application Clusters) Databases. Must watch for anyone who is learning RAC.

You can find the recorded webcast here. You can also become member of oracleracsig to find many such webcasts/resources on RAC.

How To Recover From Corrupted OCR Disk

It is very common where a DBA is left with corrupted OCR disk without having any good backup.
The same situation was experienced by me few days back. One node of RAC database shows the following:

NODE1:

<span style="font-family: arial,helvetica,sans-serif;"><strong>$ORA_CRS_HOME/bin/crs_stat -t
</strong>Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....11.inst application    ONLINE    ONLINE    rac1
ora....12.inst application    ONLINE    OFFLINE
ora....vice.cs application    OFFLINE   OFFLINE
ora....l11.srv application    ONLINE    OFFLINE
ora....l12.srv application    ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....DC.lsnr application    ONLINE    ONLINE    rac1
ora....abc.gsd application    ONLINE    ONLINE    rac1
ora....abc.ons application    ONLINE    ONLINE    rac1
ora....abc.vip application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora....bc2.gsd application    ONLINE    ONLINE    rac2
ora....bc2.ons application    ONLINE    ONLINE    rac2
ora....bc2.vip application    ONLINE    ONLINE    rac2</span>

The other node shows the following:
NODE2:

<span style="font-family: arial,helvetica,sans-serif;"><strong>/crs_stat -t</strong>
HA Resource                                   Target     State
-----------                                   ------     -----
ora.orcl.db                                   OFFLINE    OFFLINE
ora.orcl.orcl11.inst                          OFFLINE    OFFLINE
ora.orcl.orcl12.inst                          OFFLINE    OFFLINE
ora.orcl.test_service.cs                      ONLINE     OFFLINE
ora.orcl.test_service.orcl11.srv              OFFLINE    OFFLINE
ora.orcl.test_service.orcl12.srv              OFFLINE    OFFLINE
ora.rac1 .ASM1.asm                         OFFLINE    OFFLINE
ora.rac1 .LISTENER_RAC1 .lsnr           OFFLINE    OFFLINE
ora.rac1 .gsd                              OFFLINE    OFFLINE
ora.rac1 .ons                              OFFLINE    OFFLINE
ora.rac1 .vip                              OFFLINE    OFFLINE
ora.rac2.ASM2.asm                        OFFLINE    OFFLINE
ora.rac2.LISTENER_RAC2 2.lsnr         ONLINE     OFFLINE
ora.rac2.gsd                             ONLINE     OFFLINE
ora.rac2.ons                             ONLINE     OFFLINE
ora.rac2.vip                             ONLINE     OFFLINE</span>

We can see the inconsistent data across two node RAC. Every command for srvctl, crsctl was hanging on NODE 2.
Now the option is to restore the OCR backup, but if there is no backup available for OCR then we can use the following procedure to recover from corrupted OCR disk
(There will be complete downtime needed to perform these operations)


1. Check the status of CRS from node 1:

# ps -eaf |grep d.bin
root 12873 1 0 Aug11 ? 00:11:07 /u01/app/crs/bin/crsd.bin reboot
oracle 13105 12846 0 Aug11 ? 00:00:45 /u01/app/crs/bin/evmd.bin
oracle 13226 13200 0 Aug11 ? 00:13:13 /u01/app/crs/bin/ocssd.bin
root 21458 19986 0 20:34 pts/4 00:00:00 grep d.bin

2. Shutdown Oracle ClusterWare on all nodes:

<span style="font-family: arial,helvetica,sans-serif;">[root@rac1  bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.</span>

Check the status again:

[root@rac1 bin]# ps -eaf |grep d.bin
root 21927 19986 0 20:34 pts/4 00:00:00 grep d.bin

It shows that the cluster is stopped.

3. Execute rootdelete.sh from all nodes.

It is under directory $ORA_CRS_HOME/install/rootdelete.sh

NODE1:

<span style="font-family: arial,helvetica,sans-serif;">[root@rac1  install]# <strong>./rootdelete.sh</strong>
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'</span>

NODE 2:

./rootdelete.sh</strong>
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'</span>

“OCR initialization failed accessing OCR device”, this error can occur due to folloing reasons:
1. ocrconfig_loc is not pointing to the correct ocr.
2. Problem of rights and owners on the ocr devices
3. Configuration problem on Oracle Cluster Synchronization Services

As the SCR entries are cleaned up so there is no need to worry about PROC-26 error.

If you have more than 2 nodes in a rac you need to run rootdelete.sh on all the other nodes also.

4. Run rootdeinstall.sh from the node where the RAC installation was done (usually it is the node1).
It will clear up the OCR disk contents.

<span style="font-family: arial,helvetica,sans-serif;">./rootdeinstall.sh</span>

<span style="font-family: arial,helvetica,sans-serif;">Removing contents from OCR device
2560+0 records in
2560+0 records out</span>

5. Run root.sh from the same node:

<span style="font-family: arial,helvetica,sans-serif;">./root.sh
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured</span>

<span style="font-family: arial,helvetica,sans-serif;">Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01' is not owned by root
assigning default hostname rac1  for node 1.
assigning default hostname rac2 2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :</span>

node 1: rac1  rac1-priv rac1
node 2: rac2  rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw1
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
CSS is inactive on these nodes.
rac2 2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.

After its completion run root.sh on all remaining nodes.

<span style="font-family: arial,helvetica,sans-serif;"> ./root.sh
Checking to see if Oracle CRS stack is already configured</span>

<span style="font-family: arial,helvetica,sans-serif;">Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
assigning default hostname rac1  for node 1.
assigning default hostname rac2  for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :</span>

node 1: rac1  rac1-priv rac1
node 2: rac2  rac2-priv rac2
clscfg: Arguments check out successfully.

<span style="font-family: arial,helvetica,sans-serif;">NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
rac2
CSS is active on all nodes.
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.</span>

The silent mode VIPCA configuration will fail because of BUG 4437727 in 10.2.0.1. To solve this run the
VIPCA manually from root user from last node where this error has occured and follow the instructions.
# $ORA_CRS_HOME/bin/vipca

6. Now final step is to add the resources back to OCR with srvctl command.

Adding DATABASE to OCR:

$srvctl add database -d db_unique_name -o oracle_home
[oracle@rac1 ~]$ $ORA_CRS_HOME/bin/srvctl add database -d orcl -o /u01/app/oracle/product/10.2.0/db_1</span>

Adding INSTANCE to OCR:

srvctl add instance -d db_unique_name -i inst_name -n node_name
[oracle@rac1 ~]$ $ORA_CRS_HOME/bin/srvctl add instance -d orcl -i orcl11 -n rac1
[oracle@rac1 ~]$ $ORA_CRS_HOME/bin/srvctl add instance -d orcl -i orcl12 -n rac2 2</span>

Adding SERVICES to OCR:

$srvctl add service -d db_unique_name -s service_name -r preferred_list
[oracle@rac1  ~]$ $ORA_CRS_HOME/bin/srvctl add service -d orcl -s test_service -r orcl11,orcl12</span>

Adding NODEAPPS to OCR:

srvctl add nodeapps -n node_name -o oracle_home -A addr_str
Where addr_str= The node level VIP address
This command needs to be run from ROOT user otherwise you will get following error:

<span style="font-family: arial,helvetica,sans-serif;">[oracle@rac1  ~]$  $ORA_CRS_HOME/bin/srvctl add nodeapps -n rac1  -o /u01/app/oracle/product/10.2.0/db_1 -A 10.167.21.89/255.255.255.0
PRKO-2117 : This command should be executed as the system privilege user.
[oracle@rac1  ~]$
[oracle@rac1  ~]$ su -
Password:
[root@rac1  ~]# cd /u01/app/crs/bin
[root@rac1  bin]# ./srvctl add nodeapps -n rac1  -o /u01/app/oracle/product/10.2.0/db_1 -A 10.167.21.87/255.255.255.0
[root@rac1  bin]#./srvctl add nodeapps -n rac2 2  -o /u01/app/oracle/product/10.2.0/db_1 -A 10.167.21.89/255.255.255.0</span>

This will complete the OCR recreation, now you can test the status with cluvfy.

Issues with CLUSTER_DATABASE parameter

Yesterday, I faced an interesting scenario while upgrading 2 Node RAC Database. I tried setting up CLUSTER_DATABASE=FALSE in spfile from Node 1,but it displayed value as TRUE after restarting database.Same was true for Database startup entries in Alert log. If I did the same setting in Node 2 and started database from node 2, it started in shared mode.I was using shared spfile (on OCFS) for both the systems.

CLUSTER_DATABASE is a Real Application Clusters parameter that specifies whether or not Real Application Clusters is enabled. It is mostly used for starting database in Exclusive mode during operations which will require updating dictionary. e.g Upgrading database,Enabling Archivelog and also for changing Database characterset.

I had ignored this error  few times but today I wanted to find the cause and resolve it. To diagnose further, I set CLUSTER_DATABASE=FALSE from Node 1 with following command

<span style="font-size: small; font-family: courier new,courier;">ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=SPFILE;
</span>

Then I used srvctl to start the database

<span style="font-size: small; font-family: courier new,courier;">[oracle@blrraclnx1 bdump]$ srvctl start database -d orcl
</span>

Node 1 Alert log had CLUSTER_DATABASE=TRUE

<span style="font-size: small; font-family: courier new,courier;">cluster_database         = TRUE
cluster_database_instances= 2
db_create_file_dest      = +DATA
db_recovery_file_dest    = +FRA
db_recovery_file_dest_size= 4294967296
thread                   = 1
instance_number          = 1
</span>

But on checking Alert Log from Node 2, I found CLUSTER_DATABASE=FALSE

<span style="font-size: small; font-family: courier new,courier;">cluster_database         = FALSE
cluster_database_instances= 1
db_create_file_dest      = +DATA
db_recovery_file_dest    = +FRA
db_recovery_file_dest_size= 4294967296
thread                   = 2
instance_number          = 2
</span>

Moreover Instance 2 had crashed with following errors

  <span style="font-size: small; font-family: courier new,courier;">Tue Sep  2 21:10:14 2008
lmon registered with NM - instance id 2 (internal mem no 1)
 Warning: cluster_database_instances (1) is &lt;= my node id (1)
    This instance wants to mount exclusive when instance 0 has mounted shared.  Exiting!
Tue Sep  2 21:10:15 2008
USER: terminating instance due to error 29707
Instance terminated by USER, pid = 2807
</span>

Error says that Instance 0 (Node 1) has already mounted in SHARED Mode and this instance (Node 2) wants to start in Exclusive Mode.

To check the values I decided to use V$SPPARAMETER view. Till now I was using show parameter cluster_database command to check the values.

<span style="font-size: small; font-family: courier new,courier;">SQL&gt; SELECT  SID,NAME,VALUE,DISPLAY_VALUE FROM V$SPPARAMETER WHERE NAME='cluster_database';

SID        NAME                           VALUE           DISPLAY_VALUE
---------- ------------------------------ --------------- ---------------
orcl1      cluster_database               TRUE            FALSE
*          cluster_database               FALSE            FALSE
</span>

This is strange!! I had never specified any specific value for node 1. Anyways to resolve it, I used reset command to clear the orcl1 entry from spfile

<span style="font-size: small; font-family: courier new,courier;">SQL&gt; alter system reset cluster_database sid='orcl1';

System altered.
</span>

Then I re-checked the V$SPPARAMETER view

<span style="font-size: small; font-family: courier new,courier;">SQL&gt; SELECT SID,NAME,VALUE FROM V$SPPARAMETER WHERE NAME='cluster_database';

SID        NAME                VALUE
-------- ------------------ --------
*        cluster_database      FALSE
</span>

Restarting the database next time allowed it to be started in Exclusive mode. Issue was resolved but question was “Why was CLUSTER_DATABASE variable different for Node 1”
I remember that as part of setup I had created Node 1 and added Node 2 later. There could be some missed steps there or it could be that this parameter was set explicitly with sid=’orcl1′ option.I really had no clue on why it was like that.If anyone has experienced this, then do let me know.