Saurabh Sood

In-Place Upgrade 11gR2 RAC : 11.2.0.1 To 11.2.0.2

Sharing a post from my friend “Carthik” on 11gR2 RAC in-place upgrade.

Overview
========
Starting oracle 11gR2 the Oracle database and Clusterware upgrades are done via the “Out of place upgrade”. And is the easier way to perform your upgrade.  However, the intent of this blog is to explain how an “In-Place upgrade” of a RAC database is done in 11gR2, the advantages, disadvantages, pre-requisites and pain points involved in this method which is the traditional method of upgrading an oracle database. And I have chosen to upgrade a 2 Node 11.2.0.1 RAC Database to 11.2.0.2 RAC Database.

Clearly, the advantage is that you save space, Instead of installing a new Oracle Home.

The greatest disadvantage is that you need to back up the oracle home and run detach oracle home commands. This could potentially damage your oracle binaries. However, when done carefully it should not cause a problem.

The pain points Include:
1. Backing up the Oracle home
2. Restoring the Oracle Home from backup in-case of installation failure.
3. Attach the restored home, and then bring up the instance.
Clearly, there is a lot of manual intervention, which is a major pain point.

The idea behind using an in-place upgrade is to save space. And this method of upgrade requires a significant amount time. The only time one should use this method is when you lack space on your server. However, you can use this method for your test and development environments to save space. Since this method was the traditional method of doing things, I thought it’s worth checking how it works in 11gR2.

NOTE: If you have an existing Oracle Clusterware installation, then you upgrade your existing cluster by performing an out-of-place upgrade. You cannot perform an in-place upgrade to the oracle Clusterware. However, you can perform an in-place upgrade for the database. This will be elaborated in this blog.
Prerequisites for Oracle 11gR2 11.2.0.2 installation is to install patch 9655006 to the 11.2.0.1 GI home before upgrading to 11.2.0.2 from 11.2.0.1. See Bug 9413827 on MOS. For additional details you can refer to the Metalink article Pre-requisite for 11.2.0.1 to 11.2.0.2 ASM Rolling Upgrade Metalink Note : 1274629.1. Additionally ,Refer to “How to Manually Apply A Grid Infrastructure PSU Without Opatch Auto” Metalink Note 1210964.1.

Performing an In-place upgrade of a RAC DB from 11.2.0.1 to 11.2.0.2

In-order to upgrade a RAC Database from 11.2.0.1 to 11.2.0.2 you need to upgrade

1. The grid infrastructure first.
2. Then, the Oracle Database.

Environment Setup Details used in this post:
— 2 Node RAC Red Hat Linux 5.5 with RACK Servers (R710).
— Applies to any storage.

Latest OPatch
It is recommended to we use the Latest Version of OPatch. Unzip the zip file and copy OPatch folder to $ORACLE_HOME And $GI_HOME by renaming
the earlier OPatch directory. You can refer to how to download and Install OPatch Metalink ID 274526.1.

Pre- Requisite Patch:

First, let’s discuss about the mandatory patches required before upgrading to 11.2.0.2. Patch 9655006 is required in order for the upgrade
to succeed, if not rootupgrade.sh will fail.
Patch 9655006 is downloadable from http://www.metalink.oracle.com/ For information on Bug 9655006, refer to Metalink article ID 9655006.8
Download the patch and unzip it to a stage directory (it can be any directory), once you unzip the downloaded patch, 2 folders will be created. In this example I have unzipped the patch to /patches directory.

Now, let’s see how to patch the $GI_HOME with Patch 9655006.

Before the patch is installed, we need to perform a pre-req check. Let’s see how to do this.
1. [root@rac1 patches]# su – oracle

2. [oracle@rac1 ~]$ cd /opt/app/oracle/product/11.2.0/dbhome_1/OPatch/

3. [oracle@rac1 OPatch]$ ./opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /patches

Invoking OPatch 11.2.0.1.6
Oracle Interim Patch Installer version 11.2.0.1.6
Copyright (c) 2011, Oracle Corporation.  All rights reserved.
PREREQ session
Oracle Home       : /opt/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /opt/app/oraInventory
 from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.6
OUI version       : 11.2.0.1.0
Log file location : /opt/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2011-08-16_19-36-09PM.log
Invoking prereq "checkconflictagainstohwithdetail"
Prereq "checkConflictAgainstOHWithDetail" passed.
OPatch succeeded.

Patching the $GI_HOME
1. Stop the Instance running on Node1
srvctl stop instance -d upgrade -i upgrade1

2. As root User run the opatch auto from the $GI_HOME
./opatch auto /patches

Note: The Opatch auto takes care of the patching of both the Grid infrastructure Home and the Oracle Home with the mandatory patch .
Once the patching is done on Node1, start the instance on Node1

3. Starting the Instance on Node1
srvctl start instance -d upgrade -i upgrade1

Repeat the process of pre-req and patching on Node2

Once the mandatory patch is applied, we can proceed with the upgrade of the grid infrastructure home.

Patches required:

The software/Patch can be downloaded from My Oracle support: patch 10098816. Select p10098816_112020_Linux-x86-64_3of7.zip
for grid infrastructure download. Once downloaded, unzip them.

Upgrading Grid Infrastructure:

Unzip the patches downloaded and invoke runInstaller from the unzipped grid folder. You will be taken to the welcome screen.

Choose Skip Software updates

Choose Upgrade Gird Infrastructure or Oracle ASM

Choose the Language

The Nodes present are selected by default, Click Next.

Leave the OS groups to Defaults

Choose the New Location where the Grid Infrastructure should be installed


The Pre-Requisite checks are performed, click next


The summary screen appears click next

Click on Install

Run rootupgrade.sh on both the nodes as specified in the screenshot

Upgrading the Database via In-place upgrade:

Patches Required:
The software/Patch can be downloaded from My Oracle support: patch 10098816.
Select p10098816_112020_Linux-x86-64_1of7.zip and p10098816_112020_Linux-x86-64_2of7.zip for database patch/software download.
Once downloaded, unzip them.

In-Place upgrades (Things to do before performing and In-place upgrade)

When performing an in-place upgrade, which uses the same Oracle home location, an error messages appears stating that the installer
detects Oracle Database software in the location that you specified.

Message: The installer has detected that the software location you have specified contains Oracle Database software release 11.2.0.1.
Oracle recommends that when upgrading to 11.2.0.2, you perform an out-of-place installation of the software into a new Oracle home and
then upgrade the database using the new software binaries.

Cause: The installer has detected that the software location you have specified contains Oracle Database software release 11.2.0.1.
Action: Either perform an in-place upgrade (Steps provided in this section), or perform an out-of-place upgrade

Performing an In-Place Upgrade for an Oracle RAC Database

To perform an in-place upgrade for Oracle RAC Database instances:
1. Back up the configuration data by backing up the following directories on all cluster nodes:
o ORACLE_HOME/dbs
o ORACLE_HOME/network/admin
o ORACLE_HOME/hostname_dbname
o ORACLE_HOME/oc4j/j2ee/OC4J_DBConsole_hostname_dbname

2. Run the following command on each of the nodes to detach the 11.2.0.1.0 Oracle RAC ORACLE_HOME:
$ORACLE_HOME/oui/bin/runInstaller -detachHome ORACLE_HOME=11.2.0.1.0 software location

3. Rename the 11.2.0.1.0 Oracle RAC ORACLE_HOME directory on all the nodes to a temporary name.

4. Install release 11.2.0.2 Software Only on all nodes:

From the unzipped folder, invoke the run Installer from the database folder

The welcome screen appears, uncheck the security updates and click next

Choose Skip Software Updates

Select Install Database Software Only and click next

Choose Oracle Real Application Clusters DB Installation and Select the Nodes and click next

Choose the Language and click next

Select Enterprise Edition and click next

Specify the location of the old home, and click next

Provide the Operating System groups and click next

Once the Pre-requisite checks are performed, click next

The summary screen appears, verify the settings and click next

The installation proceeds

Once the installation is done, run Root.sh on both the nodes as instructed and click ok.

Now, on all nodes, restore into the 11.2.0.2 ORACLE_HOME software location the backed up configuration data files
(from the backups you made of /dbs and network/admin), and also restore the following two directories:
/hostname_dbname and /oc4j/j2ee/OC4J_DBConsole_hostname_dbname. Specify the actual name for hostname_dbname.

Database Upgrade using DBUA:

Run DBUA from the 11.2.0.2 ORACLE_HOME/bin directory on the local node and select the 11.2.0.1.0 Oracle RAC database instance to
upgrade it to release 11.2.0.2.0.

The welcome screen appears once you invoke the DBUA, click next to proceed

DBUA Lists the databases that can be upgraded, select the one that you would like to upgrade

You can choose the Degree of parallelism and upgrading the time zone version and click next

Skip this screen by clicking next

The summary screen appears, click on finish for the upgrade to proceed.

The upgrade proceeds

NOTE: The only time one should use this method (in-place upgrade) is when you lack space on your server.
However, you can use this method for your test and development environments to save space.

Since this method was the traditional method of doing things,
Since this method is available, I thought it’s worth checking how it works in 11gR2.
During the entire upgrade process, I never ran into any issues, apart from the slightly higher downtime in
comparison to the out-of place upgrade. However, oracle doesn’t recommend this method. You can refer to the following metalink note 1291682.1.

BUG:10082277: Database Connections failing with ORA-4031

A short write-up on a problem faced in few newly upgraded databases to 11.2.0.1 :

I faced this issue in couple of databases which were recently upgraded to 11gR2, the problem is so severe that even connecting using “/ as sysdba”  is not working and erroring out with

<span style="font-family: arial,helvetica,sans-serif; font-size: small;">ORA-04031: unable to allocate 254 bytes of shared memory ("shared pool","unknown object","PCUR^bb2f222c","kkscsAddChildNodeToContext)</span>

The SGA is sized efficiently, SGA_TARGET=8GB and shared_pool_size=900MB. Looking at the trace file generated by this error, there were 4 subpools generated and the all the 4 subpools have enough “free momory” under them.

=======================
Memory Utilization of Subpool 1
=======================

Allocation Name  Size
______________ ____________
“free memory ”    413 721 528

=======================
Memory Utilization of Subpool 2
=======================
Allocation Name  Size
_________          ____________
“free memory ”    369 109 200

=======================
Memory Utilization of Subpool 3
=======================
Allocation Name     Size
______________    ____________
“free memory ”       1 035 124 136
“PCUR ” 4 955 061 232

=======================
Memory Utilization of Subpool 4
=======================
Allocation Name       Size
______________       ____________
“free memory ”          370 023 728

We can see that all the subpools have more than 300MB of memory as free, But in the “Subpool 3” we can see some unusual amount of memory allocated to heap area “PCUR”  4 955 061 232 i.e 5GB Approx.

Now its time to find where “PCUR’ is leaking memory??

Searching in Metalink for this showed a known BUG in 11.2.0.1 and 11.2.0.2 :

<span style="font-family: arial,helvetica,sans-serif; font-size: small;">Bug 10082277  Excessive allocation in PCUR heap of "kkscsAddChildNo" (ORA-4031)</span>

As per this BUG, memory type “”kkscsAddChildNo”” under the “perm” space of “PCUR” subheap is growing over the time and is not getting deallocated leading to this situation.  The solution is to apply this patch.

Oracle will be fixing this in version 12G.

BUG: Creating 11gR2 RAC Using VMware On Toshiba Laptop

I was setting up 11gR2 RAC grid infra on my new Toshiba L745 series laptop on OEL5.4, the installation was going very smoothly but suddenly the VMWare machines crashed while performing remote operations on second node with following message:

<span style="font-family: arial,helvetica,sans-serif;">"vmxaiomgr.retrycontabort" Please disconnect and reconnect if the storage is external to comupter</span>

The next thing was to look at the error file generated by vmware, and it states the following about this error:

<span style="font-family: arial,helvetica,sans-serif;">"Insufficient quota to complete the requested service (1453)" </span>

Though I had enough Diskspace, RAM availability, CPU etc. etc. it was hard to believe that I am running out of allocated resources.
Googling this error message, I saw THIS VMware Communities blog which explains the same error which I faced. It explains about a bug in Toshiba’s WiFi service “cfWiMAXService” i.e “ConfigFree WiMAX” which is causing all this and not allowing to perform the remote copy operation of Grid Infra install.

<span style="font-family: arial,helvetica,sans-serif;">The solution is to disable ConfigFree Service on Toshiba Laptop</span>

After disabling this service, the installation worked fine. Though I am not aware what will be the effect of disabling this serviceon laptop but the laptop is working fine since last two weeks.

Thanks to the above mentioned link which helped me solving this issue.

Performance Management Guide on AIX

While trying to find the amount of physical memory used by oracle process on AIX, I got reference of a document from Metalink:

Performance Management Guide

It tell us about which process is using how much memory and how to interpret the output of commands like: vmstat, svmon, ps on AIX.

Also, to get more information on AIX parameter like: MAXPERM, MINPERM click here

Though I have not explored the complete guide yet, but found it very good to start with.

ORA-01722 with Full Table Scan

My application developers approached me with an issue which is very unique to me. They were complaining about a query which was failing with ORA-01722 “invalid number” after an upgrade to 11.1.0.7 from 10.2.0.4. The syntax of the query is like:

select max(a) from t1 where c1<>'abc' and c2=12345 and c3='Y' and c4='xyz';

This query worked fine in 10204 and was also working fine in another, upgraded, 11.1.0.7 database.

All the columns i.e C1,C2,C3 & C4 are varchar 2(20) .

I ran this query with single quotes around column C2 as:

select max(a) from t1 where c1<>'abc' and c2='12345' and c3='Y' and c4='xyz';

and it worked fine but without single quotes it failed again with same error.

I checked the explain plan of the query and it was doing a “Full Table Scan” on Table T1. Then I opened another 11.1.0.7 database where the same query is working fine and found that there is an index on columns C1,C2,C3 & C4 and the table T1 was getting accessed by Index-Range scan.

Now coming back to the failing 11.1.0.7 database, index on column C4 was missing. After creating index on column C4 the query started to work fine at failing instance.

I am not sure how the absence of an index can cause this issue? Why VARCHAR2 cannot recognize a value without quotes when doing a Full Table Scan?

Your comments are always welcome. Please let us know your views on this.

Effect Of Multiple SHMMAX Settings

Last week I saw a warning message at database startup time saying:

WARNING: EINVAL creating segment of size 0x000000000f0020xx
fix shm parameters in /etc/system or equivalent 

It is an Oracle 10204 database running on Solaris.

Searching MOS for exact meaning for this warning, it states that a new shared memory segment is getting created to accommodate SGA.

As the message indicated, I opened /etc/system file to verify the settings of SHMMAX parameter and found the SHMMAX value to be 4GB. I stopped at this point and closed the /etc/system file. Then the next thing to check is the number of oracle instances running on the server and the size of largest SGA.

There were two instances running on the server and the largest SGA was set to 1.8G and the other SGA size was 700M.

This setting shows that there is no need to create additional shared memory segment. Then I checked the /etc/system file again, but this time I used the following command :

<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">$ cat /etc/system | grep shmmax</span></span>
<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">set shmsys:shminfo_shmmax=4000000000  ==&gt;4GB</span></span>
<span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">set shmsys:shminfo_shmmax=500000000  ==&gt;500M</span></span><span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">

There were two different values set for SHMMAX parameter.

The cause of the above warning message came out to be:
As the files are read from bottom-to-top, server was taking SHMMAX value as 500M and ignoring the 4GB value.

After commenting SHMMAX value of 500M, the warning message disappeared.