V$SQL_SHARED_CURSOR in 11.2.0.2

Oracle 11.2.0.2 has introduced new column “REASON” to V$SQL_SHARED_CURSORS. I came across this extremely useful column while debugging multiple child cursor issue (11.2.0.2 seems to have too many bugs which can cause multiple child cursors).

As per documentation , REASON column has been introduced in 11.2.0.2

REASON - CLOB	Child number, id, and reason the cursor is not shared. The content of this column is structured using XML.

Reason column is helpful as we can easily interpret the reason why child cursors are not being shared. e.g

<ChildNode>
<ChildNumber>151</ChildNumber>
<ID>9</ID>
<reason>PQ Slave mismatch(5)</reason>
<size>2x4</size>
<ctxpq_StmtId_pqctx>0</ctxpq_StmtId_pqctx>
<fxu_kxfxutqidb>2422463</fxu_kxfxutqidb>
</ChildNode>
<ChildNode><ChildNumber>151</ChildNumber>
<ID>3</ID>
<reason>Optimizer mismatch(2)</reason>
<size>4x4</size>
<parallel_query_default_dop>16</parallel_query_default_dop>
<kxfr_Default_DOP>80</kxfr_Default_DOP>
<isParallel>1</isParallel>
<Need_Default_Dop>0</Need_Default_Dop>
</ChildNode>
<ChildNode>
<ChildNumber>151</ChildNumber>
<ID>9</ID><reason>PQ Slave mismatch(5)</reason><size>2x4</size><ctxpq_StmtId_pqctx
>0</ctxpq_StmtId_pqctx><fxu_kxfxutqidb>1433944</fxu_kxfxutqidb></ChildNode>

We can verify the same thing by checking PQ_SLAVE_MISMATCH column

select sql_id,PQ_SLAVE_MISMATCH,OPTIMIZER_MODE_MISMATCH from  V$SQL_SHARED_CURSOR where sql_id='b9uz0akdcx58q' and child_number=151;

SQL_ID	      P O
------------- - -
b9uz0akdcx58q Y N

There is one more column PURGED_CURSOR which is new (I guess it came from 11.2.0.1) which tells if the cursor has been purged using dbms_shared_pool.purge package.

11gR1 CRS start failing with ORA-29702

This post was actually a comment from Sasi which came on previous article 10.2 CRS startup issue . I am converting it to a post so that we can get some feedback on this issue from other users. I suspect this to be caused by RHEL5 bug (fixed in RHEL5u6) related to NIC going down when multiple interface cards are being used. 

We had a similar error but the problem was different and thought of sharing it here.

We recently installed 11gR1 two node RAC and all was fine till last week and suddenly we saw the same error “ORA-29702: error occurred in Cluster Group Service operation”. Crs was not starting . some of the crs process were running and it was refusing to stop.

root@node1> /u01/app/crs/bin/crsctl stop crs
Stopping resources.
This could take several minutes.
Error while stopping resources. Possible cause: CRSD is down.
Stopping Cluster Synchronization Services.
Unable to communicate with the Cluster Synchronization Services daemon.

ASM alert log and database alert log had the below to say

ASM Alert Log:

Errors in file /u02/app/asm/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_2185.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON (ospid: 2185): terminating the instance due to error 29702
Mon Nov 22 20:02:16 2011
ORA-1092 : opitsk aborting process

Oracle database Alert Log:

ERROR: LMON (ospid: 3721) detects hung instances during IMR reconfiguration
Tue Nov 22 22:10:37 2011
Error: KGXGN polling error (16)
Errors in file /u03/app/oracle/diag/rdbms/ccbdrpd/ccbdrpd1/trace/ccbdrpd1_lmon_3721.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON (ospid: 3721): terminating the instance due to error 29702

Not much info in the trace files.

Looked at metalink note : Diagnosing ‘ORA-29702: error occurred in Cluster Group Service operation’ [ID 848622.1]
But the problems mentioned in it were not applicable to our site.

Looked at CRS alert log, CRSD logs and CSSD logs, there were heaps of information but not quite useful to nail down the issue. Could not see any error messages

Also, looked at

RAC ASM instances crash with ORA-29702 when multiple ASM instances start(ed) [ID 733262.1]

There it was mentioned, when using multiple NIC for cluster interconnect and if they are not bonded properly it could cause issues and that could be seen in the alert logs.

In our case NIC bonding was done properly. We have configured and bonded as below
• eth0 and eth1 bonded as bond0 – for public and
• eth2 and eth3 bonded as bond1 – for cluster interconnect

and alert log showed they were configured fine.

Interface type 1 bond1 192.xxx.x.x configured from OCR for use as a cluster interconnect
Interface type 1 bond0 xx.x.x.x configured from OCR for use as a public interface

If NIC bonding not done properly then you would see multiple entries for cluser interconnect in the alert log.

Well,though this was not the issue in our case but it gave me a lead to identify the root cause of the problem. As it was mentioned about bonding I wanted to check both channel bonding interface (ifcdfg-bond0 & ifcfg-bond1) and Ethernet interface configurations (ifcfg-eth0, ifcfg-eth1, ifcfg-eth2 & ifcfg-eth3)

Well, all configuration files were good except for ifcfg-bond1 file and the entries were as below,

root@node1>cat ifcfg-bond1

DEVICE=bond1
IPADDR=xxx.xxx.xx.x
NETMASK=255.xxx.x.x
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
TYPE=ethernet

On the 1st look they seem to be fine but when compared to ifcfg-bond0 the problem was obvious. Ifcfg-bond0 entries were as below,

root@node1> cat ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
NETMASK=255.xxx.x.x
IPADDR=xx.x.x.x
GATEWAY=xx.x.x.x
USERCTL=no
TYPE=BOND

If you look at line entry TYPE it’s mentioned as “TYPE=ethernet” in Ifcfg-bond1 and “TYPE=BOND” In Ifcfg-bond0.

Bingo…changed the configuration file and rebooted the server and all components came up fine. CRS, ASM and DB started and working fine.

But trying to find out why it worked fine during the installation and then stopped working suddenly.

AIOUG 2011 – I am Attending :)

All India Oracle User Group(AIOUG) is organizing the annual Oracle Conference in Bangalore on Friday, December 9th and Saturday December 10th, 2011.Arup Nanda is key speaker this year.

I will be attending this conference for first time and I am looking forward to meeting fellow bloggers/DBA’s. See you all there

You can find the registration link at http://www.aioug.org/sangam11.php

Changing CRS/Database timezone in 11.2.0.2 post install

I had installed a 11.2.0.2 RAC setup few days back with incorrect timezone. It had to be PDT but I installed with UTC.
Starting/stopping clusteware with correct timezone didn’t solve the issue.

In 11.2.0.2 Oracle stores timezone information in file $GRID_HOME/crs/install/s_config_(hostname).txt. In my case file looked like this

cd /oragrid/product/11.2/crs/install
cat s_crsconfig_prod1.txt
<strong>TZ=UTC</strong>
NLS_LANG=AMERICAN_AMERICA.AL32UTF8
TNS_ADMIN=
ORACLE_BASE=

To resolve the issue we need to change TZ to US/Pacific on all nodes and restart clusterware. So entry would be like

<strong>TZ=US/Pacific</strong>

On Restarting clusteware , database and clusteware starts with correct timezone.

In case you wish to have different database timezone only for Oracle database, then it is possible using srvctl command. E.g

srvctl setenv database -d orcl -t TZ=US/Pacific

You can confirm this with getenv command

[oracle@prod1]~% srvctl getenv database -d orcl
orcl:
TZ=US/Pacific

This would require database bounce. Also note that in case database is started manually it would not start with correct timezone. To unset the parameter use following command

[oracle@prod1]~% srvctl unsetenv database -d orcl -t TZ
[oracle@prod1]~% srvctl getenv database -d orcl
orcl:

Hope this helps

11.2.0.3 Patchset is available

11.2.0.3 patchset is now available for download.  10404530 is available on Linux x86,x86-64 bit only. Tanel Poder first tweeted this

[blackbirdpie id=”117213822904840193″]

Soon after this Surachart and Martin have started upgrading to 11.2.0.3. I expect a blog post soon from one of them 🙂

[blackbirdpie id=”117242670719574016″]

[blackbirdpie id=”117245101268410369″]

Oracle Database Appliance Introduced

Oracle has introduced it new product: ODA (Oracle Database Appliance)

I was not able to join Oracle President Mark Hurd’s webcast where he supposed to announce a new Oracle product[It was at 1AM my time and I was sleeping], now viewing this webcast HERE

Kerry Osborne has shared information about this product, click link to see it, here you will see some good information about ODA.

Another new thing to talk about 🙂