oracle 10g

Optimizer Choosing Nested-Loop Joins Instead of Hash-Joins

In one of my databases, an application query suddenly started picking nested-loop joins instead of hash joins and took almost 6 hours to complete; with hash joins it completes in less than 10 seconds.
The same query in another, similar database with the same configuration and the same data runs fine and uses hash joins. There is no difference in data, statistics, OS, init parameters, etc. (though I know that no two databases are ever exactly the same).

About the query:
— It is a simple select statement which selects data from a complex view.
— The view comprises 5 different tables, four of which have more than 15K rows and one of which has fewer than 50 rows.
— Statistics are up-to-date in both databases.

I can trace the optimizer's decisions with the 10053 event on the next run of this query, but I would like to know what else can be checked, before resorting to the 10053 event, to find out why the plan changed suddenly.
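One check that needs no tracing is to confirm mechanically that the init parameters really are identical in the two databases. The sketch below is only an illustration under my own assumptions: the helper name diff_params is hypothetical, and it expects two plain-text files of name=value lines that have already been spooled from each database (the file names and format are not from the original problem).

```shell
# Hypothetical helper: compare two "name=value" parameter dumps, one per database.
# Assumption: both files were spooled beforehand, one parameter per line.
# Prints the differing lines; returns 0 if identical, 1 if they differ.
diff_params() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2
    # Sort first so that a different line order is not reported as a difference.
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    rc=0
    diff "$a_sorted" "$b_sorted" || rc=$?
    rm -f "$a_sorted" "$b_sorted"
    return "$rc"
}
```

Any line this prints points to a parameter worth looking at before moving on to the 10053 trace.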

Your valuable inputs on this are welcome!

 

SRVCTL Fails to Start RAC Resources: CRS-0215

After upgrading a RAC database to 10.2.0.4 and applying CRS bundle patch 1 to the 10.2.0.4 CRS home,
the srvctl command fails to start resources on the RAC nodes. When RAC resources are started with SRVCTL,
the following errors appear in the CRSD.log file:

$ srvctl start instance -d rac -i rac2

2009-04-09 13:45:22.091: [  CRSRES][2611477408][ALERT]0`ora...inst` on member `` has experienced an unrecoverable failure.
2009-04-09 13:45:22.091: [  CRSRES][2611477408]0Human intervention required to resume its availability.
2009-04-09 13:46:25.162: [  CRSRES][2611477408]0StopResource: setting CLI values
2009-04-09 13:46:25.174: [  CRSRES][2611477408]0Attempting to stop `ora...inst` on member ``
2009-04-09 13:46:25.206: [  CRSAPP][2611477408]0StopResource error for ora...inst error code = 1

To debug SRVCTL, SRVM_TRACE is set to TRUE and an strace is taken at the OS level:

$ script /tmp/srvm.log
$ export SRVM_TRACE=TRUE
$ srvctl start instance -d  -i
$ exit

It will generate a trace file at /tmp/srvm.log.

$ strace -aef -o /tmp/strace.log srvctl start instance -d -i

It will generate a trace file at /tmp/strace.log
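The SRVM_TRACE capture can also be wrapped in a small helper so that the variable is set for one command only. This is a sketch, not what was actually run: the function name run_with_srvm_trace is hypothetical, and it swaps the interactive script session for plain output redirection, assuming the trace is written to the command's standard output and error.

```shell
# Hypothetical helper: run one command with SRVM_TRACE=TRUE and keep its output.
# Usage: run_with_srvm_trace LOGFILE COMMAND [ARGS...]
# Returns the exit status of the traced command.
run_with_srvm_trace() {
    log=$1
    shift
    # env sets the variable only for this command; 2>&1 keeps errors in the same log.
    env SRVM_TRACE=TRUE "$@" > "$log" 2>&1
}
```

It would be called with /tmp/srvm.log as the first argument, followed by the same srvctl command as above.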

— srvm.log shows the following error:

[Thread-2] [11:57:59:774] [StreamReader.run:65]  OUTPUT>Attempting to start `ora.rac.rac2.inst` on member `node11`
[Thread-2] [11:58:0:862] [StreamReader.run:65]  OUTPUT>`ora.rac.rac2.inst` on member `node11` has experienced an unrecoverable failure.
[Thread-2] [11:58:0:862] [StreamReader.run:65]  OUTPUT>Human intervention required to resume its availability.
[Thread-2] [11:58:0:863] [StreamReader.run:65]  OUTPUT>nloz11:ora.rac.rac2.inst:/oac/app/oracle/product/10.2.0/db_1/bin/racgwrap: line 62: fg: no job control
[Thread-3] [11:58:0:865] [StreamReader.run:65]  ERROR>CRS-0215: Resource ora.rac.rac2.inst cannot be started.
[Thread-3] [11:58:0:865] [StreamReader.run:65]  ERROR>
[Worker 0] [11:58:0:865] [RuntimeExec.runCommand:133]  runCommand: process returns 115
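The trace is noisy, so it helps to pull out only the lines that carry a CRS error code. A minimal sketch, assuming the log keeps the ERROR> prefix format shown above (the function name srvm_errors is my own):

```shell
# Hypothetical helper: print the CRS-nnnn error messages from an SRVM trace log.
# Assumption: error lines look like "... ERROR>CRS-0215: ..." as in the log above.
srvm_errors() {
    # Keep lines with a CRS code after ERROR>, then drop everything up to the prefix.
    grep -E 'ERROR>CRS-[0-9]+' "$1" | sed 's/^.*ERROR>//'
}
```

The OUTPUT> lines around each hit are still worth reading, since that is where the racgwrap message appears in this case.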

— strace.log file shows the following:

rt_sigprocmask(SIG_SETMASK, [], NULL, 8 ) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8 ) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8 ) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {0x8075d8b, [], SA_RESTORER, 0xb7ee5908}, {SIG_IGN}, 8 ) = 0
waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 2}], 0) = 18699
rt_sigprocmask(SIG_SETMASK, [], NULL, 8 ) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, 0xbfffe9bc, WNOHANG) = -1 ECHILD (No child processes)
sigreturn() = ? (mask now [])
rt_sigaction(SIGINT, {SIG_IGN}, {0x8075d8b, [], SA_RESTORER, 0xb7ee5908}, 8 ) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8 ) = 0
read(255, "exit $?\n", 6261) = 8
rt_sigprocmask(SIG_SETMASK, [], NULL, 8 ) = 0
exit_group(2) = ?

The SRVM trace shows a problem in the racgwrap script at line 62, which contains the following:

$ORACLE_HOME/bin/racgmain "$@"

I could not find much wrong with this line itself, but at the beginning of the script, i.e. line 1, the entry for ORACLE_HOME was missing its value:

ORACLE_HOME=<%ORACLE_HOME%>
export ORACLE_HOME
— Added the correct ORACLE_HOME location here.

Also, on checking the srvctl file in the DB home, the “OHOME” and “CHOME” entries were missing:
— Added the correct entries for OHOME and CHOME (copied from the node where srvctl was working fine).

After making these two changes, SRVCTL worked fine.
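A quick way to spot this kind of problem on other nodes is to search the wrapper scripts for template tokens that were never substituted. This is a sketch under the assumption that the unset entries look like the <%ORACLE_HOME%> token shown above; the function name find_unset_placeholders is hypothetical.

```shell
# Hypothetical helper: list lines that still contain a <%NAME%> template token.
# Prints "line-number:line" for each hit; returns non-zero when none are found.
find_unset_placeholders() {
    grep -n '<%[A-Za-z_]*%>' "$1"
}
```

It could be run against racgwrap and srvctl in the bin directory of each home; no output means no leftover tokens of this form.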

Cheers!!!!
Saurabh Sood

Connections to Database Hang, Including “/ as sysdba”

Recently I faced an issue where all connections to the database hung, and it was not even possible to log in to the database using “/ as sysdba”.
To get access to sqlplus I used the following syntax:

$ sqlplus -prelim / as sysdba

With the “prelim” option we can run some commands that help in collecting useful information about the problem.

This works only in Oracle 10g and higher versions.

After connecting successfully, run the following commands to generate hanganalyze and systemstate traces:

SQL> oradebug setmypid

SQL> oradebug unlimit

SQL> oradebug dump systemstate 266

SQL> oradebug tracefile_name

— This will give you the name of the tracefile generated.

SQL> oradebug dump hanganalyze 2

SQL> oradebug tracefile_name
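During a hang there is little time to type, so the same command sequence can be kept ready in a shell function. This is only a sketch: the function name hang_dump_commands is hypothetical, and it simply prints the commands listed above without connecting to anything.

```shell
# Hypothetical helper: print the oradebug sequence used above, one command per line.
# It only emits text; nothing is executed against the database.
hang_dump_commands() {
    cat <<'EOF'
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
oradebug tracefile_name
oradebug dump hanganalyze 2
oradebug tracefile_name
EOF
}
```

The output could be pasted into the prelim session, or possibly piped into sqlplus -prelim / as sysdba; I have not verified that a prelim connection reads commands from standard input, so treat the piping as an assumption.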

To analyze these trace files, one should be familiar with Metalink Note 215858.1.

After analyzing these files, I found that the following event was active and causing the hang:

"resmgr:cpu quantum"
Cmd: PL/SQL Execute

It means that the sessions are waiting for their turn on the CPU.

This event occurs when Resource Manager is active and controls the allocation of CPU to processes.

We can also see the command that is causing all this, i.e. some PL/SQL code was executing and spinning on the CPU.

After finding this out, I checked with the “top” command, got the PID of the process consuming all the CPU, and killed that process with “kill -9”.

After killing that process the users were able to connect.
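When an interactive tool is awkward to use on a struggling server, the busiest processes can also be listed non-interactively. The sketch below swaps top for ps plus sort; the function name top_cpu is hypothetical, and it assumes input lines of the form "PID %CPU COMMAND".

```shell
# Hypothetical helper: read "PID %CPU COMMAND" lines on stdin and print the
# N busiest (default 5), highest %CPU first.
# Example input source (headers suppressed): ps -eo pid=,pcpu=,comm=
top_cpu() {
    # Numeric, descending sort on the second field; LC_ALL=C keeps "." as the decimal point.
    LC_ALL=C sort -k2,2nr | head -n "${1:-5}"
}
```

Note that ps reports CPU use averaged over the life of the process, so the ranking may differ from what top shows at a given moment. The PID in the first column is what would then be passed to kill, after checking that it really is the runaway session and not a background process.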

So the cause of the hang was found, i.e. PL/SQL, but it is still unknown why the PL/SQL caused problems. 🙂

Cheers!!!

Saurabh Sood

Creating Oracle Extended RAC on Oracle VM

Yesterday I found a very useful article on OTN, “Creating Oracle Extended RAC”, which builds the setup in a completely virtual environment using Oracle VM. As virtualization is becoming more popular by the day and is very cost effective, one should know how to use it to simulate real environments. See the OTN article for details on Oracle Extended RAC on Oracle VM.