Amit Bansal

11.1.0.6 ASM installation on Solaris fails -I

Looks like that there is no easy way for me to install/configure Oracle components. And every install/configuration leads me to some or the other problems. Anyways I was trying to install Oracle Database 11g on a Solaris 10 but the CSS does not seems to come up. While issuing localconfig add as root user it gives following message and fails

-bash-3.00# ./localconfig add
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Configuration for local CSS has been initialized

Cleaning up Network socket directories
Setting up Network socket directories
Adding to inittab
Startup will be queued to init within 30 seconds.
Checking the status of new Oracle init process...
Expecting the CRS daemons to be up within 600 seconds.

Giving up: Oracle CSS stack appears NOT to be running.
Oracle CSS service would not start as installed
Automatic Storage Management(ASM) cannot be used until Oracle CSS service is started

On checking further for css log messages in $ORACLE_HOME/log/hostname/client , I found following messages

Oracle Database 11g CRS Release 11.1.0.6.0 - Production Copyright 1996, 2007 Oracle. All rights reserved.
2008-06-18 21:38:29.721: [ CSSCLNT][1]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(<strong>KEY=OCSSD_LL_test1zone2_</strong>))
, rc 9

On checking the /var/adm/messages file , found following errors

Jun 18 01:16:23 test1zone2 root: [ID 702911 user.error] Oracle Cluster Synchronization Service starting by user request.
Jun 18 01:16:25 test1zone2 root: [ID 702911 user.error] Cluster Ready Services completed waiting on dependencies.
Jun 18 01:16:35 test1zone2 last message repeated 9 times

/var/tmp/.oracle does not show any files created after running localconfig add. These are called Socket files and CSS uses these for communication.I suspect this to be a issue.
I checked our Linux machine (with RAC) and found that we have some files and one of them is named as

srwxrwxrwx  1 oracle oinstall 0 Jun 11 22:10 <strong>sOCSSD_LL_prod01_</strong>

This is quite similar to the error message in css.log files i.e KEY=OCSSD_LL_test1zone2_

As this is a test machine so I cant raise a ticket with Oracle. Currently posted a thread on Oracle Forum. Let’s see if someone is able to figure out something. I will keep you posted. If anyone is interested in reading how these sockets work then they can visit this article from Frits Hoogland.

We are here to Stay !!

I had actually planned to write this post last week. It’s our Third web address till date with first being http://askoracledba.blogspot.com and second one being http://askoracledba.wordpress.com . Now we are on http://askdba.org/weblog/ and hope to continue this.

We decided to buy our own domain so that we are not dependent on Free service providers like Blogger, WordPress.com. Both of them have their advantages/disadvantages though WordPress Blogs score more over blogspot. This can also be seen by the number of users moving towards WordPress. We chose for independent domain as it gives us all advantages of WordPress with additional functionalities like option to choose Design templates/CSS editing and also display Ads. Anyways, from now on we will be posting on this site. Hope we can dish out interesting and helpful articles!! You can also visit our Homepage (currently it is not 100% complete). You will soon see articles/presentations related to Oracle and SQL Server there. You can also ask any questions or clarify any doubts on our Forum

[poll id=”2″]

DBConsole Issue on RAC -Part II

Continuing with DBConsole issue, we were able to get the cluvfy return success for the nodes.

Basically Oracle suggested us that when we use ssh we should not get any banner. E.g

[oracle@PROD01 ~]$ ssh PROD02 date
Fri Jun 13 02:00:41 IST 2008

But in our case it was displaying a banner which displayed a warning message when someone logged in Server.

As we are using Linux, we renamed file /etc/issue.net to something else and tried running cluvfy again. It was successful this time

[oracle@PROD01 ~]$ cluvfy comp nodecon -n all

Verifying node connectivity

Checking node connectivity...

Node connectivity check passed for subnet "10.X.X.X" with node(s) PROD02,PROD01.
Node connectivity check passed for subnet "192.X.X.X" with node(s) PROD02,PROD01

Interfaces found on subnet "192.X.X.X" that are likely candidates for VIP:
PROD02 eth3:192.X.X.X
PROD01 eth3:192.X.X.X

Interfaces found on subnet "10.X.X.X" that are likely candidates for a private interconnect:
PROD02 eth2:10.X.X.X eth2:10.X.X.X
PROD01 eth2:10.X.X.X eth2:10.X.X.X

Node connectivity check passed.


Verification of node connectivity was successful.

But the DBConsole issue still remains. It is still unable to find the hostname. Now waiting for Oracle 🙁

DBConsole Issue on RAC -Part I

Currently I am working on issue where DBConsole is not starting on our 2 Node RAC system. When I try to start, I get following errors

[oracle@PROD01 ~]$ emctl status dbconsole
TZ set to US/Pacific
Exception in getting local host
java.net.UnknownHostException: PROD01: PROD01
        at java.net.InetAddress.getLocalHost(InetAddress.java:1191)
        at oracle.sysman.emSDK.conf.TargetInstaller.getLocalHost
(TargetInstaller.java:5561)
        at oracle.sysman.emSDK.conf.TargetInstaller.main
(TargetInstaller.java:4126)
Exception in getting local host

I tried recreating the DBConsole but that also failed and gave following error

[oracle@PROD01 ~]$ emca -config dbcontrol db  -cluster

STARTED EMCA at Jun 12, 2008 3:29:40 AM
EM Configuration Assistant, Version 10.2.0.1.0 Production
Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Jun 12, 2008 3:29:40 AM oracle.sysman.emcp.util.ClusterUtil getHostName
SEVERE: Error getting hostname for the cluster node PROD01. This node may not be configured correctly
Enter the following information:
Database unique name: testdb1
Jun 12, 2008 3:29:42 AM oracle.sysman.emcp.ParamsManager getInaccessibleNodeList
WARNING: The following cluster nodes are unavailable: [PROD01, PROD02].
Jun 12, 2008 3:29:42 AM oracle.sysman.emcp.ParamsManager getInaccessibleSidList
WARNING: The requested operation will not be performed for the following instances: [testdb11, testdb12].
No cluster nodes found when configuring the RAC database for EM

Above error informs that the nodes are not available, but if we check the status, they are indeed running.

[oracle@PROD01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.testdb1.db    application    ONLINE    ONLINE    PROD01
ora....omp1.cs application    ONLINE    ONLINE    PROD01
ora....11.inst application    ONLINE    ONLINE    PROD01
ora....12.inst application    ONLINE    ONLINE    PROD02
ora....SM1.asm application    ONLINE    ONLINE    PROD01
ora....01.lsnr application    ONLINE    ONLINE    PROD01
ora....d01.gsd application    ONLINE    ONLINE    PROD01
ora....d01.ons application    ONLINE    ONLINE    PROD01
ora....d01.vip application    ONLINE    ONLINE    PROD01
ora....dM2.asm application    ONLINE    ONLINE    PROD02
ora....02.lsnr application    ONLINE    ONLINE    PROD02
ora....d02.gsd application    ONLINE    ONLINE    PROD02
ora....d02.ons application    ONLINE    ONLINE    PROD02
ora....d02.vip application    ONLINE    ONLINE    PROD02

At this moment I searched Metalink for any known issues. I came across

Note.388440.1 – Problem Emca Fails To Configure DB Control For RAC Database Error Getting Hostname For The Cluster Node

According to this we need to confirm that SSH is set and output of “cluvfy comp nodecon -n all” command should return Sucess. In our case SSH was already set. So I tried using the command but it was Unsucessful

[oracle@PROD01 ~]$ cluvfy comp nodecon -n all

Verifying node connectivity

Verification of node connectivity was unsuccessful on all the nodes.

At this moment we created a SR with Oracle Support. We were asked to then check

Note 549667.1 – Cluvfy returns “Unsuccessful” for most commands, with no other details

We verified that this note was not applicable to us as file permissions for files (Discussed in Note 549667.1) were correctly set. Now we have one more SR which has been created with RAC team to resolve the “Cluvfy” issue.

It’s been long wait and despite of SR being Escalated, still haven’t got a response from Analyst. Will keep you all posted about the issue and will share the solution. Meanwhile if someone else has also faced this situation and resolved it, then do let me know.

Easy and Free X-Windowing Setup with Cygwin

Today I had to carry out Oracle client Installation on one of the server. Normally I use vncviewer to connect to one of my Linux machine (having vncserver setup) and then get the display on to that machine.

But my colleague suggested me about one more approach, which I thought of sharing with all of you. This is using a X –Windowing software which is Free 🙂 Surprised!!

This can be done using Cygwin which is a freeware so you can trash your Reflection X or Hummingbird Installations. You can check various X-Windows software available here

Lets come back to Cygwin. You can download the software from http://x.cygwin.com/

To perform the Cygwin setup, you need to follow below steps

1) Start the XWindows server on your PC. This could be done by running startxwin-multiwindow file present on Desktop (Created during installation) or running /usr/X11R6/bin/startxwin.sh from cygwin

$ sh /usr/X11R6/bin/startxwin.sh

You can also use startx. For configuration check this link

http://x.cygwin.com/docs/ug/using.html

3) Connect to the Unix/Linux machine and set the display to your machine. Use ipconfig command to get the IP address for your machine. E.g

C:\Documents and Settings\amit &gt;ipconfig
Windows IP Configuration
Ethernet adapter Local Area Connection:
        Connection-specific DNS Suffix  . : xyz.com
        IP Address. . . . . . . . . . . . : <strong>192.168.4.47</strong>
        Subnet Mask . . . . . . . . . . . : X.X.X.X
        Default Gateway . . . . . . . . . : X.X.X.X

Now set the DISPLAY variable as

$export DISPLAY=192.168.4.47:0.0

4) Test it by typing any application like xcalc,xclock,etc.

If X11 Forwarding is enabled on the remote machine, you can also directly login to server using ssh

$ssh oracle@ip –X

Now you can carry out oracle Installation using runInstaller 🙂

Transparent Application Failover – TAF

Transparent Application Failover (TAF) is a client-side feature that allows for clients to reconnect to surviving databases in the event of a failure of a database instance.

Note that this is not used for purpose of load balancing or for Connect Time Failover.

 

TAF operates in two modes

 – Session Failover which will recreate lost connections and sessions

– Select Failover which will replay queries that were in progress. It will discard the old rows which have been fetched earlier and will fetch the rest of rows.

 TAF can be implemented at

 – Client Side

Server Side

 Client Side

 This can be done by creating a entry in tnsnames.ora as follows

 

TESTDB10_basic=
(DESCRIPTION=
<strong>(LOAD_BALANCE=on)
 (FAILOVER=on) </strong>
(ADDRESS= (PROTOCOL=tcp) HOST=prod01-vip)(PORT=1521))
(ADDRESS= (PROTOCOL=tcp)(HOST= prod02-vip)PORT=1521))
(CONNECT_DATA=
 (SERVICE_NAME=TESTDB10)
<strong>(FAILOVER_MODE= (TYPE=select)
(METHOD=basic</strong>))))

Here FAILOVER_MODE parameter is used to implement TAF.

Above configuration is for connections which will allow new connections to be created at failover time. This is determined by parameter METHOD which is set to BASIC here.

You can also have a session established before. This helps as new connection can take time and will thus result in faster failover. To do this use

TESTDB_prod01=
(DESCRIPTION=
(LOAD_BALANCE=on)
(FAILOVER=on)
(ADDRESS= (PROTOCOL=tcp) HOST=prod01-vip)(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=TESTDB)
(INSTANCE_NAME=TESTDB1)
(FAILOVER_MODE= <strong>(BACKUP= TESTDB_prod02)
(TYPE=select) (METHOD=PRECONNECT)</strong>)))

 TESTDB_prod02=
(DESCRIPTION=
(LOAD_BALANCE=on)
 (FAILOVER=on)
 (ADDRESS= (PROTOCOL=tcp) HOST=prod01-vip)(PORT=1521))
 (CONNECT_DATA=
 (SERVICE_NAME=TESTDB10)
(INSTANCE_NAME=TESTDB2)
 (FAILOVER_MODE=
 <strong>(BACKUP= TESTDB_prod01)
(TYPE=select)
(METHOD=PRECONNECT)</strong>)))

Following query can be used to monitor information about failed over sessions

SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)
FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

Server Side Configuration

 

This can be done using server-side service attributes. Please note that in case both (Client side and server side) configuration are used then service-side settings will be used.

 

You can refer to following note for details

 

Note 404644.1 – Configuration of Transparent Application Failover(TAF) works with server side service

Features of TAF

 – Sessions executing any insert/update/delete statement fail then statement will be rollback.

A command which has been successfully completed upon failure and has changed the database state, then TAF does not resend the command.

When using select failover, we will retrieve only the rows which have not been fetched by earlier sessions.