DBConsole Issue on RAC - Part I

I am currently working on an issue where DBConsole does not start on our two-node RAC system. When I try to start it, I get the following errors:

$ emctl status dbconsole
TZ set to US/Pacific
Exception in getting local host
java.net.UnknownHostException: PROD01: PROD01
        at java.net.InetAddress.getLocalHost(InetAddress.java:1191)
        at oracle.sysman.emSDK.conf.TargetInstaller.getLocalHost(TargetInstaller.java:5561)
        at oracle.sysman.emSDK.conf.TargetInstaller.main(TargetInstaller.java:4126)
Exception in getting local host
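
The java.net.UnknownHostException above is thrown by InetAddress.getLocalHost() when the machine's own hostname cannot be resolved to an address. As a rough first check, something like the following sketch can confirm whether a name resolves on a node. The function name check_resolves and the RESOLVER variable are made up for this example, and it assumes getent is available, as it is on most Linux systems.

```shell
#!/bin/sh
# Hypothetical helper, not part of any Oracle tooling.
# RESOLVER is the lookup command; getent consults /etc/hosts and DNS
# in the order configured in /etc/nsswitch.conf.
RESOLVER="${RESOLVER:-getent hosts}"

check_resolves() {
    # $1: hostname to look up. Prints one status line; returns 1 on failure.
    name="$1"
    if $RESOLVER "$name" >/dev/null 2>&1; then
        echo "OK: $name resolves"
    else
        echo "FAIL: $name does not resolve"
        return 1
    fi
}
```

Running check_resolves "$(hostname)" on each node (PROD01 and PROD02 here) would show whether the name in the exception resolves locally. A FAIL result points at /etc/hosts or DNS rather than at DBConsole itself.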

I tried recreating DBConsole, but that also failed with the following error:

$ emca -config dbcontrol db -cluster

STARTED EMCA at Jun 12, 2008 3:29:40 AM
EM Configuration Assistant, Version 10.2.0.1.0 Production
Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Jun 12, 2008 3:29:40 AM oracle.sysman.emcp.util.ClusterUtil getHostName
SEVERE: Error getting hostname for the cluster node PROD01. This node may not be configured correctly
Enter the following information:
Database unique name: testdb1
Jun 12, 2008 3:29:42 AM oracle.sysman.emcp.ParamsManager getInaccessibleNodeList
WARNING: The following cluster nodes are unavailable: [PROD01, PROD02].
Jun 12, 2008 3:29:42 AM oracle.sysman.emcp.ParamsManager getInaccessibleSidList
WARNING: The requested operation will not be performed for the following instances: [testdb11, testdb12].
No cluster nodes found when configuring the RAC database for EM

The error above reports that the nodes are unavailable, but if we check their status, they are in fact running:

$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.testdb1.db application    ONLINE    ONLINE    PROD01
ora....omp1.cs application    ONLINE    ONLINE    PROD01
ora....11.inst application    ONLINE    ONLINE    PROD01
ora....12.inst application    ONLINE    ONLINE    PROD02
ora....SM1.asm application    ONLINE    ONLINE    PROD01
ora....01.lsnr application    ONLINE    ONLINE    PROD01
ora....d01.gsd application    ONLINE    ONLINE    PROD01
ora....d01.ons application    ONLINE    ONLINE    PROD01
ora....d01.vip application    ONLINE    ONLINE    PROD01
ora....dM2.asm application    ONLINE    ONLINE    PROD02
ora....02.lsnr application    ONLINE    ONLINE    PROD02
ora....d02.gsd application    ONLINE    ONLINE    PROD02
ora....d02.ons application    ONLINE    ONLINE    PROD02
ora....d02.vip application    ONLINE    ONLINE    PROD02

At this point I searched Metalink for known issues and came across:

Note 388440.1 - Problem Emca Fails To Configure DB Control For RAC Database Error Getting Hostname For The Cluster Node

According to this note, we need to confirm that SSH is set up and that the "cluvfy comp nodecon -n all" command reports success. In our case SSH was already set up, so I ran the command, but it was unsuccessful:

$ cluvfy comp nodecon -n all

Verifying node connectivity

Verification of node connectivity was unsuccessful on all the nodes.
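
Since cluvfy gives no detail here, the SSH requirement from Note 388440.1 can also be checked by hand. The sketch below is only an illustration: the function name check_ssh_equivalence and the SSH_CMD variable are invented for it. It relies on the OpenSSH BatchMode option, which disables password prompts and host key confirmation requests, so a node that would prompt for anything shows up as a failure instead of hanging.

```shell
#!/bin/sh
# Hypothetical helper, not part of cluvfy or any Oracle tooling.
# BatchMode=yes makes ssh fail rather than prompt for a password,
# passphrase, or host key confirmation.
SSH_CMD="${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=5}"

check_ssh_equivalence() {
    # Arguments: every node name in the cluster, including the local node.
    # Prints OK/FAIL per node; returns 1 if any node failed.
    failed=0
    for node in "$@"; do
        if $SSH_CMD "$node" true </dev/null >/dev/null 2>&1; then
            echo "OK: $node"
        else
            echo "FAIL: $node"
            failed=1
        fi
    done
    return "$failed"
}
```

It would be run as the oracle user on each node in turn, for example check_ssh_equivalence PROD01 PROD02, so that every node is tested against every other node and against itself.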

At this point we opened an SR with Oracle Support. We were then asked to check:

Note 549667.1 - Cluvfy returns "Unsuccessful" for most commands, with no other details

We verified that this note did not apply to us, because the permissions on the files discussed in Note 549667.1 were set correctly. A second SR has now been opened with the RAC team to resolve the "cluvfy" issue.

It has been a long wait and, despite the SR being escalated, we still haven't received a response from the analyst. I will keep you all posted about the issue and will share the solution. Meanwhile, if anyone else has faced this situation and resolved it, do let me know.

8 thoughts on “DBConsole Issue on RAC - Part I”

  1. Had the same problem.
    The issue was caused by an incorrect SSH setup.
    Make sure you can ssh (as oracle) between all your nodes, with no passphrases/passwords.
    The trick is that you must also be able to ssh to the node you're on,
    i.e. from mcnode1, in a two-node cluster, make sure you have password-free ssh to:
    ssh mcnode1
    ssh mcnode2
    Similarly, do the same on mcnode2
    ssh mcnode2
    ssh mcnode1

    Also, make sure that (as root) hostname and domainname work (probably not required, but no harm).

    I did this, then stopped EM:
    emctl stop dbconsole
    emctl stop agent
    – stop agents on ALL nodes

    Then re-install EM completely from mcnode1:
    emca -config dbcontrol db -repos recreate -cluster

    I have done this twice on two clusters and it worked fine on both. However, on one of them it used new port numbers: 1830 instead of 3938 for the agent, and 5500 instead of 1158 for dbconsole. I searched for targets.xml in $ORACLE_HOME and changed every occurrence of 3938 to 1830. (Note: before I changed the port numbers EM was working fine, but it was showing extra targets, i.e. an 1830 target as well as a 3938 target.)

  2. Thanks Jerry for your comment. In our case even correcting SSH did not solve the problem. I discuss this in http://askdba.org/weblog/?p=121 . We were told by the Oracle development team to disable IPv6, as they suspected it to be the issue. But I was not able to implement this, as I moved on from that site before we could get approval from the management team to change the setting from IPv6 to IPv4.

    So anyone facing this issue can also explore the possibility of disabling IPv6 and enabling IPv4.

  3. Hi Jerry,

    We also got the same reply from the Oracle team. They asked us to disable IPv6 and enable IPv4, but it requires downtime, so we can't implement it. Amit, did you check the suggestion given by the Oracle development team?

  4. Hi Amit,

    Is it mandatory to stop any service or resource in a RAC database before we drop the existing Database Control configuration using the following command
    emca -x

    and recreate the Database Control configuration using the following command?

    emca -r -c | emca -r -a | emca -r -c -a

  5. Hi Jeegar,

    It is not necessary to stop any resource or service in RAC to drop the existing DB Control.
    The commands to drop DB Control can be run without performing any additional action.

    Cheers!!!
    Saurabh Sood

  6. Hi,
    I know the last reply was a long time ago, but I got this error when I tried to verify node connectivity with cluvfy. In my case the error was caused by a missing entry in my known_hosts file.
    When I tried to connect from one node to another using ssh, ssh asked me to add the host's fingerprint to the local file. This ssh prompt caused cluvfy to fail.

  7. At times cluvfy might report unsuccessful if the banner is enabled. Can you please disable the banner and check again?
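
The known_hosts situation described in comment 6 can be spotted ahead of time. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed. Its -F option searches a known_hosts file for a hostname and lists any entries found, so empty output is treated here as "missing". The names missing_known_hosts and LOOKUP_CMD are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of cluvfy or any Oracle tooling.
# "ssh-keygen -F <host>" prints the matching known_hosts entries, if any.
LOOKUP_CMD="${LOOKUP_CMD:-ssh-keygen -F}"

missing_known_hosts() {
    # Arguments: node names to look up in the current user's known_hosts.
    # Prints each node for which no entry was found.
    for node in "$@"; do
        if [ -z "$($LOOKUP_CMD "$node" 2>/dev/null)" ]; then
            echo "$node"
        fi
    done
}
```

Run as oracle on each node, for example missing_known_hosts PROD01 PROD02. Any node it prints would trigger the fingerprint prompt on first connection. Connecting to that node once by hand and accepting the fingerprint, as the comment describes, adds the missing entry.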
