Amit Bansal

11G: OCR Manual Backups

OCR is a file that manages cluster and Oracle Real Application Clusters (Oracle RAC) database configuration information. The OCR contains information about the cluster node list, instance-to-node mapping information, and information about Oracle Clusterware resource profiles for applications

This is a Very important component along with Voting Disk and Loss of it will require you to re-install the Clusterware Software.

Oracle 10g provides Automatic Backup of OCR along with manually export of OCR files.

Starting from Oracle 11g, we can also take manual backups of OCR. So in 11g , we can have following backups

1) Automatic backups—Oracle Clusterware automatically creates OCR backups every four hours. At any one time, Oracle always retains the last three backup copies of the OCR. The CRSD process that creates the backups also creates and retains an OCR backup for each full day and at the end of each week. You cannot customize the backup frequencies or the number of files that Oracle retains.

We can view the backups using ocrconfig -showbackup auto command.

[oracle@prod01]/home/oracle>ocrconfig -showbackup auto
prod01     2008/05/14 19:23:22     /u01/app/11.1.0/crs/cdata/prod_cluster/backup00.ocr
prod01     2008/05/14 15:23:21     /u01/app/11.1.0/crs/cdata/prod_cluster/backup01.ocr
prod01     2008/05/14 11:23:21     /u01/app/11.1.0/crs/cdata/prod_cluster/backup02.ocr
prod01     2008/05/13 15:23:20     /u01/app/11.1.0/crs/cdata/prod_cluster/day.ocr
prod01     2008/05/04 01:52:33     /u01/app/11.1.0/crs/cdata/prod_cluster/week.ocr

2)Manual backups—You can use the ocrconfig -manualbackup command to force Oracle Clusterware to perform a backup of the OCR at any time, rather than wait for the automatic backup that occurs at 4-hour intervals. The -manualbackup option is especially useful when you to need to obtain a binary backup on demand, such as before you make changes to the OCR.

You need to be logged in as root user to take manual backups. Use ocrconfig –manualbackup command. To view the backups use ocrconfig –showbackup manual command


[oracle@prod01]/home/oracle>ocrconfig -showbackup manual
prod01     2008/05/01 12:21:29     /u01/app/11.1.0/crs/cdata/prod_cluster/backup_20080501_122129.ocr
prod02     2008/04/11 11:32:50     /u01/app/11.1.0/crs/cdata/prod_cluster/backup_20080411_113250.ocr
prod02     2008/03/27 17:25:23     /u01/app/11.1.0/crs/cdata/prod_cluster/backup_20080327_172523.ocr

3)Manual Export—You can Use the ocrconfig -export file_name command to export the OCR content to a file format.

For more details refer to 11g Documentation

ORA-07445 [opidsa()+480] – Metalink Alert for 10.2.0.3

Oracle has reported a new Alert for user’s who have upgraded their databases to  10.2.0.3 patchset on windows.

User processes dumps core dumps with following errors

ORA-07445: exception encountered: core dump [<strong>opidsa()</strong>+480]
[SIGSEGV] [Address not mapped to object] [0x000000000] [] []

or

ORA-07445: exception encountered: core dump [ACCESS_VIOLATION]
 [_opidsa+360] [PC:0x2080540] [ADDR:0x0] [UNABLE_TO_READ] []
 

Patch 5648872 can be applied to resolve the issue. For more information, refer to
Metalink Note:418531.1-ORA-7445[opidsa] after Applying the 10.2.0.3 Patch Set

This issue is fixed in 10.2.0.4 Patchset 

Significance of LD_LIBRARY_PATH

Few days back, I wanted to remove a instance semaphore as they were not cleared automatically due to abnormal shutdown. As there were few other databases running on the server , so I thought of capturing the semaphore id to remove them manually using “ipcrm” .
For this purpose, we are required to use sysresv utility. Upon execution of the command, I received following errors

$/u01/app/oracle/product/10.2.0/bin>sysresv
sysresv: error while loading shared libraries: libclntsh.so.10.1: cannot open shared object file: No such file or directory

Now this was a new installation so I was bit scared if there was something wrong with the installation. Searching the web pointed me to articles which mentioned about using user which is not in correct group.
But in my case it was properly set and part of dba/oinstall group. Further research led me to Puschitz webpage (Very useful site if you are planning to install Database)which described that this problem is caused when environment variable LD_LIBRARY_PATH is not set.
So I set the value of LD_LIBRARY_PATH and then tried again

<strong>$</strong>export LD_LIBRARY_PATH=$ORACLE_HOME/lib
<strong>$</strong>sysresv
IPC Resources for ORACLE_SID "orcl10" :
Shared Memory:
ID KEY
131074 0x0291b1b8
Semaphores:
ID KEY
229377 0xc116e83c
262146 0xc116e83d
294915 0xc116e83e
Oracle Instance alive for sid "orcl10"

This worked !!!
My issue was solved but I must admit that I was not aware of what this variable did. I had seen this Environment variable set in my previous company, but somehow I didn’t give importance to it.
Searching on web, I came across Oracle Forum thread
which stated that
“The LD_LIBRARY_PATH environment variable contains a colon-separated list of directories in which the linker will search for library (.a) files. If you want the linker to search additional libraries, you can add their names to LD_LIBRARY_PATH.”

One more article gave insight into how application looks for library files.

http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWdev/LLM/p4.html
The runtime linker only looks in one default location for dependencies. This location is /usr/lib when processing 32-bit objects, and /usr/lib/64 when processing 64-bit objects. All other directories to be searched must be added to the runtime linker’s search path explicitly.
On searching metalink for further information on this parameter, I came across

Subject: Solaris: Setting LD_LIBRARY_PATH in 10gR1 and higher
Doc ID: Note:271451.1

According to this note, there are two more variables LD_LIBRARY_PATH_32 and LD_LIBRARY_PATH_64 which can be set to point to 32-Bit and 64-Bit libraries.
Further it mentioned that it is Not Recommended to set LD_LIBRARY_PATH unless application (Oracle or 3rd-party) requires it and recommended using LD_LIBRARY_PATH_32 and LD_LIBRARY_PATH_64.
This is done so that 32-Bit programs can use the correct libraries and similarly 64-bit programs can use 64-Bit library files.

So next time, you receive some library error, then do not panic 🙂 and see if parameter LD_LIBRARY_PATH or LD_LIBRARY_PATH_32 or LD_LIBRARY_PATH_64 is set.

Note:- In case you are using non-oracle user and see that you are getting same error after setting the variable,then check if the file ibclntsh.so.10.1exists and non-oracle user’s have read/execute permission over it. If no then relink Oracle using relink all command

ASM Disk Discovery

While creating ASM diskgroup or adding new disk to a existing diskgroup, Disk should be visible in V$ASM_DISK.

ASM discovers and examines the contents of all of the disks that are in the paths that you designated with values in the ASM_DISKSTRING initialization parameter.

As per Oracle Docs, Disk discovery also occurs when you:

– Run the ALTER DISKGROUP…ADD DISK and ALTER DISKGROUP…RESIZE DISK commands

– Query the V$ASM_DISKGROUP and V$ASM_DISK views

Note: – You should try to use V$ASM_DISK_STAT to get faster results as access to this view does not lead to Disk Discovery.

While creating ASM Diskgroup, Disk Discovery is the most common issue. This is mostly due to Disk permission issue or incorrect setting for ASM_DISKSTRING parameter.

So as to check this , you can use KFOD utility provided by Oracle.

This can be found in $ASM_HOME/bin directory. Help can be seen using

 oracle@asm]/home/oracle&gt; kfod help=y

<span style="font-size:small;"><span style="font-family:Times New Roman;">_asm_a/llow_only_raw_disks<span>              </span>KFOD allow only raw devices [_asm_allow_</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">only_raw_disks=TRUE/(FALSE)]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">_asm_l/ibraries<span>         </span>ASM Libraries[_asm_libraries='lib1','lib2',...]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">_asms/id<span>                </span>ASM Instance[_asmsid=sid]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">a/sm_diskstring<span>         </span>ASM Diskstring [asm_diskstring='discoverystring', 'disco</span></span><span style="font-size:small;"><span style="font-family:Times New Roman;">verystring' ...]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">d/isks<span>          </span>Disks to discover [disks=raw,asm,all]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">g/roup<span>          </span>Group discover [group=controlfile]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">n/ohdr<span>          </span>KFOD header suppression [nohdr=TRUE/(FALSE)]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">o/p<span>             </span>KFOD options type [OP=DISKS/GROUPS/ALL]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">p/file<span>          </span>ASM parameter file [pfile='parameterfile']</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">s/tatus<span>    </span><span>     </span>Include disk header status [status=TRUE/(FALSE)]</span></span>

<span style="font-size:small;"><span style="font-family:Times New Roman;">v/erbose<span>                </span>KFOD verbose errors [verbose=TRUE/(FALSE)]</span></span>

<span style="font-size:small;font-family:Times New Roman;">KFOD-01000: file not found</span>

 

To discover the disks , you need to use disks=all clause

 

[oracle@asm] /home/oracle&gt;kfod disks=all</span>
kfod disks=all
--------------------------------------------------------------------------------
 Disk          Size Path
==================================================================
   1:     345648 Mb /dev/sda7
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
==================================================================
     +ASM2 /u01/app/oracle/product/asm10.2
     +ASM1 /u01/app/oracle/product/asm10.2
    

As seen from help menu, we can use raw and asm as option. One more important parameter which can be used is asm_diskstring to make sure that problem is not with initialization parameter.

[oracle@asm]/home/oracle>kfod asm_diskstring='/raw/*' disks=all
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
====================================================================
     +ASM2 /u01/app/oracle/product/asm10.2
     +ASM1 /u01/app/oracle/product/asm10.2

By default asm_diskstring looks in all the directories.

Application Design and ORA – 4031

Much has been written on designing good application (which uses bind variables) so as to have better performance and avoid shared pool memory issues. Tom Kyte explains this on one of his post consequences of not using bind variables

If you do not use bind variables and you flood the server with
hundreds/thousands of unique queries you will
-run dog slow
-consume a
ton of RAM (and maybe run out)
-not scale beyond a handful of users, if
that
among other really bad side effects.

This is very true and have observed it myself many times. But today we will see one more kind of bad application design which I came across working on a ORA – 4031 issue.

Environment was already using CURSOR_SHARING=SIMILAR (application developers were not ready to change code, so it was Kind of quick fix) to force bind variables but we were still observing ORA – 4031 errors. Query on V$SGASTAT gave us following results

http://askoracledba.wordpress.com/2008/04/27/application-design-and-ora-4031/sgastat/

Out of 4 gb allocated to shared pool, around 2.4 G is being used by sql area and library cache indicating that sql are consuming lot of memory in shared pool.

Whenever I observe high memory usage for Sqlarea, I run following sql statements.

1) select sa.sql_text,sa.version_count ,ss.*from v$sqlarea sa,v$sql_shared_cursor ss where sa.address=ss.KGLHDPAR and sa.version_count > 50 order by sa.version_count ;

– This one is to find if child cursors are being generated.

2) Use script from Asktom website to find if application is using Bind variables. You can find ,more information by clicking here

create table t1 as select sql_text from v$sqlarea;
alter table t1 add sql_text_wo_constants varchar2(1000);
create or replace function
remove_constants( p_query in varchar2 ) return varchar2
as
    l_query long;
    l_char  varchar2(1);
    l_in_quotes boolean default FALSE;
begin
    for i in 1 .. length( p_query )
    loop
        l_char := substr(p_query,i,1);
        if ( l_char = '''' and l_in_quotes )
        then
            l_in_quotes := FALSE;
        elsif ( l_char = '''' and NOT l_in_quotes )
        then
            l_in_quotes := TRUE;
            l_query := l_query  '''#';
        end if;
        if ( NOT l_in_quotes ) then
            l_query := l_query  l_char;
        end if;
    end loop;
l_query := translate( l_query, '0123456789', '@@@@@@@@@@' );
    for i in 0 .. 8 loop
        l_query := replace( l_query, lpad('@',10-i,'@'), '@' );
        l_query := replace( l_query, lpad(' ',10-i,' '), ' ' );
    end loop;
    return upper(l_query);
end;
/
update t1 set sql_text_wo_constants = remove_constants(sql_text);

select sql_text_wo_constants, count(*)
  from t1
 group by sql_text_wo_constants
having count(*) > 100
 order by 2
/

On executing first statement , there were no rows returned. But on executing second query , got following results

SELECT NVL(SUM(FEE),:"SYS_B_@") FROM (SELECT NVL(SUM(FREECALLFEE)+SUM(SALARY)+
SUM(SALARY@)+SUM(FREEINFOFEE),:"SYS_B_@") FEE
FROM SCOTT.A_BL_C@_@_F_@ WHERE MSISDN=:MSISDN UNION ALL
SELECT NVL(SUM(COMM),:"SYS_B_@") FEE FROM SCOTT.T_BL_SM_F_@ WHERE MSISDN=:MSISDN )
150

SELECT NVL(SUM(FEE),:"SYS_B_@") FROM (SELECT NVL(SUM(FREECALLFEE)+
SUM(SALARY)+SUM(SALARY@)+SUM(FREEINFOFEE),:"SYS_B_@") FEE FROM
SCOTT.A_BL_C@_@_G_@ WHERE MSISDN=:MSISDN UNION ALL SELECT NVL(SUM(COMM),:"SYS_B_@") FEE
FROM SCOTT.T_BL_SM_G_@ WHERE MSISDN=:MSISDN )
150

SELECT NVL(SUM(FEE),:"SYS_B_@") FROM (SELECT NVL(SUM(FREECALLFEE)+SUM(SALARY)+
SUM(SALARY@)+SUM(FREEINFOFEE),:"SYS_B_@") FEE FROM
SCOTT.A_BL_C@_@_H_@ WHERE MSISDN=:MSISDN UNION ALL SELECT
NVL(SUM(COMM),:"SYS_B_@") FEE FROM SCOTT.T_BL_SM_H_@ WHERE MSISDN=:MSISDN )

150

TRUNCATE TABLE SCOTT.A_BL_C@_@_L_@
310

TRUNCATE TABLE SCOTT.A_BL_C@_@_M_@
310

TRUNCATE TABLE SCOTT.A_BL_C@_@_P_@
310

Ideally this should not have returned any query as we are forcing bind variables.
But looking closely we find that there around 620tables with name of format SCOTT.A_BL_C@_@_M_@ and SCOTT.A_BL_C@_@_P_@(basically @ will be actually a Numeric or a literal) which is cause for so many versions for sql. Application is running set of 20 queries on these tables , so in total there are 20 X620 queries being run , leading to high usage of shared pool memory.

Apart from that we see that all these tables are being truncated .This actually invalidates the cursors and causes a hard parse along with causing shared pool fragmentation

Based on our findings, we gave following recommendation which could be performed to alleviate (if not resolve) the issue.

– Have the shared pool set to 4 Gb itself
– Turn these 620tables into two paritioned tables. This will avoid multiple sql’s in shared pool. But this can have impact on the time taken for query to run and had to be tested
– Instead of truncate tables, use delete operation so that we do not invalidate the cursor. But again catch here is that we will have more redo generation and if data being loaded uses append operation, size for these tables will grow.

As you see this is basically Application design issue and made a application which is not scalable and also leading to downtime.

Deciding “PROCESSES” parameter value

Many times we observe following error which corresponds to low setting of processes parameter.

$ oerr ora 20

00020, 00000, “maximum number of processes (%s) exceeded”

// *Cause: All process state objects are in use.

// *Action: Increase the value of the PROCESSES initialization parameter.




This is mostly seen in new database installation as the processes parameter is not taken into account.

According to Oracle® Database Reference 10g Release 2 (10.2) Part Number B14237-03

PROCESSES specifies the maximum number of operating system user processes that can simultaneously connect to Oracle. Its value should allow for all background processes such as locks, job queue processes, and parallel execution processes.

The default values of the SESSIONS and TRANSACTIONS parameters are derived from this parameter. Therefore, if you change the value of PROCESSES, you should evaluate whether to adjust the values of those derived parameters.




This nrings us to a question that “Can we set up a large value for this parameter so that user’s are not affected and we do not have to restart the system (PROCESSES is a static parameter) ?”

Answer is NO.



You need to calculate ideal value for the processes parameter by taking care of Default Background processes (Also looking at number of Archiver processes, parallel_max_servers) and total number of user connections.

This is required as otherwise you will be required to change the value of SEMMNS parameter (for UNIX only) i.e Max number of semaphores in system and along with this , it will increase the “Overhead” Memory in SGA.

Find below explanation for the above points.

1)SEMMNS parameter

Each Oracle instance needs to have a set amount of semaphores. The total amount of semaphores required is derived from the ‘processes’ parameter specified inOracle instance init.ora file. As more instances and/or databases are added the OS kernel parameter SEMMNS will need to be adjusted accordingly.

According to Metalink Note:153961.1



In Oracle7 , number of semaphores required by an instance is equal to ‘processes’ parameter in the init.ora for the instance.For Oracle8, Oracle8i, Oracle9i and Oracle10g, number of semaphores required by an instance is equal to 2 times the setting of the ‘processes’ parameter in the init.ora for the instance.

However, Oracle only momentarily grabs 2 X ‘processes’ and then releases half at instance startup.




2)Overhead Memory



Parameters like db_files, open_cursors and processes contribute to the variable part of SGA. This can be seen by issuing “Show SGA” command. You will see that “Variable Size” will be more then sum of “Shared Pool + Large Pool + Java Pool”. This is attibuted to the value of these parameters.

Please note that in case you specify a low value for SGA_MAX_SIZE, you will see Oracle bumping the value to higher value so as to accomodate high value of Overhead memory.

Staring from 10g, Overhead memory is accomodated in shared_pool_size.

e.g If you specify SHARED_POOL_SIZE as 200 MB and your internal overhead is 100 Mb, then your actual shared pool value available to instance is only 100Mb.

You can read Note:351018.1 – Minimum for SHARED_POOL_SIZE Parameter in 10.2 Version for more information.