10g

Automatic Storage Management (ASM) FAQ

We have tried to consolidate list of Frequently Asked Questions (FAQ) on ASM. Please feel free to ask any other question or contribute to this FAQ by using comment section

Frequently Asked Question on ASM

1) What is ASM?

Automatic storage Management (ASM) is a new type of filesystem which was introduced with Oracle 10g. ASM is recommended filesystem for RAC and Single instance ASM for storing database files. This provides direct I/O to the file and performance is comparable with that provided by RAW Devices. Oracle creates a separate instance for this purpose.

2) How do we identify if we are connected to Normal Instance or ASM instance?

Issue following command to identify this

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">
SQL&gt; show parameter instance_type</span>

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
instance_type string asm</span>

In case you are connected to ASM instance, it will display value as asm, otherwise it will display value as RDBMS.

3) What are Diskgroups and Failuregroups?

Diskgroup is a terminology used for logical structure which holds the database files. Each Diskgroup consists of Disks/Raw devices where the files are actually stored. Any ASM file is completely contained within a single disk group. However, a disk group might contain files belonging to several databases and a single database can use files from multiple disk groups.

Failuregroups are used when using Normal/High Redundancy. They contain the mirrored ASM extents and must be containing different disks and preferably on separate disk controller.

4)Can ASM be used as replacement for RAID?

ASM is supposed to stripe the data and also mirror the data (if Using Normal, High Redundancy). So this can be used as a alternative for RAID 0+1 solutions

5) How does ASM provides Redundancy?

When you create a disk group, you specify an ASM disk group type based on one of the following three redundancy levels:

  • Normal for 2-way mirroring – When ASM allocates an extent for a normal redundancy file, ASM allocates a primary copy and a secondary copy. ASM chooses the disk on which to store the secondary copy in a different failure group other than the primary copy.
  • High for 3-way mirroring. In this case the extent is mirrored across 3 disks.
  • External to not use ASM mirroring. This is used if you are using Third party Redundancy mechanism like RAID, Storage arrays.

6) Can we change the Redundancy for Diskgroup after its creation.

No, we cannot modify the redundancy for Diskgroup once it has been created. To alter it we will be required to create a new Diskgroup and move the files to it. This can also be done by restoring full backup on the new Diskgroup. Following metalink note describes the steps

Note.438580.1 – How To Move The Database To Different Diskgroup (Change Diskgroup Redundancy)

7) I am unable to open the ASM instance. What is the reason?

ASM instance does not have open stage. It has got only two options

  • Nomount- This starts the ASM instance
  • Mount- At this stage, Diskgroup defined in ASM_DISKGROUPS parameter are mounted

When you try to open the ASM instance , you get following error

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">SQL&gt; alter database open;
alter database open
*
ERROR at line 1:
ORA-15000: command disallowed by current instance type</span>

8)Can ASM instance and database (rdbms) be on different servers?

ASM instance and Database (rdbms) have to be present on same server. Otherwise it will not work.


9)
Can we see the files stored in the ASM instance using standard unix commands.

No, you cannot see the files using standard unix commands like ls. You need to use utility called asmcmd to do this. This is present in 10.2 and above.e.g
/home/oracle>asmcmd
Asmcmd>

You can use help command to see the options.

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">Note: - You can use asmcmd for 10.1 database also. For this you can copy the </span><span style="font-size: small; font-family: arial,helvetica,sans-serif;">asmcmdcore and asmcmd file from 10.2 ORACLE_HOME to 10.1 ORACLE_HOME. Else you can download the files from Metalink </span><span style="font-size: small; font-family: arial,helvetica,sans-serif;"><strong>Note:332853.1</strong></span>

10) What kind of files can be stored on ASM Diskgroup.

You can store the following file types in ASM disk groups:

  • Control files
  • Datafiles, temporary datafiles, and datafile copies
  • SPFILEs
  • Online redo logs, archive logs, and Flashback logs
  • RMAN backups
  • Disaster recovery configurations
  • Change tracking bitmaps
  • Data Pump dumpsets

Note: Oracle executables and ASCII files, such as alert logs and trace files, cannot be stored in ASM disk groups.

11)Can we use ASM for storing Voting Disk/OCR in a RAC instance?

No. You cannot use ASM for storing the voting disk and OCR. It is due to the fact that Clusterware starts before ASM instance and it should be able to access these files which is not possible if you are storing it on ASM. You will have to use raw devices or OCFS or any other shared storage.

12)Does ASM instance automatically rebalances and takes care of hot spots?

No. This is a myth and ASM does not do it. It will initiate automatic rebalance only when a new disk is added to Diskgroup or we drop a disk from existing Diskgroup.

13) What is ASMLIB?


ASMLIB is the support library for the ASM. ASMLIB allows an Oracle database using ASM more efficient and capable access to diskgroups. The purpose of ASMLIB, is to provide an alternative interface to identify and access block devices. Additionally, the ASMLIB API enables storage and operating system vendors to supply extended storage-related features. You can read more about ASMLIB in

Note: 359266.1 – FAQ ASMLIB CONFIGURE, VERIFY, TROUBLESHOOT

ASMLIB for linux can be downloaded from following link

http://www.oracle.com/technology/tech/linux/asmlib/index.html

14) What is SYSASM role?
Starting from Oracle 11g, SYSASM role can be used to administer the ASM instances. You can continue using SYSDBA role to connect to ASM but it will generate following warning messages at time of startup/shutdown, create Diskgroup/add disk ,etc

<span style="font-size: small; font-family: arial,helvetica,sans-serif;">Alert entry
WARNING: Deprecated privilege SYSDBA for command 'STARTUP'</span>

15) How can we copy the files from/to ASM?

You can use RMAN or DBMS_FILE_TRANSFER.COPY_FILE procedure to copy the files to/from ASM from/to Filesystem. Starting from Oracle 11g, you can use cp command in asmcmd to perform the same between ASM Diskgroups and also to OS Filesystem.

16) Can we use BCV to clone the ASM Diskgroup on same host?

Diskgroup Cloning is not supported on the same host using BCV. You have no other option to use except RMAN DUPLICATE. You can find more detail on BCV and ASM in below whitepaper

http://www.oracle.com/technology/products/database/asm/pdf/asm-on-emc-5_3.pdf

17) Can we edit the ASM Disk header to change the Diskgroup Name?

No. This cannot be done. Please find details in

http://askdba.org/weblog/?p=128

18) Whats is Kfed?

kfed is a utility which can be used to view the ASM Disk information. Syntax for using it is

kfed read devicename

Read more about kfed here.

19) Can we use block devices for ASM Disks?

Yes. Starting from Oracle Database 10.2 block devices can be used directly for ASM Disks in Linux. This is not true for other Unix based systems where block devices are not supported yet.

Along with this it is recommended to use a Device mapping functionality so that disk mapping is preserved after disk failure. This is important when you have devices as /dev/sda,/dev/sdb,/dev/sdc and due to some reason the devices are not detected at next reboot (say /dev/sdb), the system will map the incorrect device (i.e /dev/sdc will be marked as /dev/sdb). You can use following methods for preserving disk names

-udev – the role of udev is to provide device persistency and naming consistency.This is especially important for the Oracle Cluster Registry (OCR) and Voting disks required by Oracle Clusterware.

ASMLIB – ASMLIB will provide device management specifically for ASM disk devices.

Refer to following Whitepaper for more details

http://www.oracle.com/technology/products/database/asm/pdf/device-mapper-udev-crs-asm%20rh4.pdf

20)Is it mandatory to use disks of same size and characteristics for Diskgroups?

No, it is not mandatory to use the disks of same size and characteristics for Diskgroups though it is a Recommended Practice.

Same size disk for Failuregroups in Normal/High redundancy will prevent issues like ORA-15041 as the file extents needs to be mirrored across the disks. Also as Oracle distributes data based on capacity, so larger disk will have more data stored in it and which will result in higher I/O to disk and eventually can lead to sub-optimal performance.

Moreover having disks of different characteristic like varying disk speed can impact the performance.

When managing disks with different size and performance capabilities, best practice is to group them into disk groups according to their characteristics. So you can use higher speed disks for your database files while other disks can be part of Diskgroup used for Flash Recovery Area.

21)Do we need to install ASM and Oracle Database Software in different ORACLE_HOME?

No. Again installing ASM and Oracle Database Software in different ORACLE_HOME is not mandatory but a best practice. This is useful in cases when we need to have multiple databases using same ASM instance and you need to patch only one of them. E.g You need to apply a CBO patch to one of 10.2 database while your other 10.1 database using different installation does not require it. In this case having a ASM_HOME separate from 10.2 ORACLE_HOME will allow your 10.1 database to keep running. Thus this approach is useful for High Availability.

22)What is the maximum size of Disk supported by ASM?

ASM supports disks upto 2Tb, so you need to ensure that lun size should be less then 2Tb. 10.2.0.4 and 11g database will give error if you try to create a diskgroup with ASM disks having disk size >2Tb.

23)I have created Oracle database using DBCA and having a different home for ASM and Oracle Database. I see that listener is running from ASM_HOME. Is it correct?

This is fine. When using different home for ASM, you need to run the listener from ASM_HOME instead of ORACLE_HOME.

Additional Links for Further Reading

http://www.oracle.com/technology/products/database/asm/index.html http://download.oracle.com/docs/cd/B28359_01/server.111/b31107/toc.htm

http://www.oracle.com/technology/products/database/asm/pdf/device-mapper-udev-crs-asm%20rh4.pdf
Note:359266.1 – FAQ ASMLIB CONFIGURE,VERIFY, TROUBLESHOOT
Note 266028.1 – ASM Using Files Instead of Real Devices on Linux
Note.438580.1 – How To Move The Database To Different Diskgroup (Change Diskgroup Redundancy)

High Database Sessions

Yesterday I was working on issue where our Monitoring system reported increased number of database connections. We were already aware of this issue and application team was asked to look into it.

On checking GV$RESOURCE_LIMIT, we found that it had touched the limit value on Instance 1 (since startup).

 SELECT INST_ID "ID",RESOURCE_NAME "RESOURCE",CURRENT_UTILIZATION
CURRENT",MAX_UTILIZATION "MAX",INITIAL_ALLOCATION "INITIAL",LIMIT_VALUE FROM
V$RESOURCE_LIMIT WHERE RESOURCE_NAME='sessions';

ID RESOURCE CURRENT MAX  INITIAL LIMIT_VALUE
--  -------   --------- ----- ----     -----------
1   sessions     232       335   335        335
2   sessions     245       314   335        335

This was quite surprising as we did not get any Critical tickets. (I think this is because polling setting for tool for this parameter was set to 10 minutes).

I checked the alert log for any ORA-18 errors on Instance 1, but could not find any error. I had faced this situation sometime back (with different client) when user complained about getting ORA-18 error but during that time too, there were no entries in Alert log. It could be that ORA-18 errors are not recorded (unlike ORA -20, number of processes exceeding errors).

Since it was not recorded in Alert log (nor any trace file), I turned to AWR to help me out.In this case view WRH$_RESOURCE_LIMIT was the one which we needed to look as it stores statistics captured using AWR snapshots.

SQL> col BEGIN_INTERVAL_TIME for a28
SQL> col END_INTERVAL_TIME for a28
SQL>col RESOURCE_NAME for a10
SELECT A.SNAP_ID,A.INSTANCE_NUMBER "ID",B.BEGIN_INTERVAL_TIME,B.END_INTERVAL_TIME,A.RESOURCE_NAME,
CURRENT_UTILIZATION "CURRENT",MAX_UTILIZATION "MAX"
FROM WRH$_RESOURCE_LIMIT A, WRM$_SNAPSHOT B
WHERE A.RESOURCE_NAME LIKE '%session%'
AND A.SNAP_ID=B.SNAP_ID
AND A.INSTANCE_NUMBER= B.INSTANCE_NUMBER
AND ((A.INSTANCE_NUMBER=1
AND B.BEGIN_INTERVAL_TIME > (SELECT STARTUP_TIME FROM GV$INSTANCE WHERE INSTANCE_NUMBER=1)) OR
(A.INSTANCE_NUMBER=2 AND B.BEGIN_INTERVAL_TIME >
(SELECT STARTUP_TIME FROM GV$INSTANCE WHERE INSTANCE_NUMBER=2)))
order by A.INSTANCE_NUMBER,A.SNAP_ID;

SNAP ID BEGIN_INTERVAL_TIME       END_INTERVAL_TIME          RESOURCE CUR   MAX
---- ---------- ---------------    ------------------------  --------- ---  ---

936  1 28-JUN-08 05.00.04.133 AM  28-JUN-08 06.00.14.342 AM  sessions  212  297
937  1 28-JUN-08 06.00.14.342 AM  28-JUN-08 07.00.54.865 AM  sessions  255  323
938  1 28-JUN-08 07.00.54.865 AM  28-JUN-08 08.00.18.080 AM  sessions  227  323
939  1 28-JUN-08 08.00.18.080 AM  28-JUN-08 09.00.29.881 AM  sessions  101  335

936  2 28-JUN-08 05.00.03.797 AM  28-JUN-08 06.00.14.117 AM  sessions  233  292
937  2 28-JUN-08 06.00.14.117 AM  28-JUN-08 07.00.54.521 AM  sessions  247  310
938  2 28-JUN-08 07.00.54.521 AM  28-JUN-08 08.00.17.725 AM  sessions  253  314
939  2 28-JUN-08 08.00.17.725 AM  28-JUN-08 09.00.29.648 AM  sessions  101  314

We cannot get the exact time, but we can see that number of Database sessions reached 335 sessions on Instance 1 between 28-JUN-08 08.00.18.080 AM AND 28-JUN-08 09.00.29.881

Similarly for Instance 2, number of Database sessions reached to value of 314 between 28-JUN-08 07.00.54.521 AM AND 28-JUN-08 08.00.17.725 AM

This information was sufficient at this time to ask Application support team to check from their end. Though it would have been really nice to find the exact time. Also finding which users were having most sessions would have also helped to point the module.

ORA-01426 After Upgrade to 10g

Oracle seems to have a tendency to make small changes to database code which impacts application in a big way. Sad part is, these changes are not always documented properly. One of these cases is what I encountered. A simple PL/SQL code errors out as below:

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> declare
2 v_code NUMBER(20);
3 BEGIN
4 v_code := 990 * 10000000;
5 END;
6 /
declare
*
ERROR at line 1:
ORA-01426: numeric overflow
ORA-06512: at line 4

Same block works fine in any previous versions like 9i and 8i. As per Oracle this is “Expected Behavior”. Oracle used to perform number arithmetic. In 10g integer arithmetic is being used. Any application using such code would require to undergo change as follows:

SQL> ed
Wrote file afiedt.buf

1 declare
2 v_v1 number :=990;
3 v_v2 number :=10000000;
4 v_code NUMBER(20);
5 BEGIN
6 v_code := v_v1 * v_v2 ;
7* END;
SQL> /

PL/SQL procedure successfully completed.

SQL> ed
Wrote file afiedt.buf

1 declare
2 v_v1 number :=990;
3 v_code NUMBER(20);
4 BEGIN
5 v_code := v_v1 * 10000000 ;
6* END;
SQL> /

PL/SQL procedure successfully completed.

SQL> declare
2 v_code NUMBER(20);
3 BEGIN
4 v_code := 990 * 10000000.0;
5 END;
6 /

PL/SQL procedure successfully completed.

Using any of the above methods, error is not observed. It is although not clear why oracle made this change. Another hurdle in moving to 10g!!. Hope Oracle provides Database capture feature on 9i and Replay on 10g.

Can ASM DiskGroup Be Renamed?

This was actually a question on Oracle forum which I had replied to. Basically the Poster, wanted to know if he could rename the ASM Diskgroup name by renaming/editing ASM Disk header. He had also mentioned that he had heard about this being done by Oracle for its few customers using kfed.

Answer is NO. It is not possible to rename the diskgroup by editing the ASM disk header. kfed is known to be used for patching ASM disk headers for corruption (only oracle support can do it) and for viewing ASM header contents. Only way to change this by dropping and recreating the diskgroup.

In case you wish to create a new diskgroup with a name say +DG1 which was being used by a Diskgroup which is not mounted (Have some ASM Disk members still in ASM_DISKSTRING path), then you would face following error

    <strong>ORA-15030</strong>: diskgroup name "string" is in use by another diskgroup
    <strong>Cause:</strong> A CREATE DISKGROUP command specified a Diskgroup name that was already assigned to another diskgroup.
    <strong>Action: </strong>Select a different name for the Diskgroup.

In case you wish to create the Diskgroup with same name +DG1 you will be required to clear the ASM disk header using

dd if=/dev/zero of=/dev/raw/raw11 bs=1024 count=100

After this you can recreate the Diskgroup with same name.

One Poster suggested renaming at LUN/Storage level. I believed this to be a destructive idea which could corrupt the Diskgroup. nvengurl replied to this and informed that we read ASM disk header to mount the Diskgroup and thus changing the LUN name/id/path will not solve the issue.

Update: This article is valid for 10g and 11gR1. Since 11gR2 oracle has introduced renamedg utility which can be used to rename diskgroup. I have documented the steps here

Adding new ASM disk to RAC database fails

Many times i came across a common problem in RAC databases where trying to add an asm disk is not possible due to errors like

ORA-15075 “disk(s) are not visible cluster-wide”

ORA-15020 “discovered duplicate ASM disk “DISK1” and

ORA-15054 “disk “ORCL:DISK1” does not exist in diskgroup “DG1”.

Rebalancing the diskgroup and trying to add the disk with “FORCE” option also does not help in this case.

I will be discussing how to come out of a situation like this i.e When you are trying to add an asm disk in cluster environment and it says that disk is already added and when trying to drop the same disk it says that disk is not present in the diskgroup.

Lets start from the very begining:
I have decided to add an asm disk in RAC environment to an already existing diskgroup DATA1.

Login to asm instance “/ as sysdba”
SQL > ALTER DISKGROUP DATA1 ADD DISK ‘/dev/rdsk/c1t2d3s4’;

But it failed with following error:

ALTER DISKGROUP DATA1 ADD DISK '/dev/rdsk/c1t2d3s4';*
ERROR at line 1:
<strong>ORA-15032: not all alterations performed
ORA-15075: disk(s) are not visible cluster-wide</strong>

This is due to the fact that the physical disk partition is not visible from all RAC nodes. Then i contacted the sysadmins to make sure that the disk is visible from all RAC nodes and accessible by ORACLE. They have fixed the problem and now the disk /dev/rdsk/c1t2d3s4 can be seen from all RAC nodes. Then i tried to add the disk again using force option as:

SQL > ALTER DISKGROUP DATA1 ADD DISK ‘/dev/rdsk/c1t2d3s4’ force;
But it failed with following error:

ORA-15020: discovered duplicate ASM disk “/dev/rdsk/c1t2d3s4”

It shows that disk with same name is already present in the diskgroup.

As it shows that the disk is already present in the diskgroup, while trying to drop the disk i got following error:

SQL&gt; alter diskgroup DATA1 drop disk '/dev/rdsk/c1t2d3s4';
alter diskgroup DATA1 drop disk '/dev/rdsk/c1t2d3s4'
*
ERROR at line 1:
<strong>ORA-15032 : not all alterations performed
ORA-15054 : disk "/dev/rdsk/c1t2d3s4" does not exist in diskgroup "DATA1"</strong>

Now I cannot move further as adding and dropping the disk is not possible here. Then I decided to check the status of the disk from v$asm_disk from all RAC nodes, to do this issue following query:

SQL > col name format a15
SQL > col path format a20
SQL > select GROUP_NUMBER,DISK_NUMBER,MOUNT_STATUS,HEADER_STATUS,NAME,PATH from v$asm_disk;

We Obain following results from all the nodes :

G# D# HEADER_STATU MOUNT_S STATE NAME PATH
—- —- ———— ——- ——– ———— ————————-
0 0 MEMBER IGNORED NORMAL /dev/rdsk/c1t2d3s4

Header_status=MEMBER means that the disk is a valid asm disk on all RAC nodes.
Mount_status=IGNORED means that Disk is present in the system, but is ignored by ASM.

Group_number=0 This is the number used when a disk is not mounted by a diskgroup.

Now by checking the dd output of the disk as :

$dd if=/dev/rdsk/c1t2d3s4 of=/tmp/disk.out bs=4096 count=1096

$ vi /tmp/disk.out

I found that the diskgroup name and disk number allocated to this disk, which confirms that the disk is now a part of diskgroup DATA1.

But from the results of the header_status,mount_status and group_number it is clear that the disk is partially added to RAC asm instances. To correct this we will have to clear the disk header to add it again:

# dd if=/dev/null of=/dev/rdsk/c1t2d3s4 bs=4096 count=5000

This command cleared the disk header and after that disk was added successfully.

Note: – Please note that using dd will clear the ASM header and should be used only after confirming the disk. Using it on a wrong disk can cause Diskgroup to dismount and lead to Data Loss.

Transparent Application Failover – TAF

Transparent Application Failover (TAF) is a client-side feature that allows for clients to reconnect to surviving databases in the event of a failure of a database instance.

Note that this is not used for purpose of load balancing or for Connect Time Failover.

 

TAF operates in two modes

 – Session Failover which will recreate lost connections and sessions

– Select Failover which will replay queries that were in progress. It will discard the old rows which have been fetched earlier and will fetch the rest of rows.

 TAF can be implemented at

 – Client Side

Server Side

 Client Side

 This can be done by creating a entry in tnsnames.ora as follows

 

TESTDB10_basic=
(DESCRIPTION=
<strong>(LOAD_BALANCE=on)
 (FAILOVER=on) </strong>
(ADDRESS= (PROTOCOL=tcp) HOST=prod01-vip)(PORT=1521))
(ADDRESS= (PROTOCOL=tcp)(HOST= prod02-vip)PORT=1521))
(CONNECT_DATA=
 (SERVICE_NAME=TESTDB10)
<strong>(FAILOVER_MODE= (TYPE=select)
(METHOD=basic</strong>))))

Here FAILOVER_MODE parameter is used to implement TAF.

Above configuration is for connections which will allow new connections to be created at failover time. This is determined by parameter METHOD which is set to BASIC here.

You can also have a session established before. This helps as new connection can take time and will thus result in faster failover. To do this use

TESTDB_prod01=
(DESCRIPTION=
(LOAD_BALANCE=on)
(FAILOVER=on)
(ADDRESS= (PROTOCOL=tcp) HOST=prod01-vip)(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=TESTDB)
(INSTANCE_NAME=TESTDB1)
(FAILOVER_MODE= <strong>(BACKUP= TESTDB_prod02)
(TYPE=select) (METHOD=PRECONNECT)</strong>)))

 TESTDB_prod02=
(DESCRIPTION=
(LOAD_BALANCE=on)
 (FAILOVER=on)
 (ADDRESS= (PROTOCOL=tcp) HOST=prod01-vip)(PORT=1521))
 (CONNECT_DATA=
 (SERVICE_NAME=TESTDB10)
(INSTANCE_NAME=TESTDB2)
 (FAILOVER_MODE=
 <strong>(BACKUP= TESTDB_prod01)
(TYPE=select)
(METHOD=PRECONNECT)</strong>)))

Following query can be used to monitor information about failed over sessions

SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)
FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

Server Side Configuration

 

This can be done using server-side service attributes. Please note that in case both (Client side and server side) configuration are used then service-side settings will be used.

 

You can refer to following note for details

 

Note 404644.1 – Configuration of Transparent Application Failover(TAF) works with server side service

Features of TAF

 – Sessions executing any insert/update/delete statement fail then statement will be rollback.

A command which has been successfully completed upon failure and has changed the database state, then TAF does not resend the command.

When using select failover, we will retrieve only the rows which have not been fetched by earlier sessions.