New Features

11g New Feature: Database Replay

Database Replay:

In critical production systems, before making changes to the environment( Hardware or software related), lots of testing is required in test systems to know the effect of those changes and this usually takes lots of time to test for the stability and effects of the change.

In Oracle 11g this thing is simplified and a new feature “database replay” is added which can be used to simulate the production workload on a test system.

Following are the three stages of DATABASE REPLAY:
1. Workload Capture
2. Workload Preprocessing
3. Workload Replay

1. Workload Capture:

Before starting the capture process you need to take care of following things:

Login as sys user

It is always recommened (not mandatory) to shutdown the database and then start the database in restricted mode. By doing so, you will be making sure that no transactions are running on the database before capture process is actually started. It will guarantee the complete workload capture and you will be running full transactions during database replay on test system.
If you do not follow this step then you have to be prepared for some divergance of results during database replay on test system.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down
SQL> startup restrict
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1299624 bytes
Variable Size             184552280 bytes
Database Buffers          121634816 bytes
Redo Buffers                6373376 bytes
Database mounted.
Database opened.
SQL > select LOGINS from v$instance;
LOGINS
----------
RESTRICTED

Important thing to note here is that, after starting the capture process the database will automatically be switched into UNRESTRICTED mode and the users will be able to connect normally.

After capturing the initial database information, the capture process will internally fire a statement to disable restricted sessions, this can be verified by setting 10046 trace before starting the capture process as:

SQL> alter session set events '10046 trace name context forever, level 4';

Now start the capture process.

SQL> alter session set events '10046 trace name context off';

You will see following information:

PARSING IN CURSOR #21 len=39 dep=1 uid=0 oct=49 lid=0 tim=121695902847 hv=14656808 ad='0' sqlid='7shphjbg23h8'
alter system disable restricted session
END OF STMT
PARSE #21:c=0,e=6,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=0,tim=121695384843

Now check the information from database:

SQL > select LOGINS from v$instance;
LOGINS
----------
ALLOWED

– Create a directory to hold the workload information.

SQL > create directory DIR as '/tmp/ss';

Check the permission of directory ss at OS level and make sure that oracle is able to write to this directory.
Also make sure that the directory is empty, before starting the capture process.

– Define Workload Filters
By adding filters we can restrict the Workload Capture process to be specific to certain user sessions. By default all the sessions will be captured.

There are two types of filters i.e INCLUDE and EXCLUDE.

If INCLUDE is used then the filter will be active and workload for the objects mentioned in the filter will not be captured rather all other user’s workload Will be captured i.e the objects defined in the filter will be FILTERED OUT from capture process.

If EXCLUDE is used, only the workload done by objects defined in the filter will be captured, no other client request to database will be captured.

EXAMPLE:
========

SQL> exec dbms_workload_capture.ADD_FILTER( fname  IN VARCHAR2, fattribute  IN VARCHAR2,fvalue IN VARCHAR2);

fname=Name of the filter.
fattribute=Attribute on which the filter will be applied i.e USER, PROGRAM, INSTANCE_NUMBER etc.
fvalue= value for the corresponding attribute.

SQL > exec dbms_workload_capture.ADD_FILTER( fname =>'FILTER_SCOTT',fattribute => 'USER',fvalue => 'SCOTT');

– Starting workload capture:

To start the workload capture START_CAPTURE procedure is used as:

SQL > exec dbms_workload_capture.start_capture(NAME => 'PEAK_LOAD_TEST',DIR => 'DIR', DURATION => 600);

Name,Dir are mandatory parameters while duration (in seconds) is optional, here the capture will run for 600 seconds, if the duration is not specified then you have to finish the capture manually as:

SQL > dbms_workload_capture.finish_capture();

– While the capture process is runnig it will generate two files : wcr_scapture.wmd and wcr_cap_000xx.start

– After using the finish_capture procedure you will get folloing files also: wcr_cr.html and wcr_cr.text

wcr_cr.html is similar to the AWR report generated by awrrpt.sql

During capture process database initiate a AWR report for the duration of capture period.

SQL>  select id,AWR_BEGIN_SNAP,AWR_END_SNAP from dba_workload_captures;

<span style="font-family: arial,helvetica,sans-serif; font-size: small;">ID AWR_BEGIN_SNAP AWR_END_SNAP
---------- -------------- ------------
1             20           21
2             29           30

– Exporting AWR data for the workload capture:

It will enable detailed analysis of the workload:

SQL> exec DBMS_WORKLOAD_CAPTURE.EXPORT_AWR (capture_id => 2);

After this you can see two more files generated under the specified directory:

wcr_ca.dmp and wcr_ca.log

2. Workload Preprocessing:

Once the workload is captured it can be replayed on the test system. For this you need to copy the capture files to the test system. Preprocessing a workload will convert the captured files into REPLAY files and create the necessary metadata for this. Preprocessing will put load on the system and is resource intensive and should be performed only on test systems and on the same version of oracle database.

– Create a new directory for the capture files placed on the test system.
Let us suppose that the capture files are placed under /tmp/ss on test system, now create a directory as:

SQL > create directory REPLAY as '/tmp/ss';

SQL>EXEC DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'REPLAY');

After preprocessing followings files will be generated:

wcr_process.wmd, wcr_login.pp, wcr_seq_data.extb, wcr_scn_order.extb , wcr_conn_data.extb

3. Workload Replay:

You have to use an instance which is logically identical to the instance whose workload is captured and need to setup Replay Client (WRC).

– wrc executable has to be run in calibrate mode to estimate the number of replay clients.

$  wrc mode=calibrate replaydir=/tmp/sood

Workload Replay Client: Release 11.1.0.6.0 - Production on Sat Jul 26 18:49:29 2008

Copyright (c) 1982, 2007, Oracle.  All rights reserved

Report for Workload in: /tmp/sood
-----------------------

Recommendation:
Consider using at least 1 clients divided among 1 CPU(s).

Workload Characteristics:
- max concurrency: 1 sessions
- total number of sessions: 5

Assumptions:
- 1 client process per 50 concurrent sessions
- 4 client process per CPU
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE

– Initialize Replay Data:


Initializing replay loads the metadata into tables required for database replay.

SQL>; exec DBMS_WORKLOAD_REPLAY.INITIALIZE_REPLAY (replay_name => 'REPLAY_1', REPLAY_DIR=> 'REPLAY');

– Put the database in “PREPARE REPLAY” mode as:

SQL > exec DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY()

To check the status of replay use:

SQL > select name,status from  dba_workload_replays;
NAME                 STATUS
-------------------- ----------------------------------------
REPLAY_1             PREPARE

— Start the replay client:

Workload Replay Client: Release 11.1.0.6.0 - Production on Sat Jul 26 22:12:35 2008

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Wait for the replay to start (22:12:35)

– Start the replay as:

SQL> exec DBMS_WORKLOAD_REPLAY.START_REPLAY ();

Check the status of the replay as:

SQL> select name,status from  dba_workload_replays;

Initially it will show :
NAME                 STATUS
-------------------- ----------------------------------------
REPLAY_1             IN PROGRESS

And after that it will be shown as completed.

– Once the status is completed finish the replay as:

SQL> exec DBMS_WORKLOAD_REPLAY.CANCEL_REPLAY ();

To generate the report of the replay in HTML format you need to run :

DECLARE
My_Report  CLOB;
BEGIN
My_Report := DBMS_WORKLOAD_REPLAY.REPORT(replay_id => 1, format => DBMS_WORKLOAD_REPLAY.TYPE_HTML);
END;

11G:SYSASM role

Starting from Oracle 11g ASM, new privilege has been created for Administering Oracle ASM instances. This is called SYSASM role.

You can continue using SYSDBA role to connect to ASM but it will generate following warning messages at time of startup/shutdown, create Diskgroup/add disk, etc.

<strong>$ adrci</strong>

ADRCI: Release 11.1.0.6.0 - Beta on Tue Jul 1 15:43:57 2008
Copyright (c) 1982, 2007, Oracle.  All rights reserved.

ADR base = "/u03/app/oracle"
adrci&gt; help
adrci&gt; set homes
DIA-48431: Must specify at least one ADR home path

<strong>adrci&gt; show homes</strong>
ADR Homes:
diag/rdbms/testdb11/TESTDB11
diag/asm/+asm/+ASM
diag/clients/user_oracle/host_2900411789_11
diag/tnslsnr/testzone/listener
<strong>adrci&gt; set home diag/asm/+asm/+ASM
adrci&gt; show alert</strong>

ADR Home = /u03/app/oracle/diag/asm/+asm/+ASM:
******************************************
Starting ORACLE instance (normal)
<strong>WARNING: Deprecated privilege SYSDBA for command 'STARTUP'</strong>
2008-06-25 13:13:09.287000 +05:30

Please note that SYSASM privilege cannot be used to start the RDBMS instance. This way it ensures separation of database administration privilege from the ASM storage administration privilege. Trying to do so results in ORA – 1031

$ sqlplus
SQL*Plus: Release 11.1.0.6.0 - Production on Tue Jul 1 16:09:54 2008
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Enter user-name: / as sysasm
Connected to an idle instance.

SQL&gt; startup pfile=initest.ora
<strong>ORA-01031: insufficient privileges</strong>

As per documentation, the privilege to administer an ASM instance with SYSDBA will be removed in future versions. Anyways till now DBA’s can administer the ASM instance πŸ˜‰

New ASM Background Processes in 11G

Few hours back I installed Oracle Database 11g(Though still DB is yet to be created), so I started playing with the ASM instance. First thing I did was to check ASM alert.log. I used ADRCI (11g) to see it

adrci&gt; show incident

ADR Home = /u03/app/oracle/diag/asm/+asm/+ASM:
*********************************************************
0 rows fetched

adrci&gt; show alert

ADR Home = /u03/app/oracle/diag/asm/+asm/+ASM:
**********************************************************
Output the results to file: /tmp/alert_9572_1_+ASM_1.ado
"/tmp/alert_9572_1_+ASM_1.ado" 48 lines, 1964 characters
PMON started with pid=2, OS id=3672
DIAG started with pid=4, OS id=3678
<strong>VKTM started with pid=3, OS id=3674
VKTM running at (100ms) precision</strong>
2008-06-24 15:24:12.425000 +05:30
PSP0 started with pid=5, OS id=3680
<strong>DSKM started with pid=6, OS id=3682</strong>
<strong>DIA0 started with pid=7, OS id=3684</strong>
MMAN started with pid=6, OS id=3686
DBW0 started with pid=8, OS id=3689
LGWR started with pid=9, OS id=3691
CKPT started with pid=10, OS id=3694
SMON started with pid=11, OS id=3700
RBAL started with pid=12, OS id=3702
GMON started with pid=13, OS id=3705
ORACLE_BASE from environment = /u03/app/oracle
<strong>Spfile /u03/app/oracle/product/11.1.0/db_1/dbs/spfile+ASM.ora is in old pre-11 format and compatible &gt;= 11.0.0; converting to  new H.A.R.D. compliant format.</strong>

I have highlighted the things which were not present in 10g. According to Docs

DIA0 (diagnosability process 0) (only 0 is currently being used) is responsible for hang detection and deadlock resolution.

VKTM (virtual keeper of time) is responsible for providing a wall-clock time (updated every second) and reference-time counter (updated every 20 ms and available only when running at elevated priority)

These were the definitions from docs. Oracle should have been more generous and also documented following

DIA0 – Does that mean we will have auto SystemState/Hanganalyze generated during hang? Will ORA-60 be handled by this process?

VKTM – What does this mean to us? Will this timer be used in 10046 timing information? Will it ensure Oracle Scheduler run jobs on time πŸ™‚

DSKM – This is still not documented.

Last message kind of indicates that spfile is also made H.A.R.D complaint. So it should take care of corruptions.

I have created more confusion rather than explaining what these processes actually do πŸ™‚ Anyways if you have any information, then it will be really nice if it can be shared…Thanks for Reading !!

Understanding of Oracle 11g Automatic Diagnostic Repository – Part 1

INTRODUCTION

In an endeavour to move towards self managed database and provide better support solutions, Oracle 11g introduces another new feature in 11g called the Automatic Diagnostic Repository or ADR. As the name implies ADR is a automatically managed file based repository of diagnostic information such as trace files, core dumps, alert log, etc… ADR is like a mini database managing traces of multiple instances and multiple prouducts (database, ASM, CRS, etc…) with metadata of these traces being maintained under a unified directory structure. ADR promises to use consistent diagnostic data structures across products. Set of tools bundled with ADR facilitate us in analyzing diagnostic data.

As an outcome of this feature:
– Critical errors captured in the diagnostic repository show up as alerts in EM. In cases where the issue is a known issue, Oracle recommends corrective actions to be implemented by DBA.
– It is intended to make it easy for DBAs to consolidate and transfer diagnostic data to Oracle support, thus facilitating faster resolution of the issue.

ARCHITECTURE

Starting with Oracle 11g R1, traditional *_DUMP_DEST parameters are ignored and the trace location is decided based on DIAGNOSTIC_DEST parameter. If this parameter is not set explicitly, it defaults to ORACLE_BASE environment variable if this is set, if this is not set, then it defaults to ORACLE_HOME/log.

DIAGNOSTIC_DEST is the ADR root directory also known as ADR base.

Each instance of each product stores diagnostic data in its own ADR home directory. For example, in a Real Application Clusters (RAC) environment with shared storage and ASM, each database instance and each ASM instance have a home directory within ADR.

Under ADR base are multiple ADR homes. ADR home is the root directory for all traces within an instance of a product. i.e. Each instance of a product has an ADR home. Directroy structure of ADR is pictorial represented as below:

Under the ADR base there will be a folder “diag”. Within this folder there will be seperate folders for each product like rdbsm, asm, crs, clients, lsnrctl, netcman, etc… Each of these folders will have seperate folders for every instance of the product.

For an instance of Oracle database, the structure is as depicted. All traces that used to be within USER_DUMP_DEST and BACKGROUND_DUMP_DEST are with trace folder. There is a seperate folder for alert log. A copy of the alert log in XML format is mainted in the alert folder.

Problem and Incident are two terms used frequently in ADR. All errors like the ora-600, ora-7445 or any other error like ora-470 that crashes database instance are termed as critical error. A critical error is termed as a Problem and is given a problem id by ADR based on it’s symptoms. Each occurrence of a critical error is termed as an Incident.

Incident directory contains multiple subdirectories, where each subdirectory is named for a particular incident, and where each contains dumps pertaining only to that incident.

Metadata directory contains ADR’s metadata.

HM directory contains health monitor reports which are part of ADR.

ADRCI – ADR command interpreter and EM support work bench are the two interfaces to ADR.

We will discuss these tools in our next post.