Checking column usage information using dbms_stats

Oracle 11gR2 DBMS_STATS introduced useful function report_col_usage to report column usage i.e if column is being used for equality,like predicates. I came to know it while reading Optimizer whitepaper .As per document, DBMS_STATS.REPORT_COL_USAGE reports column usage information and records all the SQL operations the database has processed for a given object.

This information is used by Oracle gather stats job to decide whether to collect histograms and also number of buckets to be used.So if we specify method_opt =>’FOR ALL COLUMS SIZE AUTO’, oracle stats job will make use of this information. A column is a candidate for a histogram if it has been seen in a where clause predicate, e.g., an equality, range, LIKE, etc.Column usage tracking is enabled by default.

Apart from this, I see two more benefits

a)It can be used to see if all indexed column are being used. If not we can get rid of any extra indexes.

b) In case you are using method_opt =>’FOR ALL INDEXED COLUMNS SIZE AUTO’ in your stats collection script, it can tell additional columns which are being used in where clause. This is really important as this method_opt option does not capture statistics on non-indexed columns. Result can be wrong cardinality estimates and choice of wrong join methods. You can read detailed explanation on Greg Rahn’s post

To see this action I am using EMP,DEPT,SALGRADE table. You can find the scripts for creating these tables on scribd .

Since these are newly created tables, there was no column usage stats. Running the function confirms same

SELECT DBMS_STATS.REPORT_COL_USAGE('AMIT','EMP') FROM DUAL;
LEGEND:
 .......
EQ : Used in single table EQuality predicate
 RANGE : Used in single table RANGE predicate
 LIKE : Used in single table LIKE predicate
 NULL : Used in single table is (not) NULL predicate
 EQ_JOIN : Used in EQuality JOIN predicate
 NONEQ_JOIN : Used in NON EQuality JOIN predicate
 FILTER : Used in single table FILTER predicate
 JOIN : Used in JOIN predicate
 GROUP_BY : Used in GROUP BY expression
 ...............................................................................
###############################################################################
COLUMN USAGE REPORT FOR AMIT.EMP
 ................................
###############################################################################

I ran following set of queries once

select * from emp where empno=7782;
 select ename,job,dname from emp,dept where emp.deptno=dept.deptno;
 select e.ename,e.sal,s.grade from emp e,salgrade s where e.sal between s.losal and s.hisal;

Querying again gives following output now

###############################################################################
COLUMN USAGE REPORT FOR AMIT.EMP
 ................................
1. DEPTNO : EQ_JOIN
 2. EMPNO : EQ
 3. SAL : NONEQ_JOIN
 ###############################################################################

Ran two queries with like and range predicates

select * from emp where ename like 'A%';
 select * from emp where sal >3000;

You can see that ENAME column shows that we have like in query predicate and for SAL a range comparison has been made.

###############################################################################
COLUMN USAGE REPORT FOR AMIT.EMP
 ................................
1. DEPTNO : EQ_JOIN
 2. EMPNO : EQ
 3. ENAME : LIKE
 4. SAL : RANGE NONEQ_JOIN
 ###############################################################################

You can reset the column usage information using following procedure

exec dbms_stats.reset_col_usage('AMIT','EMP')

Note that this deletes the recorded column usage information from dictionary which can have impact on Stats job, so be careful while you are deleting it. There are two more procedures which can help to seed column usage information from other database.

SEED_COL_USAGE – This procedure iterates over the SQL statements in the specified SQL tuning set, compiles them and seeds column usage information for the columns that appear in these statements.

Syntax
DBMS_STATS.SEED_COL_USAGE (sqlset_name IN VARCHAR2, owner_name IN VARCHAR2, time_limit IN POSITIVE DEFAULT NULL);

MERGE_COL_USAGE – This procedure merges column usage information from a source database by means of a dblink into the local database. If column usage information already exists for a given table or column MERGE_COL_USAGE will combine both the local and the remote information.

Syntax
DBMS_STATS.MERGE_COL_USAGE (dblink IN VARCHAR2);
Tags: , ,

2 thoughts on “Checking column usage information using dbms_stats”

  • Hi Amit,

    Thanks for sharing this information. Its really helpful. I have a doubt regarding reset_col_usage where actually as said it deletes the recorded column usage, but then doesn’t it start restoring again.I tried, but its wasn’t recording back so bounced the test database and then it started recording back.

    • I tried it on 11.2.0.3 db on linux x86-64 and it worked for me for all except equality predicates.
      Then i changed the literal being passed i.e
      select * from emp where empno=7782;
      changed to
      select * from emp where empno=7499;

      COLUMN USAGE REPORT FOR AMIT.EMP
      …………………………..

      1. DEPTNO : EQ_JOIN
      2. EMPNO : EQ
      3. SAL : NONEQ_JOIN

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.