partitioning

dbms_stats.copy_table_stats does not alter low/high value

I was working on a performance issue arising due to missing table stats on a new partition for large Partitioned table.This was a 10.2.0.4 database on IBM AIX 5L. So as to resolve the issue , I used dbms_stats.copy_table_stats to copy the stats to new parition from the previous partition.

EXEC DBMS_STATS.COPY_TABLE_STATS (‘owner’, ‘tabname’, ‘SourcePart’, ‘TargetPart’)

You can find more details about it in below link

http://optimizermagic.blogspot.com/2009/02/maintaining-statistics-on-large.html

There is one more link explaining this with example

http://dioncho.wordpress.com/2009/02/12/how-to-copy-partition-stats-we-got-easier-way/

Article from Optimizer development group states that the low and high value for the target partition is set from the partition key values. But when I used dbms_stats.copy_table_stats, I found that it is not the case and it actually copies the low and high values from the previous partition.

You can use following function to check the high and low values ( Thanks to Greg Rahn’s article )

create or replace function display_raw (rawval raw, type varchar2)
return varchar2
is
   cn     number;
   cv     varchar2(32);
   cd     date;
   cnv    nvarchar2(32);
   cr     rowid;
   cc     char(32);
begin
   if (type = 'NUMBER') then
      dbms_stats.convert_raw_value(rawval, cn);
      return to_char(cn);
   elsif (type = 'VARCHAR2') then
      dbms_stats.convert_raw_value(rawval, cv);
      return to_char(cv);
   elsif (type = 'DATE') then
      dbms_stats.convert_raw_value(rawval, cd);
      return to_char(cd);
   elsif (type = 'NVARCHAR2') then
      dbms_stats.convert_raw_value(rawval, cnv);
      return to_char(cnv);
   elsif (type = 'ROWID') then
      dbms_stats.convert_raw_value(rawval, cr);
      return to_char(cnv);
   elsif (type = 'CHAR') then
      dbms_stats.convert_raw_value(rawval, cc);
      return to_char(cc);
   else
      return 'UNKNOWN DATATYPE';
   end if;
end;
/

alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
col DATA_TYPE for a20
col TABLE_NAME for a25
col low_val for a25
col high_val for a25
col PARTITION_NAME for a15
col COLUMN_NAME for a25
select a.table_name,
   a.column_name,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from
   dba_tab_col_statistics a, dba_tab_cols b
where
   a.table_name ='&tabname' and
   a.table_name=b.table_name and
   a.column_name=b.column_name and a.owner='&owner' and a.column_name='&colname'
/

select a.table_name,a.partition_name,
   a.column_name,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from
   dba_part_col_statistics a, dba_tab_cols b
where
   a.table_name ='&tab_name' and
   a.table_name=b.table_name and
   a.column_name=b.column_name and a.owner='&owner' and a.column_name='&colname' and a.partition_name='&part_name'
/

This is attributed to bug 8318020. You need to install one-off patch for same. This is also mentioned in the article from optimizer devlopment group. I am quoting the same

“It is necessary to install the one off patch for bug 8318020 if you are using copy_table_stats procedure or APPROX_GLOBAL option in 10.2.0.4 (patch 8413548) or in 11.1.0.7 (patch 8411968).”

This is important as CBO while calculating the selectivity uses the low/high value . In this case as the values are old (my case last month values), when you try to query with new month data , CBO will treat it as out of range predicate and can pick suboptimal plans.

e.g You can see below when you enable 10053 trace for the query, how the CBO has estimated a very low selectivity for the index (based on the partitioned column) and thus estimating a very low cost. Note that CBO will not immediately pick incorrect selectivity if it goes out of range as it tries to linearly deduce a value within range of density of that column and 0.5/num_rows. Below selectivity is actually the extreme value i.e 0.5/num_rows.

Column (#1): PART_DATE(DATE)
    AvgLen: 8.00 NDV: 21 Nulls: 0 Density: 0.047619 Min: 2454953 Max: 2454977
  Using prorated density: 8.8550e-09 of col #1 as selectivity of out-of-range value pred
Access Path: index (RangeScan)
    Index: IDX_HIST_TABLE
    resc_io: 5.00  resc_cpu: 34401
    ix_sel: 8.8980e-09  ix_sel_with_filters: 8.8980e-09
		Resp: 1.00  Degree: 1

As of now I have used dbms_stats.set_column_stats to set the values but I would need to apply the fix so as to use copy_table_stats which is much easier to use. Note that you will have to set the stats at partition name and also table level to alter the high value to contain the latest partition upper bound.

Update:-
Please find enclosed shell script ( combination of shell and pl/sql..can be better but I am still learning both 🙂 ) to copy index and table stats. It first uses dbms_stats.copy_table_stats and then alters table and partition’s low-high value based on the partition definition. It then copies stats for local indexes.
Inputs

1)Owner
2)Table Name
3)Source partition name
4) Target partition name
5) Column_name (Has to be column of date datatype on which table is partitioned)

One more requirement is that when you execute the script, you should have write privilege in directory where you have copied the script as this will create temp files. You can download the script here

#!/usr/bin/ksh

### Script to Set the stats for  a Range partitioned table which is partitioned on a Date Column#######
### Written by Amit Bansal ######
#### Please test in a test environment before executing in production database############

###### Environment Variables ########
echo "Enter the table owner for which stats need's to be copied : \t"
read l_tab_owner
echo "Enter the table name for which stats need's to be copied : \t"
read l_tab_name
echo "Enter the Source Partition_name from which stats need's to be copied  \t"
read l_src_part
echo "Enter the target Partition_name to which stats need's to be copied : \t"
read l_targ_part
echo "Enter the column_name for range partitioned table (date datatype only) for which stats need's to be copied : \t"
read l_col_type

###################End of User Input #############

echo "You have selected following options"
echo "Table owner \t" $l_tab_owner
echo "Table name \t" $l_tab_name
echo "Source partition_name \t" $l_src_part
echo "Target  partition_name \t" $l_targ_part
echo "Column name for range partititioned  table (date datatype) \t" $l_col_type
cat <<!

		Do you want to proceed Y/N  

!
read z
     case $z in
     Y|y) 
sqlplus -s /nolog <<EOF
conn / as sysdba
set head off
set feedb off
set trims on
set trim on
set long 10000
set longc 10000
spool ${l_src_part}
select HIGH_VALUE from dba_tab_partitions where table_name='${l_tab_name}' and table_owner='${l_tab_owner}' and partition_name in ('${l_src_part}');
spool ${l_targ_part}
select HIGH_VALUE from dba_tab_partitions where table_name='${l_tab_name}' and table_owner='${l_tab_owner}' and partition_name in ('${l_targ_part}');
exit
EOF

vi $l_src_part.lst <<!
:%j!
:wq
!
vi $l_targ_part.lst<<!
:%j!
:wq
!
l_targ_start_dt=`cat $l_src_part.lst`
l_targ_end_dt=`cat  $l_targ_part.lst`

echo $l_targ_start_dt 
echo $l_targ_end_dt

sqlplus -s /nolog <<EOF
conn / as sysdba
exec dbms_stats.copy_table_stats('${l_tab_owner}','${l_tab_name}','${l_src_part}','${l_targ_part}');

set serveroutput on
 declare
 l_srec sys.dbms_stats.statrec;
 l_distcnt number;
 l_density number;
 l_nullcnt number;
 l_avgclen number;
 l_statown varchar2(80);
minvv DATE;
maxvv DATE;
dt date;
maxrv raw(32) := l_srec.maxval;
minrv raw(32) := l_srec.minval;
SREC SYS.DBMS_STATS.STATREC;
DATEVALS SYS.DBMS_STATS.DATEARRAY;
 begin
sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown);
--dbms_output.put_line('No of Distinct Values::::'||l_distcnt);
--dbms_output.put_line('Density              ::::'||l_density);
--dbms_output.put_line('NullCount            ::::'||l_nullcnt);
--dbms_output.put_line('Average Row Length   ::::'||l_avgclen);
--dbms_output.put_line('Stat Owner           ::::'||l_statown);
maxrv := l_srec.maxval;
minrv := l_srec.minval;
--dbms_output.put_line(maxrv) ;
dbms_stats.convert_raw_value(minrv,minvv) ;
dbms_stats.convert_raw_value(maxrv,maxvv) ;
dbms_output.put_line('PRE MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ;
dbms_output.put_line('PRE MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ;
SREC.EPC:=2;
SREC.EAVS:=NULL;
--DATEVALS:= SYS.DBMS_STATS.DATEARRAY(to_date('18-06-2000 00:00:00','DD-MM-YYYY HH24:MI:SS'),${l_targ_end_dt} -1/(24*60*60));
DATEVALS:= SYS.DBMS_STATS.DATEARRAY(minvv,${l_targ_end_dt} -1/(24*60*60));
SYS.DBMS_STATS.PREPARE_COLUMN_VALUES(SREC,DATEVALS);
SYS.DBMS_STATS.SET_COLUMN_STATS('${l_tab_owner}','${l_tab_name}','${l_col_type}',SREC=>SREC,NO_INVALIDATE =>false);
COMMIT;
 sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown);

maxrv := l_srec.maxval;
minrv := l_srec.minval;

dbms_stats.convert_raw_value(minrv,minvv) ;
dbms_stats.convert_raw_value(maxrv,maxvv) ;

dbms_output.put_line('POST MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ;
dbms_output.put_line('POST MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ;

END;
/

set serveroutput on
 declare
 l_srec sys.dbms_stats.statrec;
 l_distcnt number;
 l_density number;
 l_nullcnt number;
 l_avgclen number;
 l_statown varchar2(80);
minvv DATE;
maxvv DATE;
dt date;
maxrv raw(32) := l_srec.maxval;
minrv raw(32) := l_srec.minval;
SREC SYS.DBMS_STATS.STATREC;
DATEVALS SYS.DBMS_STATS.DATEARRAY;
 begin
 sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}','${l_src_part}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown);
--dbms_output.put_line('No of Distinct Values::::'||l_distcnt);
--dbms_output.put_line('Density              ::::'||l_density);
--dbms_output.put_line('NullCount            ::::'||l_nullcnt);
--dbms_output.put_line('Average Row Length   ::::'||l_avgclen);
--dbms_output.put_line('Stat Owner           ::::'||l_statown);
maxrv := l_srec.maxval;
minrv := l_srec.minval;
--dbms_output.put_line(maxrv) ;
dbms_stats.convert_raw_value(minrv,minvv) ;
dbms_stats.convert_raw_value(maxrv,maxvv) ;
dbms_output.put_line('PRE MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ;
dbms_output.put_line('PRE MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ;
SREC.EPC:=2;
SREC.EAVS:=NULL;
--DATEVALS:= SYS.DBMS_STATS.DATEARRAY(to_date('01-10-2009 00:00:00','DD-MM-YYYY HH24:MI:SS'),to_date('30-11-2009 23:59:59','DD-MM-YYYY HH24:MI:SS'));
DATEVALS:= SYS.DBMS_STATS.DATEARRAY(${l_targ_start_dt},${l_targ_end_dt} -1/(24*60*60));
SYS.DBMS_STATS.PREPARE_COLUMN_VALUES(SREC,DATEVALS);
SYS.DBMS_STATS.SET_COLUMN_STATS('${l_tab_owner}','${l_tab_name}','${l_col_type}','${l_targ_part}',SREC=>SREC,NO_INVALIDATE =>false);

COMMIT;

 sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}','${l_targ_part}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown);

maxrv := l_srec.maxval;
minrv := l_srec.minval;

dbms_stats.convert_raw_value(minrv,minvv) ;
dbms_stats.convert_raw_value(maxrv,maxvv) ;

dbms_output.put_line('POST MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ;
dbms_output.put_line('POST MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ;

END;
/

declare
 l_source_part varchar2(30);
 l_target_part varchar2(30);
n_numrows number;
n_numlblks number;
n_numdist number;
n_avglblk number;
n_avgdblk number;
n_clstfct number;
n_indlevel number;
n_guessq number;
cursor ind_list is select index_name from dba_part_indexes where table_name='${l_tab_name}' and owner='${l_tab_owner}' and LOCALITY='LOCAL';
ind_list_rec ind_list%ROWTYPE;
begin
open ind_list;
LOOP
FETCH ind_list into ind_list_rec;
EXIT WHEN ind_list%NOTFOUND;
select partition_name into l_source_part from dba_ind_partitions where index_name=ind_list_rec.index_name and index_owner='${l_tab_owner}' and partition_name like '%${l_src_part}';
select partition_name into l_target_part from dba_ind_partitions where index_name=ind_list_rec.index_name and index_owner='${l_tab_owner}' and partition_name like '%${l_targ_part}';
dbms_stats.get_index_stats
        (
          ownname      => '${l_tab_owner}',
          indname      => ind_list_rec.index_name,
          partname     => l_source_part,
          numrows      => n_numrows,
          numlblks     => n_numlblks,
          numdist      => n_numdist,
          avglblk      => n_avglblk,
          avgdblk      => n_avgdblk,
          clstfct      => n_clstfct,
          indlevel     => n_indlevel,
  guessq       => n_guessq  );
dbms_output.put_line('.');
     dbms_output.put_line('Setting Index Stats for index '||ind_list_rec.index_name||' partition '||l_target_part|| ' to following values');
	 dbms_output.put_line('----------------------------------------------------------------');
      dbms_output.put_line('.     NumRows = ' || n_numrows );
      dbms_output.put_line('.    NumLBlks = ' || n_numlblks);
      dbms_output.put_line('.   NumDstnct = ' || n_numdist);
      dbms_output.put_line('.     AvgLBlk = ' || n_avglblk);
      dbms_output.put_line('.     AvgDBlk = ' || n_avgdblk);
      dbms_output.put_line('.     ClstFct = ' || n_clstfct);
      dbms_output.put_line('.    IndLevel = ' || n_indlevel);
      dbms_output.put_line('.   GuessQual = ' || n_guessq);
      dbms_output.put_line('.');
			  dbms_stats.set_index_stats
        (
          ownname      => '${l_tab_owner}',
          indname      => ind_list_rec.index_name,
          partname     => l_target_part,
          numrows       => n_numrows,
          numlblks      => n_numlblks,
          numdist       => n_numdist,
          avglblk       => n_avglblk,
          avgdblk       => n_avgdblk,
          clstfct       => n_clstfct,
          indlevel      => n_indlevel,
          no_invalidate => false,
  guessq       => n_guessq  );
  end loop;
  close ind_list;
end;
/
exit
EOF
;;
N|n) echo "Exiting Script"
;;

*) sleep 1
;;

esac

Hope this helps