I was working on a performance issue arising due to missing table stats on a new partition for large Partitioned table.This was a 10.2.0.4 database on IBM AIX 5L. So as to resolve the issue , I used dbms_stats.copy_table_stats to copy the stats to new parition from the previous partition.
EXEC DBMS_STATS.COPY_TABLE_STATS (‘owner’, ‘tabname’, ‘SourcePart’, ‘TargetPart’)
You can find more details about it in below link
http://optimizermagic.blogspot.com/2009/02/maintaining-statistics-on-large.html
There is one more link explaining this with example
http://dioncho.wordpress.com/2009/02/12/how-to-copy-partition-stats-we-got-easier-way/
Article from Optimizer development group states that the low and high value for the target partition is set from the partition key values. But when I used dbms_stats.copy_table_stats, I found that it is not the case and it actually copies the low and high values from the previous partition.
You can use following function to check the high and low values ( Thanks to Greg Rahn’s article )
create or replace function display_raw (rawval raw, type varchar2) return varchar2 is cn number; cv varchar2(32); cd date; cnv nvarchar2(32); cr rowid; cc char(32); begin if (type = 'NUMBER') then dbms_stats.convert_raw_value(rawval, cn); return to_char(cn); elsif (type = 'VARCHAR2') then dbms_stats.convert_raw_value(rawval, cv); return to_char(cv); elsif (type = 'DATE') then dbms_stats.convert_raw_value(rawval, cd); return to_char(cd); elsif (type = 'NVARCHAR2') then dbms_stats.convert_raw_value(rawval, cnv); return to_char(cnv); elsif (type = 'ROWID') then dbms_stats.convert_raw_value(rawval, cr); return to_char(cnv); elsif (type = 'CHAR') then dbms_stats.convert_raw_value(rawval, cc); return to_char(cc); else return 'UNKNOWN DATATYPE'; end if; end; / alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS'; col DATA_TYPE for a20 col TABLE_NAME for a25 col low_val for a25 col high_val for a25 col PARTITION_NAME for a15 col COLUMN_NAME for a25 select a.table_name, a.column_name, display_raw(a.low_value,b.data_type) as low_val, display_raw(a.high_value,b.data_type) as high_val, b.data_type from dba_tab_col_statistics a, dba_tab_cols b where a.table_name ='&tabname' and a.table_name=b.table_name and a.column_name=b.column_name and a.owner='&owner' and a.column_name='&colname' / select a.table_name,a.partition_name, a.column_name, display_raw(a.low_value,b.data_type) as low_val, display_raw(a.high_value,b.data_type) as high_val, b.data_type from dba_part_col_statistics a, dba_tab_cols b where a.table_name ='&tab_name' and a.table_name=b.table_name and a.column_name=b.column_name and a.owner='&owner' and a.column_name='&colname' and a.partition_name='&part_name' /
This is attributed to bug 8318020. You need to install one-off patch for same. This is also mentioned in the article from optimizer devlopment group. I am quoting the same
“It is necessary to install the one off patch for bug 8318020 if you are using copy_table_stats procedure or APPROX_GLOBAL option in 10.2.0.4 (patch 8413548) or in 11.1.0.7 (patch 8411968).”
This is important as CBO while calculating the selectivity uses the low/high value . In this case as the values are old (my case last month values), when you try to query with new month data , CBO will treat it as out of range predicate and can pick suboptimal plans.
e.g You can see below when you enable 10053 trace for the query, how the CBO has estimated a very low selectivity for the index (based on the partitioned column) and thus estimating a very low cost. Note that CBO will not immediately pick incorrect selectivity if it goes out of range as it tries to linearly deduce a value within range of density of that column and 0.5/num_rows. Below selectivity is actually the extreme value i.e 0.5/num_rows.
Column (#1): PART_DATE(DATE) AvgLen: 8.00 NDV: 21 Nulls: 0 Density: 0.047619 Min: 2454953 Max: 2454977 Using prorated density: 8.8550e-09 of col #1 as selectivity of out-of-range value pred Access Path: index (RangeScan) Index: IDX_HIST_TABLE resc_io: 5.00 resc_cpu: 34401 ix_sel: 8.8980e-09 ix_sel_with_filters: 8.8980e-09 Resp: 1.00 Degree: 1
As of now I have used dbms_stats.set_column_stats to set the values but I would need to apply the fix so as to use copy_table_stats which is much easier to use. Note that you will have to set the stats at partition name and also table level to alter the high value to contain the latest partition upper bound.
Update:-
Please find enclosed shell script ( combination of shell and pl/sql..can be better but I am still learning both 🙂 ) to copy index and table stats. It first uses dbms_stats.copy_table_stats and then alters table and partition’s low-high value based on the partition definition. It then copies stats for local indexes.
Inputs
1)Owner
2)Table Name
3)Source partition name
4) Target partition name
5) Column_name (Has to be column of date datatype on which table is partitioned)
One more requirement is that when you execute the script, you should have write privilege in directory where you have copied the script as this will create temp files. You can download the script here
#!/usr/bin/ksh ### Script to Set the stats for a Range partitioned table which is partitioned on a Date Column####### ### Written by Amit Bansal ###### #### Please test in a test environment before executing in production database############ ###### Environment Variables ######## echo "Enter the table owner for which stats need's to be copied : \t" read l_tab_owner echo "Enter the table name for which stats need's to be copied : \t" read l_tab_name echo "Enter the Source Partition_name from which stats need's to be copied \t" read l_src_part echo "Enter the target Partition_name to which stats need's to be copied : \t" read l_targ_part echo "Enter the column_name for range partitioned table (date datatype only) for which stats need's to be copied : \t" read l_col_type ###################End of User Input ############# echo "You have selected following options" echo "Table owner \t" $l_tab_owner echo "Table name \t" $l_tab_name echo "Source partition_name \t" $l_src_part echo "Target partition_name \t" $l_targ_part echo "Column name for range partititioned table (date datatype) \t" $l_col_type cat <<! Do you want to proceed Y/N ! read z case $z in Y|y) sqlplus -s /nolog <<EOF conn / as sysdba set head off set feedb off set trims on set trim on set long 10000 set longc 10000 spool ${l_src_part} select HIGH_VALUE from dba_tab_partitions where table_name='${l_tab_name}' and table_owner='${l_tab_owner}' and partition_name in ('${l_src_part}'); spool ${l_targ_part} select HIGH_VALUE from dba_tab_partitions where table_name='${l_tab_name}' and table_owner='${l_tab_owner}' and partition_name in ('${l_targ_part}'); exit EOF vi $l_src_part.lst <<! :%j! :wq ! vi $l_targ_part.lst<<! :%j! :wq ! l_targ_start_dt=`cat $l_src_part.lst` l_targ_end_dt=`cat $l_targ_part.lst` echo $l_targ_start_dt echo $l_targ_end_dt sqlplus -s /nolog <<EOF conn / as sysdba exec dbms_stats.copy_table_stats('${l_tab_owner}','${l_tab_name}','${l_src_part}','${l_targ_part}'); set serveroutput on declare l_srec sys.dbms_stats.statrec; l_distcnt number; l_density number; l_nullcnt number; l_avgclen number; l_statown varchar2(80); minvv DATE; maxvv DATE; dt date; maxrv raw(32) := l_srec.maxval; minrv raw(32) := l_srec.minval; SREC SYS.DBMS_STATS.STATREC; DATEVALS SYS.DBMS_STATS.DATEARRAY; begin sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown); --dbms_output.put_line('No of Distinct Values::::'||l_distcnt); --dbms_output.put_line('Density ::::'||l_density); --dbms_output.put_line('NullCount ::::'||l_nullcnt); --dbms_output.put_line('Average Row Length ::::'||l_avgclen); --dbms_output.put_line('Stat Owner ::::'||l_statown); maxrv := l_srec.maxval; minrv := l_srec.minval; --dbms_output.put_line(maxrv) ; dbms_stats.convert_raw_value(minrv,minvv) ; dbms_stats.convert_raw_value(maxrv,maxvv) ; dbms_output.put_line('PRE MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ; dbms_output.put_line('PRE MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ; SREC.EPC:=2; SREC.EAVS:=NULL; --DATEVALS:= SYS.DBMS_STATS.DATEARRAY(to_date('18-06-2000 00:00:00','DD-MM-YYYY HH24:MI:SS'),${l_targ_end_dt} -1/(24*60*60)); DATEVALS:= SYS.DBMS_STATS.DATEARRAY(minvv,${l_targ_end_dt} -1/(24*60*60)); SYS.DBMS_STATS.PREPARE_COLUMN_VALUES(SREC,DATEVALS); SYS.DBMS_STATS.SET_COLUMN_STATS('${l_tab_owner}','${l_tab_name}','${l_col_type}',SREC=>SREC,NO_INVALIDATE =>false); COMMIT; sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown); maxrv := l_srec.maxval; minrv := l_srec.minval; dbms_stats.convert_raw_value(minrv,minvv) ; dbms_stats.convert_raw_value(maxrv,maxvv) ; dbms_output.put_line('POST MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ; dbms_output.put_line('POST MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ; END; / set serveroutput on declare l_srec sys.dbms_stats.statrec; l_distcnt number; l_density number; l_nullcnt number; l_avgclen number; l_statown varchar2(80); minvv DATE; maxvv DATE; dt date; maxrv raw(32) := l_srec.maxval; minrv raw(32) := l_srec.minval; SREC SYS.DBMS_STATS.STATREC; DATEVALS SYS.DBMS_STATS.DATEARRAY; begin sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}','${l_src_part}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown); --dbms_output.put_line('No of Distinct Values::::'||l_distcnt); --dbms_output.put_line('Density ::::'||l_density); --dbms_output.put_line('NullCount ::::'||l_nullcnt); --dbms_output.put_line('Average Row Length ::::'||l_avgclen); --dbms_output.put_line('Stat Owner ::::'||l_statown); maxrv := l_srec.maxval; minrv := l_srec.minval; --dbms_output.put_line(maxrv) ; dbms_stats.convert_raw_value(minrv,minvv) ; dbms_stats.convert_raw_value(maxrv,maxvv) ; dbms_output.put_line('PRE MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ; dbms_output.put_line('PRE MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ; SREC.EPC:=2; SREC.EAVS:=NULL; --DATEVALS:= SYS.DBMS_STATS.DATEARRAY(to_date('01-10-2009 00:00:00','DD-MM-YYYY HH24:MI:SS'),to_date('30-11-2009 23:59:59','DD-MM-YYYY HH24:MI:SS')); DATEVALS:= SYS.DBMS_STATS.DATEARRAY(${l_targ_start_dt},${l_targ_end_dt} -1/(24*60*60)); SYS.DBMS_STATS.PREPARE_COLUMN_VALUES(SREC,DATEVALS); SYS.DBMS_STATS.SET_COLUMN_STATS('${l_tab_owner}','${l_tab_name}','${l_col_type}','${l_targ_part}',SREC=>SREC,NO_INVALIDATE =>false); COMMIT; sys.dbms_stats.get_column_stats('${l_tab_owner}','${l_tab_name}','${l_col_type}','${l_targ_part}',distcnt=>l_distcnt,density=>l_density,nullcnt=>l_nullcnt,srec=>l_srec,avgclen=>l_avgclen,statown=>l_statown); maxrv := l_srec.maxval; minrv := l_srec.minval; dbms_stats.convert_raw_value(minrv,minvv) ; dbms_stats.convert_raw_value(maxrv,maxvv) ; dbms_output.put_line('POST MAX VALUE ::'||TO_CHAR(maxvv,'DD-MM-RRRR HH24:MI:SS')) ; dbms_output.put_line('POST MIN VALUE ::'||TO_CHAR(minvv,'DD-MM-RRRR HH24:MI:SS')) ; END; / declare l_source_part varchar2(30); l_target_part varchar2(30); n_numrows number; n_numlblks number; n_numdist number; n_avglblk number; n_avgdblk number; n_clstfct number; n_indlevel number; n_guessq number; cursor ind_list is select index_name from dba_part_indexes where table_name='${l_tab_name}' and owner='${l_tab_owner}' and LOCALITY='LOCAL'; ind_list_rec ind_list%ROWTYPE; begin open ind_list; LOOP FETCH ind_list into ind_list_rec; EXIT WHEN ind_list%NOTFOUND; select partition_name into l_source_part from dba_ind_partitions where index_name=ind_list_rec.index_name and index_owner='${l_tab_owner}' and partition_name like '%${l_src_part}'; select partition_name into l_target_part from dba_ind_partitions where index_name=ind_list_rec.index_name and index_owner='${l_tab_owner}' and partition_name like '%${l_targ_part}'; dbms_stats.get_index_stats ( ownname => '${l_tab_owner}', indname => ind_list_rec.index_name, partname => l_source_part, numrows => n_numrows, numlblks => n_numlblks, numdist => n_numdist, avglblk => n_avglblk, avgdblk => n_avgdblk, clstfct => n_clstfct, indlevel => n_indlevel, guessq => n_guessq ); dbms_output.put_line('.'); dbms_output.put_line('Setting Index Stats for index '||ind_list_rec.index_name||' partition '||l_target_part|| ' to following values'); dbms_output.put_line('----------------------------------------------------------------'); dbms_output.put_line('. NumRows = ' || n_numrows ); dbms_output.put_line('. NumLBlks = ' || n_numlblks); dbms_output.put_line('. NumDstnct = ' || n_numdist); dbms_output.put_line('. AvgLBlk = ' || n_avglblk); dbms_output.put_line('. AvgDBlk = ' || n_avgdblk); dbms_output.put_line('. ClstFct = ' || n_clstfct); dbms_output.put_line('. IndLevel = ' || n_indlevel); dbms_output.put_line('. GuessQual = ' || n_guessq); dbms_output.put_line('.'); dbms_stats.set_index_stats ( ownname => '${l_tab_owner}', indname => ind_list_rec.index_name, partname => l_target_part, numrows => n_numrows, numlblks => n_numlblks, numdist => n_numdist, avglblk => n_avglblk, avgdblk => n_avgdblk, clstfct => n_clstfct, indlevel => n_indlevel, no_invalidate => false, guessq => n_guessq ); end loop; close ind_list; end; / exit EOF ;; N|n) echo "Exiting Script" ;; *) sleep 1 ;; esac
Hope this helps
Well its a nice post.It helped me a lot while preparing partitions in the data base..the code given is working..thanks for posting this..
Bob,
Thanks for your comments. Do you want to say that dbms_stats.copy_table_stats does not copy the histogram's ? I have not personally checked the same yet.
Regards
Amit
dbms_stats.copy_table_stats copies the histogram values but their range is not adjusted e.g. if we copy the stats from the partition for 12th JULY to 13th JULY then the srec.novals(1) and (2) remain at 12th JULY rather than 13th JULY and it is this that is throwing the optimiser.
Can send you the code we use if you want to take a look.
Hi Amit,
Nice post…
I am having a partition table based on number column….
Can you please share the scripot for number columns also.
-Prasad