This post discusses an issue I faced while working on materialized view (mview) creation at a new site (database). Starting with Oracle 10g, the way atomic_refresh works with a complete refresh has changed.
The atomic_refresh parameter controls atomicity: either all of the work of the refresh is performed, or none of it is. If set to TRUE, all refreshes are done in one transaction. If set to FALSE, the refresh of each specified materialized view is done in a separate transaction. Moreover, when it is set to FALSE, Oracle can optimize the refresh by using parallel DML and TRUNCATE DDL on the materialized views.
In 9i (with atomic_refresh => true), if a single mview was refreshed, it was truncated and then reloaded (which was actually not atomic). If multiple mviews were refreshed together (as part of a refresh group), the rows were deleted and then the inserts were done.
In 10g this changed: if atomic_refresh is set to TRUE (the default), rows are deleted and then inserted, both for a single mview and for multiple mviews in a refresh group. So if you want the mview to be truncated before a complete refresh, you must set atomic_refresh => false explicitly.
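For example (a minimal sketch; the mview name is illustrative), a truncate-based complete refresh of a single mview can be requested like this:

SQL> -- Complete refresh ('C'). With atomic_refresh => FALSE, Oracle can
SQL> -- truncate and reload the mview instead of deleting and re-inserting rows.
SQL> exec dbms_mview.refresh(list => 'SCOTT.MV_SALES', method => 'C', atomic_refresh => false);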
Because of this change you may run into issues such as high redo generation and much longer complete refresh times with dbms_mview.refresh, since 10g now deletes the rows instead of truncating.
In our case a DBA issued a complete refresh, without setting atomic_refresh to false, for an mview of about 195 GB (plus 8 indexes). After 16 hours the refresh job was still deleting rows and had not started inserting them, so we decided to kill the refresh session. As this session had generated a lot of undo, SMON kicked in to recover the transaction.
On checking V$FAST_START_TRANSACTIONS, it reported that the rollback would take around 2 months to complete.
SQL> select usn, state, undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone "ToDo", decode(cputime,0,'unknown',sysdate+(((undoblockstotal-undoblocksdone) / (undoblocksdone / cputime)) / 86400)) "Estimated time to complete"
  2  from v$fast_start_transactions;

       USN STATE                 Total       Done       ToDo Estimated time to co
---------- ---------------- ---------- ---------- ---------- --------------------
        51 RECOVERING          3514498        354    3514144 12-SEP-2009 02:52:36

SQL> select * from V$fast_start_servers;

STATE       UNDOBLOCKSDONE        PID XID
----------- -------------- ---------- ----------------
RECOVERING             354         32 0033000200001C81
RECOVERING               0         33 0033000200001C81
RECOVERING               0         85 0033000200001C81
RECOVERING               0         38 0033000200001C81
RECOVERING               0         39 0033000200001C81
RECOVERING               0         40 0033000200001C81
RECOVERING               0         42 0033000200001C81
RECOVERING               0         43 0033000200001C81
RECOVERING               0         44 0033000200001C81
RECOVERING               0         45 0033000200001C81
RECOVERING               0         46 0033000200001C81
---- some output truncated -------
RECOVERING               0         68 0033000200001C81
RECOVERING               0         69 0033000200001C81
RECOVERING               0         53 0033000200001C81
RECOVERING               0         78 0033000200001C81
RECOVERING               0         79 0033000200001C81
RECOVERING               0         80 0033000200001C81
RECOVERING               0         81 0033000200001C81
RECOVERING               0         82 0033000200001C81
RECOVERING               0         83 0033000200001C81
fast_start_parallel_rollback was set to LOW. We could see that around 50 parallel recovery processes had been spawned, but only one of them was actually doing recovery, indicating that the parallel servers might be interfering with each other. Metalink Note 144332.1 (Parallel Rollback may hang database, Parallel query servers get 100% CPU) discusses this behavior. I decided to switch to serial recovery by setting fast_start_parallel_rollback to FALSE. But to do this, we first had to stop SMON from performing transaction recovery, which is done with event 10513 at level 2.
SQL> oradebug setorapid 22
Unix process pid: 2728024, image: oracle@ods1 (SMON)
SQL> oradebug Event 10513 trace name context forever, level 2
Statement processed.

SQL> select 'kill -9 '||spid||' ' from V$process where pid in (select pid from V$FAST_START_SERVERS);

'KILL-9'||SPID||''
---------------------
kill -9 3014818
kill -9 3010772
kill -9 2916434
kill -9 2887716
kill -9 2678958
kill -9 2511030
kill -9 2224314
kill -9 2142210
kill -9 2822282
kill -9 2625696
kill -9 2506808
kill -9 2486520
kill -9 2314492
kill -9 2310186
kill -9 2752764
kill -9 2445478
kill -9 2326692
kill -9 2457716
kill -9 2654394
kill -9 2621630
kill -9 2580502
kill -9 2633960
kill -9 2412686

SQL> alter system set fast_start_parallel_rollback=false;

SQL> select usn, state, undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone "ToDo", decode(cputime,0,'unknown',sysdate+(((undoblockstotal-undoblocksdone) / (undoblocksdone / cputime)) / 86400)) "Estimated time to complete"
  2  from v$fast_start_transactions;

       USN STATE                 Total       Done       ToDo Estimated time to co
---------- ---------------- ---------- ---------- ---------- --------------------
        51 RECOVERING          3513444       6002    3507442 06-JUL-2009 17:58:03

SQL> select * from V$FAST_START_SERVERS;

no rows selected
Switching to serial recovery brought the estimated completion time down to about 5 hours.
To summarize, if you are still in the process of upgrading from 9i to 10g, revisit the shell scripts and dbms_jobs you use for complete refreshes of mviews, and set atomic_refresh => false explicitly so that the mview is truncated before the refresh.
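As an illustration only (the mview name and the daily schedule are made up), a dbms_job submitted with the parameter set explicitly could look roughly like this:

variable jobno number
begin
  -- Schedule a daily complete refresh with atomic_refresh => FALSE,
  -- so the mview is truncated and reloaded rather than deleted and re-inserted.
  dbms_job.submit(
    job       => :jobno,
    what      => 'dbms_mview.refresh(list => ''SCOTT.MV_SALES'', method => ''C'', atomic_refresh => false);',
    next_date => sysdate,
    interval  => 'trunc(sysdate) + 1');
end;
/
commit;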
Hi
The problem related to atomic_refresh is a bug. On page 474 of TOP (http://antognini.ch/top) I wrote the following about this topic:
“In Oracle9i, because of bug 3168840, even if the parameter atomic_refresh is set to TRUE (the default in all versions), during a complete refresh of a single materialized view, a TRUNCATE statement is executed. If several materialized views are refreshed at the same time however, the refresh works atomically as expected. Therefore, to work around this bug, it is possible to create a refresh group containing the materialized view you have to refresh and a “dummy” materialized view that is created only to have a second one.”
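For illustration only (the object names are made up), such a refresh group could be created and refreshed roughly like this:

begin
  -- Group the real mview with a small "dummy" mview so that the complete
  -- refresh of the group is done atomically (DELETE + INSERT) despite the bug.
  dbms_refresh.make(
    name      => 'SCOTT.RG_SALES',
    list      => 'SCOTT.MV_SALES, SCOTT.MV_DUMMY',
    next_date => sysdate,
    interval  => 'sysdate + 1');
end;
/

-- Refresh both members of the group in one transaction.
exec dbms_refresh.refresh('SCOTT.RG_SALES');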
Cheers,
Chris
Chris,
Thanks for your comments. Yes, this issue is because of bug 3168840. But I faced the issue as a result of the bug fix :), where an atomic refresh of a large mview (~195 GB) generated a lot of undo and had not finished deleting the data after more than 12 hours. I believe atomic_refresh => true should be used judiciously and avoided for large mviews.
Regards
Amit