kevin.zhang 发表于 2012-4-5 11:22:37

【案例分享1】ORA-1628: max # extents 32765 reached for rollback segmen

本帖最后由 kevin.zhang 于 2012-4-6 20:25 编辑

从即日起,会常发一些个人工作中遇到的案例和大家分享,欢迎讨论。
http://dbakevin.blogspot.mx/2012/03/ora-1628-max-extents-32765-reached-for.html

ORA-1628: max # extents 32765 reached for rollback segment _SYSSMU6_621766836$
One DB suddenly got many blow errors in a period of time:
Wed Mar 28 23:53:58 2012
ORA-1628: max # extents32765 reached for rollback segment _SYSSMU6_621766836$
Wed Mar 28 23:53:58 2012
ORA-1628: max # extents32765 reached for rollback segment _SYSSMU7_1070497209$

This error means an undo segment reached its max extents so that it couldn't extend any more:
SQL> select SEGMENT_NAME,MAX_EXTENTS from DBA_ROLLBACK_SEGS;
SEGMENT_NAME                   MAX_EXTENTS
------------------------------ -----------
_SYSSMU59_1810053014$                32765
_SYSSMU58_3572456133$                32765
_SYSSMU57_2783840325$                32765
_SYSSMU56_2339751936$                32765
_SYSSMU55_3550317550$                32765
..........

It is none business with the size of undo tablespace, infact there are plenty of free space:
Tablespace                  Free MB   Total MB      PCT
------------------------- ----------- ---------- ----------
UNDORBS                        21,483      24000    89.5125

The DB use Automatic Undo Management (AUM) :
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------
undo_management string AUTO
undo_retention integer 10800
undo_tablespace string UNDORBS

Since it uses AUM, there is no way for us to turn the max_extents attribute:
SQL> select MAX_EXTENTS from DBA_ROLLBACK_SEGS where SEGMENT_NAME='_SYSSMU10$';
MAX_EXTENTS
-----------
32765

SQL> alter rollback segment "_SYSSMU10$" storage (MAXEXTENTs unlimited);
Rollback segment altered.

Above command was executed successful, but if we checked again, we can find the attribute still not changed:

SQL> select MAX_EXTENTS from DBA_ROLLBACK_SEGS where SEGMENT_NAME='_SYSSMU10$';
MAX_EXTENTS
-----------
32765

Even setting "_smu_debug_mode" is useless:
SQL> alter system set "_smu_debug_mode" = 4;
System altered.

SQL> alter rollback segment "_SYSSMU10$" storage (MAXEXTENTs unlimited);
ERROR at line 1:
ORA-25150: ALTERING of extent parameters not permitted

Pay attention to this, when get ORA-1628 with AUM, in most cases, the root cause is not simply the max extents attribute too small. In most cases it is related to some bug.

Let's check our case:
SQL> select to_char(begin_time,'MM/DD/YYYY HH24:MI') begin_time, UNXPSTEALCNT, EXPSTEALCNT , NOSPACEERRCNT, TUNED_UNDORETENTION from gv$undostat;
BEGIN_TIME       UNXPSTEALCNT EXPSTEALCNT NOSPACEERRCNT TUNED_UNDORETENTION
---------------- ------------ ----------- ------------- -------------------
03/30/2012 03:24            0         0             0            23941735
03/30/2012 03:14            0         0             0            23806293
03/30/2012 03:04            0         0             0            23670851
...........
...........
03/29/2012 00:14            0         0             0            42082827
03/29/2012 00:04            0         0             0            39077268
03/28/2012 23:54         1081      1081             5             9930112
03/28/2012 23:44            0         0             9          2367948800
03/28/2012 22:14            0         0             0                   0
03/28/2012 22:04            0         0             0          2346086400

From red part, we can obviously see the auto turned undo retention is not a normal value.

And pay attention the time point with those unusual retention time, the same time the DB got the ORA-1628 error.

And it make sense now. At that very period time, oracle miss judged a very big value for auto tuned undo retention time.
Hence make many commited undo blocks keeping un-expired, and force the rollback segments growing bigger and bigger, and at last leading to:
ORA-1628: max # extents 32765 reached for rollback segment _SYSSMU6_621766836$

OK, since we find out the root cause. The solution is simple, we can set below parameter to control the undo retention time:
SQL> alter system set "_smu_debug_mode" = 33554432;
System altered.

Let's check the undo retention time again:
SQL> select to_char(begin_time,'MM/DD/YYYY HH24:MI') begin_time, UNXPSTEALCNT, EXPSTEALCNT , NOSPACEERRCNT, TUNED_UNDORETENTION from gv$undostat;
BEGIN_TIME       UNXPSTEALCNT EXPSTEALCNT NOSPACEERRCNT TUNED_UNDORETENTION
---------------- ------------ ----------- ------------- -------------------
03/30/2012 03:44            0         0             0               10800
03/30/2012 03:34            0         0             0               10800
03/30/2012 03:24            0         0             0            23941735
03/30/2012 03:14            0         0             0            23806293
03/30/2012 03:04            0         0             0            23670851
...........
...........

PS:
更多案例欢迎访问个人blog:
http://dbakevin.blogspot.com

oraunix 发表于 2012-4-6 18:15:29

V$UNDOSTAT

V$UNDOSTAT displays a histogram of statistical data to show how well the system is working. The available statistics include undo space consumption, transaction concurrency, and length of queries executed in the instance. You can use this view to estimate the amount of undo space required for the current workload. Oracle uses this view to tune undo usage in the system. The view returns null values if the system is in manual undo management mode.

Each row in the view keeps statistics collected in the instance for a 10-minute interval. The rows are in descending order by the BEGIN_TIME column value. Each row belongs to the time interval marked by (BEGIN_TIME, END_TIME). Each column represents the data collected for the particular statistic in that time interval. The first row of the view contains statistics for the (partial) current time period. The view contains a total of 1008 rows, spanning a 7 day cycle.

Column                        Datatype                Description

BEGIN_TIME                DATE                        Identifies the beginning of the time interval

END_TIME                DATE                        Identifies the end of the time interval

UNDOTSN                        NUMBER                        Represents the last active undo tablespace in the duration of time. The tablespace ID of the active undo tablespace is returned in this column. If more than one undo tablespace was active in that period, the active undo tablespace that was active at the end of the period is reported.

UNDOBLKS                NUMBER                        Represents the total number of undo blocks consumed. You can use this column to obtain the consumption rate of undo blocks, and thereby estimate the size of the undo tablespace needed to handle the workload on your system.


TXNCOUNT                NUMBER                        Identifies the total number of transactions executed within the period



MAXQUERYLEN                NUMBER                        Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.



MAXQUERYID                VARCHAR2(13)                SQL identifier of the longest running SQL statement in the period



MAXCONCURRENCY                NUMBER                        Identifies the highest number of transactions executed concurrently within the period

UNXPSTEALCNT                NUMBER                        Number of attempts to obtain undo space by stealing unexpired extents from other transactions

UNXPBLKRELCNT                NUMBER                        Number of unexpired blocks removed from certain undo segments so they can be used by other transactions

UNXPBLKREUCNT

NUMBER

Number of unexpired undo blocks reused by transactions



EXPSTEALCNT

NUMBER

Number of attempts to steal expired undo blocks from other undo segments



EXPBLKRELCNT

NUMBER

Number of expired undo blocks stolen from other undo segments



EXPBLKREUCNT

NUMBER

Number of expired undo blocks reused within the same undo segments



SSOLDERRCNT

NUMBER

Identifies the number of times the error ORA-01555 occurred. You can use this statistic to decide whether or not the UNDO_RETENTION initialization parameter is set properly given the size of the undo tablespace. Increasing the value of UNDO_RETENTION can reduce the occurrence of this error.



NOSPACEERRCNT

NUMBER

Identifies the number of times space was requested in the undo tablespace and there was no free space available. That is, all of the space in the undo tablespace was in use by active transactions. The corrective action is to add more space to the undo tablespace.



ACTIVEBLKS

NUMBER

Total number of blocks in the active extents of the undo tablespace for the instance at the sampled time in the period



UNEXPIREDBLKS

NUMBER

Total number of blocks in the unexpired extents of the undo tablespace for the instance at the sampled time in the period



EXPIREDBLKS

NUMBER

Total number of blocks in the expired extents of the undo tablespace for the instance at the sampled time in the period



TUNED_UNDORETENTION                NUMBER                System tuned value indicating the period for which undo is being retained

oraunix 发表于 2012-4-6 18:16:28

看一下这个还是对上面的而理解很有帮助的。
多看一下官方文档里面的对一些经典视图的解释。

oraunix 发表于 2012-4-6 18:17:14

再次支持一把。
页: [1]
查看完整版本: 【案例分享1】ORA-1628: max # extents 32765 reached for rollback segmen