使用trace跟踪truncate的性能问题（思路）

oraunix 发表于 2010-11-23 13:57:46

本帖最后由 oraunix 于 2010-11-24 15:51 编辑

在ITPUB上看到有人提问，为什么TRUNCATE语句要比DELETE慢，由于默认TRUNCATE要回收空间，且会将所有的脏块写回到数据文件，因此其实对于小数据量而言，TRUNCATE比DELETE慢也是正常的。不过这个例子中，TRUNCATE慢的有点离谱了。

这是根据问题模拟的例子，表中只有5条记录，可以看到，TRUNCATE操作要比DELETE慢得多：
SQL> SET TIMING ON
SQL> DELETE T_BIG;

4 rows deleted.

Elapsed: 00:00:00.01
SQL> TRUNCATE TABLE T_BIG;

Table truncated.

Elapsed: 00:00:03.25

对于表中数据量不大的情况，TRUNCATE比DELETE慢一点是正常的，但是二者一般是同一个数量级的：
SQL> CREATE TABLE T_TRUNCATE (ID NUMBER);

Table created.

Elapsed: 00:00:00.06
SQL> INSERT INTO T_TRUNCATE VALUES (1);

1 row created.

Elapsed: 00:00:00.00
SQL> DELETE T_TRUNCATE;

1 row deleted.

Elapsed: 00:00:00.03
SQL> TRUNCATE TABLE T_TRUNCATE;

Table truncated.

Elapsed: 00:00:00.10

现在这个差别有点过大了，肯定不是简单的数据量较小就可以解释得通了。
这个问题的描述者除了给出TRUNCATE语句和DELETE语句的时间对比外，还对TRUNCATE语句设置了10046事件进行TRACE：
SQL> ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

Session altered.

Elapsed: 00:00:00.04
SQL> TRUNCATE TABLE T_BIG;

Table truncated.

Elapsed: 00:00:01.52
SQL> ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

Session altered.

Elapsed: 00:00:00.00

第二次运行TRUNCATE，虽然时间比第一次运行要短1/3，但是仍然要花费将近2秒的时间。
问题描述者通过TKPROF对得到的TRACE文件进行了格式化，并将格式化后的文件发到了论坛上，其中格式化后的结果类似于：
$ more trace_result.txt

TKPROF: Release10.2.0.3.0 - Production on星期四7月1 15:07:05 2010

Copyright (c) 1982, 2005, Oracle.All rights reserved.

Trace file: test08_ora_20627.trc
Sort options: default

********************************************************************************
count = number of times OCI procedure was executed
cpu    = cpu time in seconds executing
elapsed= elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current= number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************

TRUNCATE TABLE T_BIG

call count    cpu elapsed    disk    query current    rows
------- -------------- ---------- ---------- ---------- --------------------
Parse    1    0.00    0.00       0       0       0       0
Execute    1    0.36    1.47    4453    365    9211       0
Fetch    0    0.00    0.00       0       0       0       0
------- -------------- ---------- ---------- ---------- --------------------
total    2    0.37    1.48    4453    365    9211       0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 74

Elapsed times include waiting on following events:
Event waited on                         Times Max. WaitTotal Waited
---------------------------------------- Waited----------------------
db file sequential read                   4452    0.00       0.06
reliable message                            5    0.00       0.00
enq: RO - fast object reuse                11    0.03       0.22
local write wait                         2229    0.00       0.67
log file sync                               1    0.00       0.00
SQL*Net message to client                   1    0.00       0.00
SQL*Net message from client                   1    10.76       10.76
********************************************************************************

.
.
.
********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count    cpu elapsed    disk    query current    rows
------- -------------- ---------- ---------- ---------- --------------------
Parse    2    0.00    0.00       0       0       0       0
Execute    2    0.36    1.47    4453    365    9211       0
Fetch    0    0.00    0.00       0       0       0       0
------- -------------- ---------- ---------- ---------- --------------------
total    4    0.37    1.48    4453    365    9211       0

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
Event waited on                         Times Max. WaitTotal Waited
---------------------------------------- Waited----------------------
SQL*Net message to client                   2    0.00       0.00
SQL*Net message from client                   2    10.76       16.52
db file sequential read                   4452    0.00       0.06
reliable message                            5    0.00       0.00
enq: RO - fast object reuse                11    0.03       0.22
local write wait                         2229    0.00       0.67
log file sync                               1    0.00       0.00

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count    cpu elapsed    disk    query current    rows
------- -------------- ---------- ---------- ---------- --------------------
Parse    63    0.01    0.10       0       0       0       0
Execute 80    0.04    0.14       0    110       39       28
Fetch    36    0.00    0.02       1    246       0       29
------- -------------- ---------- ---------- ---------- --------------------
total    179    0.06    0.27       1    356       39       57

Misses in library cache during parse: 20
Misses in library cache during execute: 20

Elapsed times include waiting on following events:
Event waited on                         Times Max. WaitTotal Waited
---------------------------------------- Waited----------------------
db file sequential read                      1    0.01       0.01

10userSQL statements in session.
70internal SQL statements in session.
80SQL statements in session.
********************************************************************************
Trace file: test08_ora_20627.trc
Trace file compatibility:10.01.00
Sort options: default

   1session in tracefile.
   10userSQL statements in trace file.
   70internal SQL statements in trace file.
   80SQL statements in trace file.
   28unique SQL statements in trace file.
10848lines in trace file.
   12elapsed seconds in trace file.

根据格式化后的TRACE信息可以判断：
Oracle的递归SQL的运行时间只占了很小的比重，绝大部分时间都是消耗在TRUNCATE语句上。
TRUNCATE语句居然产生了4000多的物理读和9000多的逻辑读，而表本身一共也没有那么多的BLOCK存在，即使表本身比较大，在第一次TRUNCATE之后，高水位线已经被收缩，现在表本身应该已经很小才对。
而TRUNCATE语句的主要等待事件是db file sequential read和local write wait，这两个等待时间一般和索引有关，而和表似乎关系不大。
仅根据现有的信息无法判断具体的原因，因此要求问题的描述者提供进一步的信息，包含表结构的DBMS_METADATA输出结果以及原始的TRACE文件。
其中DBMS_METADATA.GET_DDL的输出类似如下：
SQL> SET LONG 10000
SQL> SELECT DBMS_METADATA.GET_DDL('TABLE', 'T_BIG') FROM DUAL;

DBMS_METADATA.GET_DDL('TABLE','T_BIG')
--------------------------------------------------------------------------------

CREATE TABLE "TEST"."T_BIG"
( "OWNER" VARCHAR2(30),
   "OBJECT_NAME" VARCHAR2(128),
   "SUBOBJECT_NAME" VARCHAR2(30),
   "OBJECT_ID" NUMBER,
   "DATA_OBJECT_ID" NUMBER,
   "OBJECT_TYPE" VARCHAR2(19),
   "CREATED" DATE,
   "LAST_DDL_TIME" DATE,
   "TIMESTAMP" VARCHAR2(19),
   "STATUS" VARCHAR2(7),
   "TEMPORARY" VARCHAR2(1),
   "GENERATED" VARCHAR2(1),
   "SECONDARY" VARCHAR2(1),
      PRIMARY KEY ("OBJECT_ID")
USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
STORAGE(INITIAL 1073741824 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "GPO"ENABLE
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 973078528 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "GPO"

其中表和主键索引的初始化EXTENT的大小比较奇怪，既然表中记录很少，为什么表和主键的初始化INITIAL EXTENT会那么大。
进一步分析详细TRACE信息：
$ more test08_ora_20627.trc
/opt/ora10g/admin/test08/udump/test08_ora_20627.trc
Oracle Database10gEnterprise Edition Release10.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /opt/ora10g/product/10.2.0/db_1
System name: Linux
Node name:    yans1
Release:    2.6.9-42.0.0.0.1.ELsmp
Version:    #1 SMP Sun Oct 15 15:13:57 PDT 2006
Machine:    x86_64
Instance name: test08
Redo thread mounted by this instance: 1
Oracle process number: 16
Unix process pid: 20627, image: oracle@yans1 (TNS V1-V3)

*** 2010-07-01 15:03:54.687
*** ACTION NAME:() 2010-07-01 15:03:54.673
*** MODULE NAME:(SQL*Plus) 2010-07-01 15:03:54.673
*** SERVICE NAME:(SYS$USERS) 2010-07-01 15:03:54.673
*** SESSION ID:(317.11354) 2010-07-01 15:03:54.673
WAIT #3: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=147960 tim=1248015463547954
WAIT #3: nam='SQL*Net message from client' ela= 5759303 driver id=1650815232 #bytes=1 p3=0 obj#=147960 tim=1248015469321192
XCTEND rlbk=0, rd_only=1
=====================
PARSING IN CURSOR #10 len=20 dep=0 uid=74 ct=85 lid=74 tim=1248015469332441 hv=3136993528 ad='ff547c80'
TRUNCATE TABLE T_BIG
END OF STMT
PARSE #10:c=1000,e=9511,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1248015469330783
BINDS #10:
=====================
.
.
.
EXEC #5:c=2000,e=2298,p=0,cr=2,cu=2,mis=1,r=1,dep=1,og=4,tim=1248015469436750
STAT #5 id=1 cnt=0 pid=0 pos=1 bj=0 p='UPDATEOBJ$ (cr=2 pr=0 pw=0 time=264 us)'
STAT #5 id=2 cnt=1 pid=1 pos=1 bj=37 p='INDEX RANGE SCAN I_OBJ2 (cr=2 pr=0 pw=0 time=21 us)'
WAIT #10: nam='db file sequential read' ela= 31 file#=51 block#=57365 blocks=1 obj#=147967 tim=1248015469436975
WAIT #10: nam='db file sequential read' ela= 14 file#=51 block#=57349 blocks=1 obj#=147967 tim=1248015469437058
WAIT #10: nam='db file sequential read' ela= 13 file#=51 block#=57350 blocks=1 obj#=147967 tim=1248015469437130
WAIT #10: nam='db file sequential read' ela= 13 file#=51 block#=57351 blocks=1 obj#=147967 tim=1248015469437182
WAIT #10: nam='db file sequential read' ela= 13 file#=51 block#=57352 blocks=1 obj#=147967 tim=1248015469437235
.
.
.
WAIT #10: nam='local write wait' ela= 466 file#=51 block#=57366 p3=0 obj#=147967 tim=1248015469563931
WAIT #10: nam='db file sequential read' ela= 21 file#=51 block#=57367 blocks=1 obj#=147967 tim=1248015469563995
WAIT #10: nam='local write wait' ela= 528 file#=51 block#=57367 p3=0 obj#=147967 tim=1248015469564616
WAIT #10: nam='db file sequential read' ela= 21 file#=51 block#=36885 blocks=1 obj#=147966 tim=1248015469564736
WAIT #10: nam='db file sequential read' ela= 18 file#=51 block#=36869 blocks=1 obj#=147966 tim=1248015469564800
WAIT #10: nam='db file sequential read' ela= 18 file#=51 block#=36870 blocks=1 obj#=147966 tim=1248015469564879
WAIT #10: nam='db file sequential read' ela= 20 file#=51 block#=36871 blocks=1 obj#=147966 tim=1248015469564939
.
.
.
WAIT #10: nam='db file sequential read' ela= 9 file#=51 block#=53256 blocks=1 obj#=147966 tim=1248015469688631
WAIT #10: nam='local write wait' ela= 249 file#=51 block#=53256 p3=0 obj#=147966 tim=1248015469688916
WAIT #10: nam='local write wait' ela= 263 file#=51 block#=36885 p3=0 obj#=147966 tim=1248015469689220
WAIT #10: nam='local write wait' ela= 259 file#=51 block#=36886 p3=0 obj#=147966 tim=1248015469689513
WAIT #10: nam='db file sequential read' ela= 11 file#=51 block#=36887 blocks=1 obj#=147966 tim=1248015469689562
WAIT #10: nam='local write wait' ela= 499 file#=51 block#=36887 p3=0 obj#=147966 tim=1248015469690139
WAIT #10: nam='db file sequential read' ela= 11 file#=51 block#=20493 blocks=1 obj#=147965 tim=1248015469690246
WAIT #10: nam='db file sequential read' ela= 12 file#=51 block#=20485 blocks=1 obj#=147965 tim=1248015469690301
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=20486 blocks=1 obj#=147965 tim=1248015469690362
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=20487 blocks=1 obj#=147965 tim=1248015469690415
.
.
.
WAIT #10: nam='local write wait' ela= 265 file#=53 block#=32262 p3=0 obj#=147965 tim=1248015469849141
WAIT #10: nam='db file sequential read' ela= 9 file#=54 block#=32261 blocks=1 obj#=147965 tim=1248015469849171
WAIT #10: nam='local write wait' ela= 257 file#=54 block#=32261 p3=0 obj#=147965 tim=1248015469849462
WAIT #10: nam='local write wait' ela= 279 file#=51 block#=20493 p3=0 obj#=147965 tim=1248015469849796
WAIT #10: nam='local write wait' ela= 471 file#=51 block#=20494 p3=0 obj#=147965 tim=1248015469850308
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=20495 blocks=1 obj#=147965 tim=1248015469850356
WAIT #10: nam='local write wait' ela= 516 file#=51 block#=20495 p3=0 obj#=147965 tim=1248015469850948
WAIT #10: nam='db file sequential read' ela= 11 file#=51 block#=21 blocks=1 obj#=147964 tim=1248015469851055
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=5 blocks=1 obj#=147964 tim=1248015469851119
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=6 blocks=1 obj#=147964 tim=1248015469851183
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=7 blocks=1 obj#=147964 tim=1248015469851232
WAIT #10: nam='db file sequential read' ela= 10 file#=51 block#=8 blocks=1 obj#=147964 tim=1248015469851277
.
.
.

可以看到，TRUNCATE语句也就是CURSOR 10在运行过程中，出现了大量的等待，等待事件主要就是前面提到的db file sequential read和local write wait，而等待的对象也在变化，OBJECT_ID从147967一直到147960，检查这些对象：
SQL> SELECT OBJECT_NAME, OBJECT_TYPE
2FROM DBA_OBJECTS
3WHERE OBJECT_ID IN (147967, 147966, 147965, 147964, 147963, 147962, 147961, 147960);

OBJECT_NAME                OBJECT_TYPE
------------------------------ -------------------
IND_BIG_TEMP                INDEX
IND_BIG_TYPE                INDEX
IND_BIG_CREATED             INDEX
IND_BIG_STATUS             INDEX
IND_BIG_ID                   INDEX
IND_BIG_NAME                INDEX
IND_BIG_OWNER                INDEX
T_BIG                      TABLE

8 rows selected.

显然这些索引就是T_BIG表上的索引，检查这些索引的大小：
SQL> SELECT SEGMENT_NAME, SUM(BYTES)/1024/1024/1024
2FROM USER_SEGMENTS
3WHERE SEGMENT_NAME IN
4(SELECT INDEX_NAME
5FROM USER_INDEXES
6WHERE TABLE_NAME = 'T_BIG')
7GROUP BY SEGMENT_NAME;

SEGMENT_NAME                SUM(BYTES)/1024/1024/1024
------------------------------ -------------------------
IND_BIG_ID                                           1
IND_BIG_OWNER                                        1
IND_BIG_CREATED                                     1
IND_BIG_NAME                                        1
IND_BIG_STATUS                                  1.0625
IND_BIG_TEMP                                  1.0625
IND_BIG_TYPE                                  1.0625

7 rows selected.

SQL> SELECT INDEX_NAME, INITIAL_EXTENT
2FROM USER_INDEXES
3WHERE TABLE_NAME = 'T_BIG';

INDEX_NAME                   INITIAL_EXTENT
------------------------------ --------------
IND_BIG_OWNER                   1073741824
IND_BIG_NAME                   1073741824
IND_BIG_ID                      1073741824
IND_BIG_STATUS                   1098907648
IND_BIG_CREATED                1073741824
IND_BIG_TYPE                   1098907648
IND_BIG_TEMP                   1098907648

7 rows selected.

这些索引基本上每个都是1G左右，且都是初始EXTENT的大小。显然导致问题的原因已经明确了，表包含了多个索引，且每个索引的初始段太大，因此TRUNCATE执行的时候对索引执行大量的db file sequence wait的操作，从而导致了TRUNCATE语句性能问题。
解决问题其实很简单，降低索引的初始EXTENT，就可以避免这个问题：
SQL> ALTER INDEX IND_BIG_OWNER REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> ALTER INDEX IND_BIG_NAME REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> ALTER INDEX IND_BIG_ID REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> ALTER INDEX IND_BIG_STATUS REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> ALTER INDEX IND_BIG_CREATED REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> ALTER INDEX IND_BIG_TYPE REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> ALTER INDEX IND_BIG_TEMP REBUILD STORAGE (INITIAL1M);

Index altered.

SQL> SET TIMING ON
SQL> TRUNCATE TABLE T_BIG;

Table truncated.

Elapsed: 00:00:00.27

重建索引后，TRUNCATE语句执行时间已经基本恢复正常。注意到表的初始化EXTENT也将近1G，对表进行MOVE缩小初始化EXTENT，还可以进一步提高TRUNCATE的效率：
SQL> ALTER TABLE T_BIG MOVE STORAGE (INITIAL1M);

Table altered.

Elapsed: 00:00:00.09
SQL> TRUNCATE TABLE T_BIG;

Table truncated.

Elapsed: 00:00:00.07

问题解决了，还是要提一句，真正导致问题的罪魁祸首是EXP的COMPRESS=Y，正是由于问题描述者在执行导出的时候没有设置COMPRESS=N，因此EXP尝试压缩表的EXTENT，即将表的全部EXTENT压缩到一个EXTENT中，因此在IMP导入到目标库的时候，就会像上面那样创建出初始EXTENT特别大的表和索引。

chenyu 发表于 2010-11-23 14:45:55

“真正导致问题的罪魁祸首是EXP的COMPRESS=Y，正是由于问题描述者在执行导出的时候没有设置COMPRESS=N，因此EXP尝试压缩表的EXTENT，即将表的全部EXTENT压缩到一个EXTENT中，因此在IMP导入到目标库的时候，就会像上面那样创建出初始EXTENT特别大的表和索引”
我的理解：这应该是在EXP之前create table的时侯指定的inititl太大导致的吧，即使不执行EXP在原库中truncate 那张表也应该很慢才对，所以真正的原因应该是创建表和索引时指定的初始区太大，和EXP应该没有关系，如果理解的不对请老师指正。

oraunix 发表于 2010-11-23 18:30:56

exp的时候选择了COMPRESS=Y的方式，这样在imp的时候，imp会将所有的内容imp到一个分区中，也就是第一个分区。
原表的第一个分区不一定很大。

oraunix 发表于 2010-11-23 18:32:32

exp中的COMPRESS=Y，这个特性的作用是，我们在导入的时候，可以将所有的数据导入到一个extent中，这样所有的数据块物理上是连续的，有利于表的全表扫描的性能。
因此有一些人会使用这个特性。

chenyu 发表于 2010-11-23 21:47:30

按照老师的描述我觉得加上COMPRESS=Y也不是坏事，既然带上COMPRESS=Y时imp可以将所有数据连续的存放到extent中，那为什么还会出现开始时描述的初始区很大的情况呢？这里面的原因我还是没想明白。

oraunix 发表于 2010-11-24 15:50:46

imp后，所有的数据都放在了一个extent里面，当然这个表的第一个extent会很大。
上面的实验中用的表是才用imp导入生成的。

页: [1]

iiDBA's Archiver

使用trace跟踪truncate的性能问题（思路）