经常看到有人提出这样的疑问,我在同一张表上建立了多个索引,为什么Oracle每次都选择一个,而不能同时利用多个索引呢。一般来说,常见的访问同一张表的两个以上索引,存在三种情况,AND-EQUAL、INDEX HASH JOIN和BITMAP INDEX AND/OR。
此外,还有一个设计上的疑问,如果有A、B、C三个字段,都可能作为查询条件,是建立多个复合索引好,还是建立三个单列的索引。这个问题之所以不好回答是因为和业务或者说和查询的模式有很大的关系,不过如果理解了Oracle什么时候会选择一个以上的索引来访问表,就会对于理解如何设计合理的索引有很大的帮助。
简单介绍一下AND-EQUAL执行计划。
首先建立一个测试表:
SQL> CREATE TABLE T_DOUBLE_IND
2 (ID NUMBER,
3 NAME VARCHAR2(30),
4 TYPE VARCHAR2(30),
5 CONTENTS VARCHAR2(4000));
Table created.
SQL> INSERT INTO T_DOUBLE_IND
2 SELECT ROWNUM,
3 OBJECT_NAME,
4 OBJECT_TYPE,
5 LPAD('A', 1000, 'A')
6 FROM DBA_OBJECTS;
75856 rows created.
SQL> COMMIT;
Commit complete.
SQL> CREATE INDEX IND_DOUBLE_NAME
2 ON T_DOUBLE_IND (NAME);
Index created.
SQL> CREATE INDEX IND_DOUBLE_TYPE
2 ON T_DOUBLE_IND (TYPE);
Index created.
现在建立了一个测试表和两个索引,下面首先来看看AND-EQUAL执行方式:
SQL> SELECT ID, NAME, TYPE
2 FROM T_DOUBLE_IND
3 WHERE NAME = 'T_DOUBLE_IND'
4 AND TYPE = 'TABLE';
ID NAME TYPE
---------- ------------------------------ ------------------------------
75688 T_DOUBLE_IND TABLE
Execution Plan
----------------------------------------------------------
Plan hash value: 474554719
-------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID| T_DOUBLE_IND |
| 2 | AND-EQUAL | |
|* 3 | INDEX RANGE SCAN | IND_DOUBLE_NAME |
|* 4 | INDEX RANGE SCAN | IND_DOUBLE_TYPE |
-------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("NAME"='T_DOUBLE_IND')
4 - access("TYPE"='TABLE')
Note
-----
- rule based optimizer used (consider using cbo)
由于指定了两个列,且两个列上都包含索引,Oracle选择了扫描两个索引,并使用了AND-EQUAL执行计划。这种扫描方式是分别通过两个索引获取索引键值对应的ROWID,然后合并两个扫描中相等的ROWID,并通过这个ROWID来扫描表。
不过观察执行计划可以发现,当前的优化模式是RBO,如果收集一下索引,再次运行同样的查询:
SQL> SHOW PARAMETER OPTIMIZER_MODE
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_mode string CHOOSE
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T_DOUBLE_IND')
PL/SQL procedure successfully completed.
SQL> SELECT ID, NAME, TYPE
2 FROM T_DOUBLE_IND
3 WHERE NAME = 'T_DOUBLE_IND'
4 AND TYPE = 'TABLE';
ID NAME TYPE
---------- ------------------------------ ------------------------------
75688 T_DOUBLE_IND TABLE
Execution Plan
----------------------------------------------------------
Plan hash value: 3887138334
--------------------------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 37 | 3 (0)| 00:00:01 |
|* 1| TABLE ACCESS BY INDEX ROWID| T_DOUBLE_IND | 1 | 37 | 3 (0)| 00:00:01 |
|* 2| INDEX RANGE SCAN | IND_DOUBLE_NAME | 2 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TYPE"='TABLE')
2 - access("NAME"='T_DOUBLE_IND')
可以看到,收集统计信息后,优化模式变为CBO,但是Oracle只选择了一个索引进行扫描。对于当前的情况,NAME列的选择度非常高,因此这种方式的代价最低。
SQL> SELECT /*+ INDEX(A IND_DOUBLE_NAME) INDEX(A IND_DOUBLE_TYPE) */
2 ID, NAME, TYPE
3 FROM T_DOUBLE_IND
4 WHERE NAME = 'T_DOUBLE_IND'
5 AND TYPE = 'TABLE';
ID NAME TYPE
---------- ------------------------------ ------------------------------
75689 T_DOUBLE_IND TABLE
Execution Plan
----------------------------------------------------------
Plan hash value: 3887138334
--------------------------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 37 | 3 (0)| 00:00:01 |
|*1 | TABLE ACCESS BY INDEX ROWID| T_DOUBLE_IND | 1 | 37 | 3 (0)| 00:00:01 |
|*2 | INDEX RANGE SCAN | IND_DOUBLE_NAME | 2 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TYPE"='TABLE')
2 - access("NAME"='T_DOUBLE_IND')
即使通过HINT指定两个索引,CBO优化器也只是会选择其中一个索引来进行扫描,而自动忽略另一个选择度低的索引。
只有使用AND_EQUAL提示,才能在CBO的情况下使用AND-EQUAL执行计划:
SQL> SELECT /*+ AND_EQUAL(A IND_DOUBLE_NAME IND_DOUBLE_TYPE) */ ID, NAME, TYPE
2 FROM T_DOUBLE_IND A
3 WHERE NAME = 'T_DOUBLE_IND'
4 AND TYPE = 'TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 474554719
--------------------------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 37 | 21 (0)| 00:00:01 |
|*1 | TABLE ACCESS BY INDEX ROWID| T_DOUBLE_IND | 1 | 37 | 21 (0)| 00:00:01 |
| 2 | AND-EQUAL | | | | | |
|*3 | INDEX RANGE SCAN | IND_DOUBLE_NAME | 2 | | 1 (0)| 00:00:01 |
|*4 | INDEX RANGE SCAN | IND_DOUBLE_TYPE | 9286 | | 20 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NAME"='T_DOUBLE_IND' AND "TYPE"='TABLE')
3 - access("NAME"='T_DOUBLE_IND')
4 - access("TYPE"='TABLE')
在CBO下,Oracle不会自动选择AND-EQUAL执行计划,这是因为Oracle可以根据直方图来判断一个列上各个值的选择度,但是从统计信息无法获得两个不同的列合AND-EQUAL之后的选择度,这个操作后,如果得到的ROWID记录很少,那么这个查询的效率就会很高,如果AND-EQUAL后得到大量的ROWID,那么查询的效率就会很差,这时应该选择全表扫描或其他的执行计划。由于CBO都是根据统计信息分析得到的结果,而AND-EQUAL的结果对于CBO是未知的,因此CBO不会选择AND-EQAUL这种扫描方式。
如果要Oracle获得这种统计信息,最简单的方法就是建立一个复合索引,Oracle在分析索引列的时候自动会分析两个列的组合情况,从而在查询的时候可以准确的判断是否应该使用这个复合索引。
SQL> SET AUTOT OFF
SQL> CREATE INDEX IND_DOU_NAM_TYP
2 ON T_DOUBLE_IND
3 (NAME, TYPE);
Index created.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T_DOUBLE_IND')
PL/SQL procedure successfully completed.
SQL> SET AUTOT TRACE EXP
SQL> SELECT ID, NAME, TYPE
2 FROM T_DOUBLE_IND
3 WHERE NAME = 'T_DOUBLE_IND'
4 AND TYPE = 'TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 1808990274
--------------------------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T_DOUBLE_IND | 2 | 74 | 3 (0)| 00:00:01 |
|*2 | INDEX RANGE SCAN | IND_DOU_NAM_TYP | 2 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("NAME"='T_DOUBLE_IND' AND "TYPE"='TABLE')
SQL> SET AUTOT OFF
SQL> UPDATE T_DOUBLE_IND
2 SET NAME = 'T_DOUBLE_IND'
3 WHERE ID <= 20000;
20000 rows updated.
SQL> COMMIT;
Commit complete.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T_DOUBLE_IND')
PL/SQL procedure successfully completed.
SQL> SET AUTOT TRACE EXP
SQL> SELECT ID, NAME, TYPE
2 FROM T_DOUBLE_IND
3 WHERE NAME = 'T_DOUBLE_IND'
4 AND TYPE = 'TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 1808990274
--------------------------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2277 | 77418 | 1225 (1)| 00:00:18 |
| 1 | TABLE ACCESS BY INDEX ROWID| T_DOUBLE_IND | 2277 | 77418 | 1225 (1)| 00:00:18 |
|*2 | INDEX RANGE SCAN | IND_DOU_NAM_TYP | 2277 | | 11 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("NAME"='T_DOUBLE_IND' AND "TYPE"='TABLE')
SQL> SET AUTOT OFF
SQL> UPDATE T_DOUBLE_IND
2 SET TYPE = 'TABLE'
3 WHERE ID <= 20000;
20000 rows updated.
SQL> COMMIT;
Commit complete.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T_DOUBLE_IND')
PL/SQL procedure successfully completed.
SQL> SET AUTOT TRACE EXP
SQL> SELECT ID, NAME, TYPE
2 FROM T_DOUBLE_IND
3 WHERE NAME = 'T_DOUBLE_IND'
4 AND TYPE = 'TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 4134729579
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7778 | 258K| 2010 (1)| 00:00:29 |
|* 1 | TABLE ACCESS FULL| T_DOUBLE_IND | 7778 | 258K| 2010 (1)| 00:00:29 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NAME"='T_DOUBLE_IND' AND "TYPE"='TABLE')
在建立复合索引后,对于刚才的查询,Oracle会自动选择复合索引。即使将表中的20000条记录全部更新为NAME=’T_DOUBLE_BIN’,由于TYPE列的选项条件,在加上索引相对于表来说要窄得多,Oracle仍然选择了复合索引扫描。
如果将TYPE列也更新20000条记录,使得NAME=’T_DOUBLE_IND’和TYPE=’TABLE’的选择性很差,这时Oracle就会自动选择全表扫描来代替索引扫描。
|
|