PostgreSQL数据库SQL优化案例:从2秒到2毫秒

零 背景说明

这是生产环境下,一则PostgreSQL数据库SQL优化的案例小结:单条SQL执行时间,从优化前2秒到优化后,降低到2毫秒,性能提升1000倍。数据库版本为PostgreSQL 9.3。原始SQL语句如下:

SELECT t.id,
       t.code,
       t.name,
       t.level,
       t.IS_CLOSED isClosed,
       t.PARENT_ID parentId,
       t.dir,
       t.remark
FROM t_report_type t
WHERE 1=1
  AND t.code IN
    (SELECT report_type
     FROM t_report_temp
     WHERE corp_id IS NULL
       AND partner_id IS NULL)
ORDER BY t.code

优化前执行计划如下:

EXPLAIN ANALYZE
SELECT t.id,
       t.code,
       t.name,
       t.level,
       t.IS_CLOSED isClosed,
       t.PARENT_ID parentId,
       t.dir,
       t.remark
FROM t_report_type t
WHERE 1=1
  AND t.code IN
    (SELECT report_type
     FROM t_report_temp
     WHERE corp_id IS NULL
       AND partner_id IS NULL)
ORDER BY t.code

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                       QUERY PLAN                                                                       │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Sort  (cost=3179.42..3179.93 rows=204 width=165) (actual time=2179.224..2179.250 rows=383 loops=1)                                                     │
│   Sort Key: t.code                                                                                                                                     │
│   Sort Method: quicksort  Memory: 126kB                                                                                                                │
│   ->  Nested Loop Semi Join  (cost=0.55..3171.59 rows=204 width=165) (actual time=26.092..2178.526 rows=383 loops=1)                                   │
│         ->  Seq Scan on t_report_type t  (cost=0.00..16.59 rows=459 width=165) (actual time=0.002..0.187 rows=459 loops=1)                             │
│         ->  Index Only Scan using t_report_temp_01 on t_report_temp  (cost=0.55..1873.80 rows=428 width=7) (actual time=4.744..4.744 rows=1 loops=459) │
│               Index Cond: ((corp_id IS NULL) AND (partner_id IS NULL) AND (report_type = (t.code)::text))                                              │
│               Heap Fetches: 53                                                                                                                         │
│ Total runtime: 2179.301 ms                                                                                                                             │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)

一 分析问题

结合执行来分析SQL语句,查看表结构,数据量分布:

zyd=> \d t_report_type
                 Table "zyd.t_report_type"
┌───────────┬────────────────────────┬─────────────────────┐
│  Column   │          Type          │      Modifiers      │
├───────────┼────────────────────────┼─────────────────────┤
│ id        │ character(32)          │ not null            │
│ code      │ character varying(32)  │ not null            │
│ name      │ character varying(256) │ not null            │
│ parent_id │ character varying(32)  │                     │
│ dir       │ character varying(256) │                     │
│ level     │ smallint               │                     │
│ is_closed │ character(1)           │ default '0'::bpchar │
│ remark    │ character varying(200) │                     │
└───────────┴────────────────────────┴─────────────────────┘
Indexes:
    "t_report_type_pkey" PRIMARY KEY, btree (id)
    "t_report_type_01" btree (code, parent_id, dir)

zyd=> select count(*) from t_report_type;
┌───────┐
│ count │
├───────┤
│   459 │
└───────┘
(1 row)

zyd=> select count(*) from t_report_temp;
┌────────┐
│ count  │
├────────┤
│ 366132 │
└────────┘
(1 row)

zyd=> \d t_report_temp
                 Table "zyd.t_report_temp"
┌─────────────┬────────────────────────────────┬───────────┐
│   Column    │              Type              │ Modifiers │
├─────────────┼────────────────────────────────┼───────────┤
│ id          │ character(32)                  │ not null  │
│ report_code │ character varying(40)          │ not null  │
│ report_name │ character varying(100)         │ not null  │
│ report_type │ character varying(32)          │ not null  │
│ corp_id     │ character(32)                  │           │
│ partner_id  │ character(32)                  │           │
│ industry_id │ character(32)                  │           │
│ is_default  │ character(1)                   │           │
│ remark      │ character varying(200)         │           │
│ source_type │ character varying(255)         │           │
│ is_show     │ character(1)                   │ default 0 │
│ create_ts   │ timestamp(6) without time zone │           │
│ update_ts   │ timestamp(6) without time zone │           │
│ is_simple   │ character(1)                   │           │
└─────────────┴────────────────────────────────┴───────────┘
Indexes:
    "t_report_temp_pkey" PRIMARY KEY, btree (id)
    "t_report_temp_01" btree (corp_id, industry_id, partner_id, report_code, report_type, report_name)
    "t_report_temp_02" btree (industry_id, partner_id, report_type)
    "t_report_temp_03" btree (industry_id, report_type)
    "t_report_temp_04" btree (report_name, corp_id, report_code)
    "t_report_temp_index_1" btree (corp_id, report_code)

zyd=> select count(*) from t_report_temp where corp_id is null and partner_id is null;
┌───────┐
│ count │
├───────┤
│  1126 │
└───────┘
(1 row)

zyd=>

看到,这是一个简单的2张表做join连接的SQL查询语句,优化器选择了嵌套循环nested loop的关联方式,其中一张表t_report_type为小表,数据量为459,优化器选择全表扫描的方式获取数据,另外一张表t_report_temp数据量相对多一些,366132条数据,但是满足条件的数据却只有1126条,同时选择了index only scan的方式,单次执行耗时4.744毫秒。

这是从执行计划看到的基本信息,能优化的突破口基本也就是对于表的访问能不能更快?虽然这里选择的是index only scan的方式。

三 解决问题

先尝试重新收集表的统计信息,再查看其执行计划是否有好转?

zyd=> analyze verbose t_report_temp;
INFO:  analyzing "zyd.t_report_temp"
INFO:  "t_report_temp": scanned 9494 of 9494 pages, containing 366132 live rows and 1345 dead rows; 30000 rows in sample, 366132 estimated total rows
ANALYZE
zyd=> analyze verbose t_report_type;
INFO:  analyzing "zyd.t_report_type"
INFO:  "t_report_type": scanned 12 of 12 pages, containing 459 live rows and 0 dead rows; 459 rows in sample, 459 estimated total rows
ANALYZE
zyd=>
...查看执行计划
...
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                       QUERY PLAN                                                                       │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Sort  (cost=3179.42..3179.93 rows=204 width=165) (actual time=1973.808..1973.836 rows=383 loops=1)                                                     │
│   Sort Key: t.code                                                                                                                                     │
│   Sort Method: quicksort  Memory: 126kB                                                                                                                │
│   ->  Nested Loop Semi Join  (cost=0.55..3171.59 rows=204 width=165) (actual time=29.583..1973.127 rows=383 loops=1)                                   │
│         ->  Seq Scan on t_report_type t  (cost=0.00..16.59 rows=459 width=165) (actual time=0.002..0.175 rows=459 loops=1)                             │
│         ->  Index Only Scan using t_report_temp_01 on t_report_temp  (cost=0.55..1873.80 rows=428 width=7) (actual time=4.297..4.297 rows=1 loops=459) │
│               Index Cond: ((corp_id IS NULL) AND (partner_id IS NULL) AND (report_type = (t.code)::text))                                              │
│               Heap Fetches: 53                                                                                                                         │
│ Total runtime: 1973.885 ms                                                                                                                             │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)

zyd=>

发现,并没有好转。观察到其执行计划中,对于相对大的那个表t_report_temp,优化器选择了通过t_report_temp_01这个index来做的index only scan,每次耗时4毫秒左右,但是loops=459,累积起来就1800多毫秒了。

同时,该indext_report_temp_01是一个联合index,基于(corp_id, industry_id, partner_id, report_code, report_type, report_name) 6个字段创建的联合index,观察表和该index的大小:

zyd=> \dt+ t_report_temp
                        List of relations
┌────────┬───────────────┬───────┬───────┬───────┬──────────────┐
│ Schema │     Name      │ Type  │ Owner │ Size  │ Description  │
├────────┼───────────────┼───────┼───────┼───────┼──────────────┤
│ zyd    │ t_report_temp │ table │ zyd   │ 74 MB │ 报表模板主表 │
└────────┴───────────────┴───────┴───────┴───────┴──────────────┘
(1 row)

zyd=> \di+ t_report_temp_01
                                 List of relations
┌────────┬──────────────────┬───────┬───────┬───────────────┬───────┬─────────────┐
│ Schema │       Name       │ Type  │ Owner │     Table     │ Size  │ Description │
├────────┼──────────────────┼───────┼───────┼───────────────┼───────┼─────────────┤
│ zyd    │ t_report_temp_01 │ index │ zyd   │ t_report_temp │ 80 MB │             │
└────────┴──────────────────┴───────┴───────┴───────────────┴───────┴─────────────┘
(1 row)

zyd=>

发现,这个index 80MB,比表本身74MB还大。是不是推断出,此种情况下,优化器选择通过该index来获取数据不够好呢,或者说还不如直接扫描这个表呢?

开始验证:

zyd=> set enable_indexscan =off;
SET
zyd=> set enable_indexonlyscan =off;
SET
zyd=> set enable_bitmapscan =off;
SET
zyd=> explain analyze..
..
..
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                             QUERY PLAN                                                              │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Merge Join  (cost=13420.55..13423.58 rows=204 width=165) (actual time=69.095..69.417 rows=383 loops=1)                              │
│   Merge Cond: ((t.code)::text = (t_report_temp.report_type)::text)                                                                  │
│   ->  Sort  (cost=36.88..38.03 rows=459 width=165) (actual time=0.513..0.569 rows=459 loops=1)                                      │
│         Sort Key: t.code                                                                                                            │
│         Sort Method: quicksort  Memory: 145kB                                                                                       │
│         ->  Seq Scan on t_report_type t  (cost=0.00..16.59 rows=459 width=165) (actual time=0.002..0.121 rows=459 loops=1)          │
│   ->  Sort  (cost=13383.67..13383.79 rows=49 width=7) (actual time=68.576..68.600 rows=383 loops=1)                                 │
│         Sort Key: t_report_temp.report_type                                                                                         │
│         Sort Method: quicksort  Memory: 42kB                                                                                        │
│         ->  HashAggregate  (cost=13381.80..13382.29 rows=49 width=7) (actual time=68.253..68.309 rows=383 loops=1)                  │
│               ->  Seq Scan on t_report_temp  (cost=0.00..13163.63 rows=87269 width=7) (actual time=0.005..67.787 rows=1126 loops=1) │
│                     Filter: ((corp_id IS NULL) AND (partner_id IS NULL))                                                            │
│                     Rows Removed by Filter: 365114                                                                                  │
│ Total runtime: 69.480 ms                                                                                                            │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(14 rows)

zyd=> 

此时,可以看到,当我们禁用index only scan,index scan,bitmap index scan之后,使优化器对于表t_report_temp选择全表扫描之后,整个SQL的执行效率反而更好。
于是,尝试在(report_type,corp_id,partner_id)创建联合index,观察一下SQL效率:

zyd=> create index CONCURRENTLY idx_3 on t_report_temp(report_type,corp_id,partner_id);
CREATE INDEX
zyd=> explain analyze ....
..
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                QUERY PLAN                                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Sort  (cost=267.54..268.05 rows=204 width=165) (actual time=2.525..2.554 rows=383 loops=1)                                                │
│   Sort Key: t.code                                                                                                                        │
│   Sort Method: quicksort  Memory: 126kB                                                                                                   │
│   ->  Nested Loop Semi Join  (cost=0.42..259.72 rows=204 width=165) (actual time=0.041..2.083 rows=383 loops=1)                           │
│         ->  Seq Scan on t_report_type t  (cost=0.00..16.59 rows=459 width=165) (actual time=0.002..0.053 rows=459 loops=1)                │
│         ->  Index Only Scan using idx_3 on t_report_temp  (cost=0.42..29.58 rows=428 width=7) (actual time=0.004..0.004 rows=1 loops=459) │
│               Index Cond: ((report_type = (t.code)::text) AND (corp_id IS NULL) AND (partner_id IS NULL))                                 │
│               Heap Fetches: 27                                                                                                            │
│ Total runtime: 2.600 ms                                                                                                                   │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)

zyd=>

发现,此时SQL的执行耗时从之前的2秒降低到2.6毫秒。

四 小结

通过深入分析SQL执行计划、结合表结构、数据量及数据分布,是做好SQL优化的基本出发点。需要更为深入学习和分析SQL执行计划,能看懂SQL执行计划,对SQL执行计划有感觉,多看多读执行计划,多思考。

PostgreSQL官方文档中,关于执行计划描述,有这么一句话:

Plan-reading is an art that requires some experience to master, but this section attempts to cover the basics.

另辟蹊径的一则PostgreSQL数据库SQL优化案例

一 问题现象

早上,收到项目组诉求,业务系统的某个模块从昨天下午3:30左右出现有数据插入失败的情况。希望能从数据库端或者服务器上看看有没有异常现象,并给予支持。

登录到数据库控制台上,经过初步排查发现,看到下述现象:

 

从昨天上午10:29开始,系统频繁出现类似插入SQL,且每次执行耗时1秒。更为糟糕的是,该类型SQL一直持续到今天早上,不停的在执行。

二 分析问题

拎出其中一条SQL进行分析

INSERT
INTO t_ai_prd_history_effective_record
  (
    ID,
    create_time,
    cust_category,
    cust_id,
    cust_name,
    company_id,
    company_name,
    template_no,
    template_name,
    template_field_identifier,
    template_field_name,
    submit_value
  )
SELECT random_string (32),
  now(),
  $1,
  $2,
  $3,
  $4,
  $5,
  $6,
  $7,
  $8,
  $9,
  $10
WHERE NOT EXISTS
  (SELECT 1
  FROM t_ai_prd_history_effective_record r
  WHERE COALESCE (r.cust_category, '')           = COALESCE ($11, '')
  AND COALESCE (r.cust_id, '')                   = COALESCE ($12, '')
  AND COALESCE (r.company_id, '')                = COALESCE ($13, '')
  AND COALESCE (r.template_no, '')               = COALESCE ($14, '')
  AND COALESCE (r.template_field_identifier, '') = COALESCE ($15,'')
  AND COALESCE (r.submit_value, '')              = COALESCE ($16, '')
  )

对应的参数为:

params: $1                                   = 'MY001',
  $2                                             = 'b8168c7513014b0c9769f3d61574833d',
  $3                                             = 'WP MANAGEMENT SERVICES',
  $4                                             = '1BABS7HSS5UH01FE140A000085535171',
  $5                                             = 'KLINIK PERGIGIAN M DENTAL',
  $6                                             = 'MYBL0303',
  $7                                             = 'Expenses',
  $8                                             = 'InvoiceDate',
  $9                                             = 'InvoiceDate(发票日期)',
  $10                                            = '20170614',
  $11                                            = 'MY001',
  $12                                            = 'b8168c7513014b0c9769f3d61574833d',
  $13                                            = '1BABS7HSS5UH01FE140A000085535171',
  $14                                            = 'MYBL0303',
  $15                                            = 'InvoiceDate',
  $16                                            = '20170614'

显然,该SQL的目的是要向表t_ai_prd_history_effective_record进行有条件的插入数据,其条件是表中不存在

COALESCE (r.cust_category, '')           = COALESCE ('DZ001', '')
ai->   AND COALESCE (r.cust_id, '')                   = COALESCE ('b151ad4f86ab4ec5aee8c4cc377e9eb7', '')
ai->   AND COALESCE (r.company_id, '')                = COALESCE ('04cb580238dc49af8bfb46e00e959a1a', '')
ai->   AND COALESCE (r.template_no, '')               = COALESCE ('KJDZ0101', '')
ai->   AND COALESCE (r.template_field_identifier, '') = COALESCE ('ItemQuantity','')
ai->   AND COALESCE (r.submit_value, '')              = COALESCE ('10100$__$6080$__$$__$$__$', '');

的情况下。具体体现在SQL中的WHERE NOT EXISTS。

那么,问题就明了了,应该是出现在NOT EXISTS的子查询上。

接下来,单独看看这个子查询的执行计划:

 ai=> explain analyze SELECT 1
ai->   FROM t_ai_prd_history_effective_record r
ai->   WHERE COALESCE (r.cust_category, '')           = COALESCE ('DZ001', '')
ai->   AND COALESCE (r.cust_id, '')                   = COALESCE ('b151ad4f86ab4ec5aee8c4cc377e9eb7', '')
ai->   AND COALESCE (r.company_id, '')                = COALESCE ('04cb580238dc49af8bfb46e00e959a1a', '')
ai->   AND COALESCE (r.template_no, '')               = COALESCE ('KJDZ0101', '')
ai->   AND COALESCE (r.template_field_identifier, '') = COALESCE ('ItemQuantity','')
ai->   AND COALESCE (r.submit_value, '')              = COALESCE ('10100$__$6080$__$$__$$__$', '');
                                                                                                                                                                                                                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on t_ai_prd_history_effective_record r  (cost=0.00..203817.53 rows=1 width=0) (actual time=1266.116..1267.093 rows=1 loops=1)
   Filter: (((COALESCE(cust_category, ''::character varying))::text = 'DZ001'::text) AND ((COALESCE(cust_id, ''::character varying))::text = 'b151ad4f86ab4ec5aee8c4cc377e9eb7'::text) AND ((COALESCE(company_id, ''::character varying))::text = '04cb580238dc49af8bfb46e00e959a1a'::text) AND ((COALESCE(template_no, ''::character varying))::text = 'KJDZ0101'::text) AND ((COALESCE(template_field_identifier, ''::character varying))::text = 'ItemQuantity'::text) AND (COALESCE(submit_value, ''::text) = '10100$__$6080$__$$__$$__$'::text))
   Rows Removed by Filter: 3193300
 Planning time: 0.129 ms
 Execution time: 1267.133 ms
(5 rows)
ai=>

t_ai_prd_history_effective_record表数据量:

ai=> select count(*) from t_ai_prd_history_effective_record;
  count 
---------
 3193138
(1 row)
ai=>

意料之中,每次执行这个去重的不存在判断,都对这个拥有300多万条数据的t_ai_prd_history_effective_record表执行一次全表扫描,可想而知,执行效率肯定低下,每次耗时1秒多,且随着该表的数据不断增多,插入的速度将变得越来越慢。

这里,可以说是,每一条记录的插入,都将导致下一次插入变得更慢!

三 解决方案

找到了问题的症结,就要着手优化了。终极目标就是怎么可以绕过那个NOT exists的子查询?

经过和BPO部门林朝荣同学讨论,给出下述方案:

  1. 能不能对于t_ai_prd_history_effective_record表的子查询的那6个字段上建立一个唯一性索引,如果可以的话,那么我们就在插入的时候不再需要那个where条件判断去重了,而由这个唯一性索引的约束去做判断,如果插入重复数据的话,就违反约束,报错抛出异常。这样的话,看似解决了这个问题,但是,如果从业务上考虑的话,的确是存在过多的重复数据,如果报错的话,业务人员会在系统上频繁的收到报错信息,不是很友好。
  2. 既然这样的话,那问题就变的更简单了。直接粗暴的把SQL的where not exists的去重判断删掉,先一股脑儿的插入,不论是否重复,然后,写个批处理的SQL,放到晚上再统一把重复的记录删掉,问题不就解决了嘛。

给出一个删除PostgreSQL数据库表重复记录的示例:

ai=> create table t(id int primary key,name varchar(10),addr varchar(10));
CREATE TABLE
ai=> insert into t values(1,'onlyou.com','xiamen');
INSERT 0 1
ai=> insert into t values(2,'apple.com','usa');   
INSERT 0 1
ai=> insert into t values(3,'apple.com','usa');
INSERT 0 1
ai=> insert into t values(4,'google','usa');
INSERT 0 1
ai=> insert into t values(5,'google','usa');
INSERT 0 1
ai=> select * from t;
 id |    name    |  addr 
----+------------+--------
  1 | onlyou.com | xiamen
  2 | apple.com  | usa
  3 | apple.com  | usa
  4 | google     | usa
  5 | google     | usa
(5 rows)
ai=> select t1.id from t t1 where t1.ctid<>(select max(t2.ctid) from t t2 where t1.name=t2.name and t1.addr=t2.addr);
 id
----
  2
  4
(2 rows)
ai=> delete from t where id in(select t1.id from t t1 where t1.ctid<>(select max(t2.ctid) from t t2 where t1.name=t2.name and t1.addr=t2.addr));
DELETE 2
ai=> select * from t;                                                                                                                          
 id |    name    |  addr 
----+------------+--------
  1 | onlyou.com | xiamen
  3 | apple.com  | usa
  5 | google     | usa
(3 rows)
ai=>

四 小结

发现并找到慢SQL的问题所在,进行有针对性的优化改造,解决问题往往不一定难,深入思考一下多尝试。

以上,给出一个优化的思路和方向。

如何在PostgreSQL中使用partial index来优化SQL?

一 案例背景

从生产控制台上看到一条下述SQL:

格式化之后SQL语句为:

select '01' status, count(1) from t_ai_prd_item where status = '01' and deleted = false	
union all 
select '02' status, count(1) from t_ai_prd_item where status = '02' and deleted = false

一个union all的联合查询,每次执行耗时1秒。有没有优化余地呢?

 二 优化分析

下述通过copy原表t_ai_prd_item为t1,来进行分析优化。

1 看SQL执行计划及数据量分布:

ai=> explain analyze select '01' status, count(1) from t1 where status = '01' and deleted = false 
ai-> union all 
ai-> select '02' status, count(1) from t1 where status = '02' and deleted = false;
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Append  (cost=160483.13..320974.20 rows=2 width=40) (actual time=5211.374..6250.940 rows=2 loops=1)
   ->  Aggregate  (cost=160483.13..160483.14 rows=1 width=0) (actual time=5211.374..5211.374 rows=1 loops=1)
         ->  Seq Scan on t1  (cost=0.00..160483.12 rows=1 width=0) (actual time=5211.369..5211.369 rows=0 loops=1)
               Filter: ((NOT deleted) AND ((status)::text = '01'::text))
               Rows Removed by Filter: 1395189
   ->  Aggregate  (cost=160491.03..160491.04 rows=1 width=0) (actual time=1039.563..1039.563 rows=1 loops=1)
         ->  Seq Scan on t1 t1_1  (cost=0.00..160483.12 rows=3163 width=0) (actual time=901.577..1039.307 rows=2835 loops=1)
               Filter: ((NOT deleted) AND ((status)::text = '02'::text))
               Rows Removed by Filter: 1392354
 Planning time: 0.417 ms
 Execution time: 6251.024 ms
(11 rows)

ai=> select count(*) from t_ai_prd_item;                       
  count  
---------
 1395189
(1 row)

ai=> select count(*),status from t_ai_prd_item group by status;
  count  | status 
---------+--------
     364 | 04
      25 | 05
    2835 | 02
 1391965 | 06
(4 rows)

ai=>

分析发现,表1中有1395189条记录,status=’01’的记录为0,status=’02’的记录为2835条。

2 那么在status字段上建立一个btree index,效果会怎么样呢?

ai=> create index idx_status on t1(status);
CREATE INDEX
ai=> explain analyze select '01' status, count(1) from t1 where status = '01' and deleted = false 
ai-> union all 
ai-> select '02' status, count(1) from t1 where status = '02' and deleted = false;
                                                               QUERY PLAN                                                               
----------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=8.45..730.24 rows=2 width=40) (actual time=0.037..3.999 rows=2 loops=1)
   ->  Aggregate  (cost=8.45..8.46 rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1)
         ->  Index Scan using idx_status on t1  (cost=0.43..8.45 rows=1 width=0) (actual time=0.033..0.033 rows=0 loops=1)
               Index Cond: ((status)::text = '01'::text)
               Filter: (NOT deleted)
   ->  Aggregate  (cost=721.76..721.77 rows=1 width=0) (actual time=3.962..3.962 rows=1 loops=1)
         ->  Index Scan using idx_status on t1 t1_1  (cost=0.43..714.10 rows=3063 width=0) (actual time=0.029..3.673 rows=2835 loops=1)
               Index Cond: ((status)::text = '02'::text)
               Filter: (NOT deleted)
 Planning time: 0.291 ms
 Execution time: 4.067 ms
(11 rows)

ai=>

效果不错,但是结合到实际业务场景,此处只是针对status=’01’和status=’02’的情况来做统计。那么有没有更好的方法来解决这个场景呢?

3 创建1个partial index来测试

ai=> create index idx_partial on t1(status) where (status='01' or status='02');
CREATE INDEX
ai=> explain analyze select '01' status, count(1) from t1 where status = '01' and deleted = false 
ai-> union all 
ai-> select '02' status, count(1) from t1 where status = '02' and deleted = false;
                                                               QUERY PLAN                                                               
----------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=8.30..732.19 rows=2 width=40) (actual time=0.019..3.916 rows=2 loops=1)
   ->  Aggregate  (cost=8.30..8.31 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=1)
         ->  Index Scan using idx_partial on t1  (cost=0.28..8.30 rows=1 width=0) (actual time=0.015..0.015 rows=0 loops=1)
               Index Cond: ((status)::text = '01'::text)
               Filter: (NOT deleted)
   ->  Aggregate  (cost=723.85..723.86 rows=1 width=0) (actual time=3.897..3.897 rows=1 loops=1)
         ->  Index Scan using idx_partial on t1 t1_1  (cost=0.28..716.20 rows=3063 width=0) (actual time=0.030..3.599 rows=2835 loops=1)
               Index Cond: ((status)::text = '02'::text)
               Filter: (NOT deleted)
 Planning time: 0.234 ms
 Execution time: 3.992 ms
(11 rows)

ai=>

对比,发现SQL的执行效率几乎没有差别。但是:

ai=> \di+ idx_status 
                         List of relations
 Schema |    Name    | Type  | Owner | Table | Size  | Description 
--------+------------+-------+-------+-------+-------+-------------
 ai     | idx_status | index | ai    | t1    | 30 MB | 
(1 row)

ai=> \di+ idx_partial 
                         List of relations
 Schema |    Name     | Type  | Owner | Table | Size  | Description 
--------+-------------+-------+-------+-------+-------+-------------
 ai     | idx_partial | index | ai    | t1    | 80 kB | 
(1 row)

ai=>

4 小结:

在确保SQL执行效率的同时,这个partial index所占的存储空间是b-tree index的1/384,大大降低了存储空间的开销。

三 关于partial index

1 什么是partial index?

分区索引是基于表中部分数据建立的,被索引的部分数据是原数据的一个子集。这个子集数据是基于某些条件可以明确计算出来的。

2 partial index适用场景?

对于表中数据分布“不均匀”的场景,比如某些数据占据绝大多数,而其它一些数据只占少量,且要满足查询条件针是对那些小数据量的查询情形。此例正好满足这种情况,总数据量为140万左右,而状态为01和02的数据只占极少的比例,且查询是针对状态为01和02的查询。

通过partial index可以排除大量普通数据或业务上“不感兴趣”的数据。如:一个在线订单系统,可以针对那些不在经常访问的客户端IP范围之外的IP进行创建分区索引,或者针对已下单还未支付的订单进行分区索引的创建。这样,当查询那些不在常用IP范围内的订单,或者那些已下单未支付的订单信息时,可以保证查询效率。

3 partial index的优势?

由于其是针对部分数据进行索引,显然同传统b-tree索引相比,其所占磁盘空间将大大缩小。当重建维护这些分区索引时,速度也非常快。

4 参考

partial index官方文档地址:
https://www.postgresql.org/docs/9.3/static/indexes-partial.html

索引扫描路径之3:Index Skip Scan

Index_Skip_Scan

在前面的两篇中,分别描述了Index Unique ScanIndex Range Scan,今天来描述Index Skip Scan。

 所谓的Index Skip Scan就是指,查询的WHERE条件中“SKIP”【跳过,没带上】了复合索引的前导列,只是指定了复合索引的其它列,这样CBO选择Index Skip Scan的路径来访问数据的方式。

如果复合索引的前导列唯一值越少(选择性低),非前导列索引唯一值越多(选择性高),则Index Skip Scan的性能越优。

事实上,在处理的过程中,Oracle将该复合索引上分为若干个logical subindexes(逻辑子索引),逻辑子索引的具体个数取决于前导列的唯一值值个数,即前导列有几个重复值就分为几个逻辑子索引。

Index Skip Scan适用的场景:

1 Index一定是一个复合索引(索引类型可以是UNIQUE的唯一索引,也可以是NONUIQUE的非唯一索引);
2 且复合索引的前导列重复值很多(即唯一值少),非前导列的重复值很少(唯一值多);
3 WHERE限制条件中,并没有带上该符合索引的前导列,仅仅是使用非前导列来过滤数据;

测试与验证:

1 创建测试表:

SQL> conn hr/hr
Connected.
SQL> create table skip_t as select object_id,object_name,object_type from all_objects;

Table created.

SQL> select count(*) from skip_t;

  COUNT(*)
----------
     55683

SQL> select count(distinct object_type) from skip_t;

COUNT(DISTINCTOBJECT_TYPE)
--------------------------
                        25

SQL> select count(distinct object_id) from skip_t;

COUNT(DISTINCTOBJECT_ID)
------------------------
                   55683

SQL> select count(distinct object_name) from skip_t;

COUNT(DISTINCTOBJECT_NAME)
--------------------------
                     30902

SQL>

从上,看到skip_t表数据情况:有55683条记录,且OBJECT_ID字段均唯一,无重复值。OBJECT_TYPE唯一值只有25个。
2 接下来,在OBJECT_TYPE和OBJECT_ID字段上创建复合索引,并收集统计信息:

SQL> create index idx_skip_t on skip_t(object_type,object_id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'skip_t',cascade=>true);

PL/SQL procedure successfully completed.

SQL>

3 “Skip”掉复合索引前导列的情况,即让CBO按照预期的选择Index Skip Scan的路径来访问数据:

SQL> set autotrace trace
SQL> select object_type,object_id from skip_t where object_id=30000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2119292092

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |     1 |    15 |    26   (0)| 00:00:01 |
|*  1 |  INDEX SKIP SCAN | IDX_SKIP_T |     1 |    15 |    26   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_ID"=30000)
       filter("OBJECT_ID"=30000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         14  consistent gets
          0  physical reads
          0  redo size
        611  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select object_type,object_name from skip_t where object_id=30000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1237951313

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |    40 |    27   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| SKIP_T     |     1 |    40 |    27   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | IDX_SKIP_T |     1 |       |    26   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=30000)
       filter("OBJECT_ID"=30000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
        640  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

4 使用索引前导列的情况:

SQL> select object_type,object_id from skip_t where object_type='TABLE';

136 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3840590361

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |  2227 | 33405 |     9   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_SKIP_T |  2227 | 33405 |     9   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         12  consistent gets
          0  physical reads
          0  redo size
       3592  bytes sent via SQL*Net to client
        623  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        136  rows processed

SQL> select object_type,object_name from skip_t where object_type='TABLE';

136 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1483414757

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |  2227 | 80172 |    43   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| SKIP_T     |  2227 | 80172 |    43   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_SKIP_T |  2227 |       |     9   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         40  consistent gets
          0  physical reads
          0  redo size
       6671  bytes sent via SQL*Net to client
        623  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        136  rows processed

SQL>

索引扫描路径之2:Index Range Scan

index_range_scan

在上一篇文章,索引扫描路径之1:Index Unique Scan 中,详细描述了Oracle的CBO优化器会在何种情形下选择通过Index Unique Scan的方式来访问数据。

本篇描述CBO优化器会在什么场景下选择Index Range Scan的路径来访问数据?

Index Range Scan通常见于通过索引去查找具有高选择性的数据。默认情况下,Oracle是按照被索引字段升序排序来存放索引记录的,如果被索引字段有重复值,则按照相应的ROWID做升序排序来存放。

较为常见的Index Range ScanWHERE条件如下:

Col1=:b1

Col1>:b1

Col1>=:b1

Col1<:b1

Col1<=:b1

或者是通过AND连接的满足上述可构成前导列的条件。再或者WHERE条件中有使用BETWEEN…AND条件。

同样,给出验证与说明:

系统版本、数据库版本和平台,以及测试表同与之前的环境。即选择Linux X86_64平台上的一套11.2.0.1.0库,使用HR这个schema下的departments表。

1 唯一索引,CBO选择Index Range Scan的场景

SQL> set autotrace trace
SQL> select * from departments where department_id>260;

Execution Plan
----------------------------------------------------------
Plan hash value: 3346631158

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    20 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS |     1 |    20 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | DEPT_ID_PK  |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DEPARTMENT_ID">260)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        743  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

从上,即使是从唯一索引中获取一条数据[满足department_id>260条件的记录,确实只有1条],优化器也选择了走INDEX RANGE SCAN,因为上述SQL并不满足Index Unique Scan的条件。恰恰满足了INDEX RANGE SCAN 的条件。

2 接下来,一个典型的使用INDEX RANGE SCAN 的场景:

SQL> select * from departments where location_id=1700;

21 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 735461860

------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    21 |   420 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS      |    21 |   420 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | DEPT_LOCATION_IX |    21 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("LOCATION_ID"=1700)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          1  physical reads
          0  redo size
       1468  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         21  rows processed

SQL>

或者:

SQL> select * from departments where location_id<=1700;

23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 735461860

------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    24 |   480 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS      |    24 |   480 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | DEPT_LOCATION_IX |    24 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("LOCATION_ID"<=1700)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1502  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         23  rows processed

SQL>

再或者:

SQL> select * from departments where location_id>=1700;

25 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 735461860

------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    25 |   500 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS      |    25 |   500 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | DEPT_LOCATION_IX |    25 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("LOCATION_ID">=1700)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1561  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL>

可以看到,不管是location_id>=1700还是location_id=1700或者是location_id<=1700条件,CBO都选择了location_id字段上的索引DEPT_LOCATION_IX通过INDEX RANGE SCAN 的方式去获取数据。

3 接下来,BETWEEN…AND的场景:

SQL> select department_id,location_id,rowid from departments where location_id between 1500 and 2600;

25 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 735461860

------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    26 |   494 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS      |    26 |   494 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | DEPT_LOCATION_IX |    26 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

     2 - access("LOCATION_ID">=1500 AND "LOCATION_ID"<=2600)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1556  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL>

那么,如果是通过department_id字段来做BETWEEN…AND条件测试呢?即测试在一个唯一性索引上使用BETWEEN…AND条件的情形。

SQL> select department_id,location_id,rowid from departments where department_id between 120 and 130;

Execution Plan
----------------------------------------------------------
Plan hash value: 3346631158

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     3 |    57 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS |     3 |    57 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | DEPT_ID_PK  |     3 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DEPARTMENT_ID">=120 AND "DEPARTMENT_ID"<=130)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        748  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

SQL>

测试结果发现,即使是唯一索引,CBO依然会选择INDEX RANGE SCAN。
反之,如果条件换做是department_id between 120 and 120的话,很明显,聪明的CBO这次肯定会选择Index Unique Scan。

SQL> select department_id,location_id,rowid from departments where department_id between 120 and 120;

Execution Plan
----------------------------------------------------------
Plan hash value: 4024094692

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    19 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPARTMENTS |     1 |    19 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | DEPT_ID_PK  |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DEPARTMENT_ID"=120)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        673  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

4 最后,看一个关于INDEX RANGE SCAN DESCENDING的场景:

CBO选择INDEX RANGE SCAN DESCENDING的方式来获取数据,适用于 ORDER BY column_name DESC的场景。在默认情况下,索引是按照索引关键字升序来存放数据的。

按照常理来讲,如果有ORDER BY column_name DESC的条件,那么

① 需要先从索引中读取数据;

② 再按照条件中column_name字段做降序排序。

而选择INDEX RANGE SCAN DESCENDING,则是直接在索引上按照索引关键字降序查找数据,这样正是为了避免先按照索引来查找数据,然后再做一次降序排序的操作。

SQL> set autotrace on
SQL> select department_id,location_id,rowid from departments where location_id between 1500 and 2600 order by location_id desc;

DEPARTMENT_ID LOCATION_ID ROWID
------------- ----------- ------------------
           80        2500 AAAz+PABRAAAACbAAH
           40        2400 AAAz+PABRAAAACbAAD
           20        1800 AAAz+PABRAAAACbAAB
          270        1700 AAAz+PABRAAAACbAAa
          260        1700 AAAz+PABRAAAACbAAZ
          250        1700 AAAz+PABRAAAACbAAY
          240        1700 AAAz+PABRAAAACbAAX
          230        1700 AAAz+PABRAAAACbAAW
          220        1700 AAAz+PABRAAAACbAAV
          210        1700 AAAz+PABRAAAACbAAU
          200        1700 AAAz+PABRAAAACbAAT

DEPARTMENT_ID LOCATION_ID ROWID
------------- ----------- ------------------
          190        1700 AAAz+PABRAAAACbAAS
          180        1700 AAAz+PABRAAAACbAAR
          170        1700 AAAz+PABRAAAACbAAQ
          160        1700 AAAz+PABRAAAACbAAP
          150        1700 AAAz+PABRAAAACbAAO
          140        1700 AAAz+PABRAAAACbAAN
          130        1700 AAAz+PABRAAAACbAAM
          120        1700 AAAz+PABRAAAACbAAL
          110        1700 AAAz+PABRAAAACbAAK
          100        1700 AAAz+PABRAAAACbAAJ
           90        1700 AAAz+PABRAAAACbAAI

DEPARTMENT_ID LOCATION_ID ROWID
------------- ----------- ------------------
           30        1700 AAAz+PABRAAAACbAAC
           10        1700 AAAz+PABRAAAACbAAA
           50        1500 AAAz+PABRAAAACbAAE

25 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2689299823

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    26 |   494 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | DEPARTMENTS      |    26 |   494 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN DESCENDING| DEPT_LOCATION_IX |    26 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("LOCATION_ID">=1500 AND "LOCATION_ID"<=2600)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1575  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL>

从上可以看到,对于LOCATION_ID=1700的记录,则正是按照ROWID做的降序排序操作。

索引扫描路径之1:Index Unique Scan

index_unique_scan

近期在读一本名为《Expert Indexing in Oracle Database 11g》的书,个人觉得写的很不错,收获颇多。在这里就把一些关于CBO 通过INDEX 扫描数据的相关知识稍作整理,算是读书笔记。

其中,在读到第7章,Tuning Index Usage时,提到常见的通过索引访问数据的方式有下述几种:

Index Unique Scan;

Index Range Scan;

Index Skip Scan;

Index Full Scan;

Index Fast Full Scan;

        这里,首先说Index Unique Scan。

CBO【Cost Based Optimizer】通常在什么场景下会选择通过Index Unique Scan的方式来访问数据呢?

1 必须是通过唯一索引【UNIQUE】来访问数据;

2 通过唯一索引来访问数据时,每次返回的记录数必须是1条;

3 WHERE从句中必须要用等值【=】条件来过滤数据;

        我个人目前的理解是,必须同时满足上述3个条件,CBO才会选择通过Index Unique Scan的路径来访问数据。

下面给出验证:

1 操作系统版本和数据库版本:
操作系统:

[oracle@oracle11g ~]$ uname -rm
2.6.18-194.el5 x86_64
[oracle@oracle11g ~]$

数据库:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL>

2 测试表、表结构、表数据如下:

SQL> conn hr/hr
Connected.
SQL> desc departments;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DEPARTMENT_ID                             NOT NULL NUMBER(4)
 DEPARTMENT_NAME                           NOT NULL VARCHAR2(30)
 MANAGER_ID                                         NUMBER(6)
 LOCATION_ID                                        NUMBER(4)

SQL> select * from departments;

DEPARTMENT_ID DEPARTMENT_NAME                MANAGER_ID LOCATION_ID
------------- ------------------------------ ---------- -----------
           10 Administration                        200        1700
           20 Marketing                             201        1800
           30 Purchasing                            114        1700
           40 Human Resources                       203        2400
           50 Shipping                              121        1500
           60 IT                                    103        1400
           70 Public Relations                      204        2700
           80 Sales                                 145        2500
           90 Executive                             100        1700
          100 Finance                               108        1700
          110 Accounting                            205        1700

DEPARTMENT_ID DEPARTMENT_NAME                MANAGER_ID LOCATION_ID
------------- ------------------------------ ---------- -----------
          120 Treasury                                         1700
          130 Corporate Tax                                    1700
          140 Control And Credit                               1700
          150 Shareholder Services                             1700
          160 Benefits                                         1700
          170 Manufacturing                                    1700
          180 Construction                                     1700
          190 Contracting                                      1700
          200 Operations                                       1700
          210 IT Support                                       1700
          220 NOC                                              1700

DEPARTMENT_ID DEPARTMENT_NAME                MANAGER_ID LOCATION_ID
------------- ------------------------------ ---------- -----------
          230 IT Helpdesk                                      1700
          240 Government Sales                                 1700
          250 Retail Sales                                     1700
          260 Recruiting                                       1700
          270 Payroll                                          1700

27 rows selected.

SQL>

3 该表上的索引信息如下:

SQL> select index_name,index_type,table_name,uniqueness from user_indexes where table_name='DEPARTMENTS';

INDEX_NAME                     INDEX_TYPE                  TABLE_NAME                     UNIQUENES
------------------------------ --------------------------- ------------------------------ ---------
DEPT_ID_PK                     NORMAL                      DEPARTMENTS                    UNIQUE
DEPT_LOCATION_IX               NORMAL                      DEPARTMENTS                    NONUNIQUE

SQL> select * from user_ind_columns where table_name='DEPARTMENTS';

INDEX_NAME                     TABLE_NAME                     COLUMN_NAME     COLUMN_POSITION COLUMN_LENGTH CHAR_LENGTH DESC
------------------------------ ------------------------------ --------------- --------------- ------------- ----------- ----
DEPT_ID_PK                     DEPARTMENTS                    DEPARTMENT_ID                 1            22           0 ASC
DEPT_LOCATION_IX               DEPARTMENTS                    LOCATION_ID                   1            22           0 ASC

SQL>

从上看到departments表有2个索引,其中在DEPARTMENT_ID字段上的索引DEPT_ID_PK 为主键索引,是个UNIQUE类型的索引。而LOCATION_ID 字段上的DEPT_LOCATION_IX则为非唯一【NONUNIQUE】索引。

4 通过该表上的主键字段department_id来定位数据:

SQL> set autotrace trace
SQL> select department_id from departments where department_id=270;

Execution Plan
----------------------------------------------------------
Plan hash value: 1856623209

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |     4 |     0   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| DEPT_ID_PK |     1 |     4 |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPARTMENT_ID"=270)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        532  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

从上述执行计划里看到,CBO选择了Index Unique Scan,因为department_id字段上有一个DEPT_ID_PK 的UNIQUE类型的索引,且上述SQL满足Index Unique Scan的条件。

而如果WHERE条件换成department_id>=270时,则CBO选择了Index Range Scan的访问路径,因为该WHERE条件是>=而非等值匹配。CBO认为返回的记录数可能不止1条,即使实际情况下满足条件的记录数确实是1条,所以选择了Index Range Scan。

SQL> select department_id from departments where department_id>=270;

Execution Plan
----------------------------------------------------------
Plan hash value: 4271874676

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |     1 |     4 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| DEPT_ID_PK |     1 |     4 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPARTMENT_ID">=270)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        532  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

5 通过该表上的location_id字段来定位数据的场景:

SQL> select location_id from departments where location_id=2700;

Execution Plan
----------------------------------------------------------
Plan hash value: 2790119751

-------------------------------------------------------------------------------------
| Id  | Operation        | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                  |     1 |     3 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| DEPT_LOCATION_IX |     1 |     3 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("LOCATION_ID"=2700)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        529  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select location_id from departments where location_id=1700;

21 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2790119751

-------------------------------------------------------------------------------------
| Id  | Operation        | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                  |    21 |    63 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| DEPT_LOCATION_IX |    21 |    63 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("LOCATION_ID"=1700)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        901  bytes sent via SQL*Net to client
        535  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         21  rows processed

SQL>

从上述执行计划看到,通过LOCATION_ID 字段上的DEPT_LOCATION_IX来定位数据时,不管是通过location_id=2700还是location_id=1400或者location_id=1700,结果都是走INDEX RANGE SCAN。因为DEPT_LOCATION_IX不是唯一索引,即使location_id=2700或者location_id=1400条件能确保返回的记录数只有1条,CBO也不会选择Index Unique Scan。

6  通过唯一索引、非唯一索引来定位数据的场景:

非唯一索引的情况:

SQL> create table all_t as select object_id,object_name from all_objects;

Table created.

SQL> desc all_t;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OBJECT_ID                                 NOT NULL NUMBER
 OBJECT_NAME                               NOT NULL VARCHAR2(30)

SQL> select table_name,index_name,index_type,uniqueness from user_indexes where table_name='ALL_T';

no rows selected

SQL> create index idx_all_t on all_t(object_id);

Index created.

SQL> select table_name,index_name,index_type,uniqueness from user_indexes where table_name='ALL_T';

TABLE_NAME                     INDEX_NAME                     INDEX_TYPE                  UNIQUENES
------------------------------ ------------------------------ --------------------------- ---------
ALL_T                          IDX_ALL_T                      NORMAL                      NONUNIQUE

SQL> exec dbms_stats.gather_table_stats(user,'all_t',cascade=>true);

PL/SQL procedure successfully completed.

SQL> set autotrace trace
SQL> select * from all_t where object_id=30000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2459308467

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |    31 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ALL_T     |     1 |    31 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_ALL_T |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=30000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        627  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

唯一索引的情况:

SQL> drop index idx_all_t ;

Index dropped.

SQL> create unique index idx_all_t on all_t(object_id);

Index created.

SQL> select table_name,index_name,index_type,uniqueness from user_indexes where table_name='ALL_T';

TABLE_NAME                     INDEX_NAME                     INDEX_TYPE                  UNIQUENES
------------------------------ ------------------------------ --------------------------- ---------
ALL_T                          IDX_ALL_T                      NORMAL                      UNIQUE

SQL> set autotrace trace
SQL> exec dbms_stats.gather_table_stats(user,'all_t',cascade=>true);

PL/SQL procedure successfully completed.

SQL> select * from all_t where object_id=30000;

Execution Plan
----------------------------------------------------------
Plan hash value: 3932529197

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |    31 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ALL_T     |     1 |    31 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | IDX_ALL_T |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=30000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        491  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

从上可以看到,虽然测试表all_t上的字段object_id字段虽然的确可以确保唯一,但是在第一种情况下,索引IDX_ALL_T为非唯一索引的时候,CBO毅然选择了INDEX RANGE SCAN,而第二种情况下,却是Index Unique Scan。

也就是说,即使可以确保表中某个字段是唯一的,如果该字段上的索引不是唯一的,那么CBO选择的将是通过INDEX RANGE SCAN的路径来访问数据。想要表达的是,真实场景下这种情况的确会存在,因为有些系统是通过应用程序来保证表中的字段唯一,而并没有在表上对应的字段上通过创建唯一约束或者是唯一索引来保证数据的唯一性。那么,这样的话,将会导致CBO在选择执行计划的情况下,有可能会出现偏差。所以,对于这种情况下,即使应用程序可以保证数据的唯一性,最好还是在表上创建一个唯一索引比较稳妥。

最后,留一个小疑问给所有的看到这篇文章的朋友,对于第6种情况,是不是第二种场景下的Index Unique Scan的效率要高于第一种场景下的INDEX RANGE SCAN?因为,从执行计划来看的话,一致性读的次数分别为3和4。

关于AWR报告中几个命中率指标的初步解释

            从Oracle 10g开始,Oracle给广大DBA提供了一个性能优化的利器,那便是Automatic Workload Repository性能报告。

            在拿到一份AWR性能报告后,通过分析AWR报告来定位数据库性能问题时,在AWR报告的报告头中,我们会看到类似如下的一些命中率指标:

Instance Efficiency Percentages [Target 100%]

 

Buffer Nowait %: 99.87 Redo NoWait %: 99.95
Buffer Hit %: 95.89 In-memory Sort %: 100.00
Library Hit %: 86.87 Soft Parse %: 99.26
Execute to Parse %: 91.37 Latch Hit %: 99.73
Parse CPU to Parse Elapsd %: 53.78 % Non-Parse CPU: 98.18

 

           那么,这些关于Oracle内存的几个关键指标以及Instance效率的几个指标又该如何理解呢?

           1 这几个指标重要,但是通过这些命中率指标并非就可以定位到问题的关键所在。如上,我们看到各项指标基本都很高,除Parse CPU to Parse Elapsd %:只有53.78%之外,但是,该统计数据是来自于一则生产环境下出现严重性能问题的一个小时采样数据。

           2 分别对上述表格中各项指标作一初步解释:

               ① Buffer Nowait %:表示会话向Database Buffer Cache【数据高速缓冲区】 申请1个缓存时不等待的比例;

               ② Buffer Hit %:表示数据高速缓冲区的命中率,也叫Cache Hit Ratio。该指标要分实际业务系统类型来分析,如OLAP系统,该值可能为20%就算合理,而对于OLTP系统来讲,理想值应该在90%以上。当然,并非该值达到100%就没问题了,系统中可能依然难以避免物理读等待。计算脚本:

SELECT (1 - (phys.value / (db.value + cons.value))) * 100 AS "Buffer Cache Hit Ratio"                                            
FROM   v$sysstat phys,                                    
       v$sysstat db,                                      
       v$sysstat cons                                     
WHERE  phys.name  = 'physical reads'                      
AND    db.name    = 'db block gets'                       
AND    cons.name  = 'consistent gets';                    

               ③ Library Hit %:Library Cache Hit Ration【库高速缓冲区命中率】,表示向共享池的Library Cache中申请1个Library Cache Object对象时,其已经在Library Cache中存在的比例。该指标的一个合理值应该达到95%以上。计算脚本:

SELECT (1 -(Sum(reloads)/(Sum(pins) + Sum(reloads)))) * 100 AS "Library Cache Hit Ratio"                                              
FROM   v$librarycache;                                      

               ④ Execute to Parse %:表示执行解析比,目标是希望一次解析多次执行,计算公式=[1-(parse count (total)/(execute count)]%=[1-1257816/14576118]%=91.37%,其中parse count (total)来源于V$SYSSTAT中的parse count (total)字段值,execute count则取值于execute count的字段值。同时在同一份AWR报告中,parse count (total)execute count的值可以从AWR报告的Instance Activity Stats章节中获取,如下摘录

Instance Activity Stats

  • Ordered by statistic name
Statistic Total per Second per Trans
Batched IO (bound) vector count 560,211 157.69 28.15
CPU used by this session 1,434,831 403.88 72.10
。。。。。 。。。 。。。 。。。
execute count 14,576,118 4,102.96 732.43
。。。。。 。。。 。。。 。。。
parse count (describe) 9 0.00 0.00
parse count (failures) 28 0.01 0.00
parse count (hard) 9,364 2.64 0.47
parse count (total) 1,257,816 354.06 63.20
parse time cpu 26,723 7.52 1.34
parse time elapsed 49,687 13.99 2.50
redo entries 7,072,485 1,990.80 355.38
redo log space requests 3,665 1.03 0.18
。。。。。 。。。 。。。 。。。
sorts (disk) 7 0.00 0.00
sorts (memory) 22,108,325 6,223.16 1,110.92
。。。。。 。。。 。。。 。。。
。。。。。 。。。 。。。 。。。
write clones created in foreground 2,243 0.63 0.11

               ⑤ Parse CPU to Parse Elapsd %:该指标表示解析消耗的CPU时间与解析消耗的总时间的比值,目标同样是100%。我们当然希望解析的过程中,时间都消耗在CPU上,而不希望在解析的过程中,出现其他等待事件而拉长解析消耗的总时间。如果该指标偏低的话,说明在解析的过程中,除了消耗CPU资源外,还有其它等待事件,如等待共享池对象、闩锁。计算公式=[parse time cpu/parse time elapsed]%,parse time cpu和parse time elapsed同样来自于V$SYSSTAT,也可以参照AWR报告中Instance Activity Stats章节中的数据,如:Parse CPU to Parse Elapsd %:=[26723/49687]%=53.78%。

               ⑥ Redo NoWait %:表示会话写Redo Entry时不等待的比例。计算公式=[1-redo log space requests/redo entries]%,同样该两项指标来自于V$SYSSTAT字典表,也可以参照AWR报告中Instance Activity Stats章节中的数据,如Redo NoWait %:=[1-3665/7072485]%=[1-0.0005]%=99.95%。

                ⑦ In-memory Sort %:表示在内存中排序的比例。计算公式=[1-sorts (disk)/sorts (memory)]%,同样该两项指标来自于V$SYSSTAT字典表,也可以参照AWR报告中Instance Activity Stats章节中的数据,如In-memory Sort %:=[1-7/22108325]%=99.9999%。

         

               ⑧ Soft Parse %:表示软解析比例。计算公式=【1-parse count (hard)/parse count (total)】,同样该两项指标来自于V$SYSSTAT字典表,也可以参照AWR报告中Instance Activity Stats章节中的数据,如Soft Parse %:=[1-9364/1257816]%=99.26%。         

               ⑨ Latch Hit %:表示以 willing-to-wait 方式去获取内存栓锁的命中率指标,通常这个指标要求至少在99%以上,否则,很有可能意味着大量栓锁等待,影响性能。该值来源于V$LATCH字典表中的GETS和MISSES字段值计算脚本:

SELECT (1 - (Sum(misses) / Sum(gets))) * 100 AS "Latch Hit Ratio"                               
FROM   v$latch;                              

               ⑩ % Non-Parse CPU:表示除解析之外CPU的使用率,计算公式=【1-(parse time cpu)/(CPU used by this session)】%。同样该两项指标来自于V$SYSSTAT字典表,也可以参照AWR报告中Instance Activity Stats章节中的数据,如% Non-Parse CPU:=[1-26723/1434831]%=98.18%。        

同时遭遇row cache lock和enq: US – contention的等待事件

        上周五,接到项目组同事电话通知,说某客户应用系统无法登陆。我在应用服务器端用PL/SQL Developer尝试连接数据库服务器时,报错“ORA-00018:maximum number of sessions exceeded”,显然又是连接数不够用了。

         就电话回复同事说,赶紧检查一下各应用服务器的连接情况,原因是数据库连接数又不够用了。结果,同事接完电话之后,直接关闭了其中的一台IIS应用服务器,然后再启动这台IIS应用服务器。结果是,应用系统恢复了使用,大约20分钟后,却带了整个数据库的性能急剧下降,数据库Hung住,几乎不可用的状态。

        这是一套Windows 2003+10.2.0.5 X64的双节点RAC系统,接下来,就迅速抓取AWR报告,进行问题的定位:

        节点1的报告头:

  Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 6981 01-3月 -13 14:00:17 64 14.9
End Snap: 6982 01-3月 -13 15:00:13 186 17.2
Elapsed:   59.94 (mins)    
DB Time:   2,215.01 (mins)    

        节点2的报告头:

  Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 6981 01-3月 -13 14:00:14 65 14.2
End Snap: 6982 01-3月 -13 15:14:21 178 25.0
Elapsed:   74.12 (mins)    
DB Time:   2,991.16 (mins)    

        从上可以看到,在每个节点上,这一时段的数据库负载都很高,至少要比正常业务期间负载高出很多。同时,也看到,数据库连接数出现较为不太正常的连接。

       节点1的Top 5事件

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
db file sequential read 210,093 48,731 232 36.7 User I/O
enq: US – contention 73,040 36,420 499 27.4 Other
log file sync 146,401 14,330 98 10.8 Commit
row cache lock 11,636 13,801 1,186 10.4 Concurrency
CPU time   9,314   7.0  

        节点2的Top 5事件:

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
row cache lock 33,305 49,524 1,487 27.6 Concurrency
enq: US – contention 94,368 46,710 495 26.0 Other
db file sequential read 450,346 38,795 86 21.6 User I/O
CPU time   16,797   9.4  
direct path write temp 18,587 13,857 746 7.7 User I/O

        看到,在2个节点上均出现了row cache lock和enq: US – contention的等待事件,尤其是第2个节点上更为严重。对于row cache lock等待事件,之前曾遇到过相关案例,原因同样是由于高并发的RAC环境下,sequence没有CACHE,迅速定位并解决了这个问题。

        那么,这个enq: US – contention等待事件究竟是什么呢?Google之,找到了类似的案例:异常终止会话导致系统被Hung,以及ITPUB上的一篇帖子:row cache lock+us contention=宕机
        原来,导致enq: US – contention等待事件的原因是Undo表空间不够导致的。结合上述案例的提示,原来是因为同事直接停止IIS应用服务器,导致Oracle需要回滚之间的事务,这样,如果之前的事务比较大的话,那么整个回滚的时间也将越长。同时,还有一种可能就是,当初的ACTIVE事务因为停止IIS导致了被强制终止,这样一来,该事务占有的回滚段资源没有释放出来。等到IIS重启之后,新连接上来的会话因为事务操作,需要分配新的UNDO表空间,结果导致了enq: US – contention等待事件。

        参照上述的两则案例,找出紧急解决办法,由于是RAC,这里交叉重启了2个节点,最后问题得到解决。         

解决一则enq: TX – row lock contention的性能故障

            上周二早上,收到项目组的一封邮件:

       早上联代以下时间点用户有反馈EDI导入“假死”,我们跟踪了EDI导入服务,服务是正常在跑,可能是处理的慢所以用户感觉是“假死”了,请帮忙从数据库中检查跟踪以下时间点是否有“异常”操作,多谢!

        2012-11-20 9:10:10~~~~9:55:13,这个时间点内一共反馈了3次,大概是10~20分钟“假死”一次,请帮忙跟踪检查,多谢!

        这是一套Windows RAC的环境,也是之前处理  解决一则row cache lock引起的性能故障 那套环境。下面记录一下处理的经过:
1 对这一个小时进行AWR的收集和分析,首先,从报告头中看到DB Time达到近500分钟,(DB Time)/Elapsed=8,这个比值偏高:
  Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 15142 20-11月-12 09:00:05 62 5.8
End Snap: 15143 20-11月-12 10:00:56 74 8.3
Elapsed:   60.85 (mins)    
DB Time:   492.88 (mins)    
 
2 再看TOP 5事件:
看到排在第一位的是enq: TX – row lock contention事件,也就是说系统中在这一个小时里产生了较为严重的行级锁等待事件。

Top 5 Timed Events

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
CPU time   21,215   71.7  
enq: TX – row lock contention 12,232 6,013 492 20.3 Application
gc cr multi block request 14,696,067 1,675 0 5.7 Cluster
gc buffer busy 441,472 719 2 2.4 Cluster
db file sequential read 4,191 25 6 .1 User I/O
 
 
通常,产生enq: TX – row lock contention事件的原因有以下几种可能:
  • 不同的session更新或删除同一条记录;
  • 唯一索引有重复索引;
  • 位图索引同时被更新或同时并发的向位图索引字段上插入相同字段值;
  • 并发的对同一个数据块上的数据进行update操作;
  • 等待索引块完成分裂

同时,从段的统计信息章节中,也看到下面的信息:
Segments by Row Lock Waits
  • % of Capture shows % of row lock waits for each top segment compared
  • with total row lock waits for all segments captured by the Snapshot
Owner Tablespace Name Object Name Subobject Name Obj. Type Row Lock Waits % of Capture
SUNISCO SUNISCO_DATA1 BIND_PROCESS_LOG_REFNO   INDEX 159 67.66
SUNISCO FDN_EDI_I01 IDX_EDI_WORK_QUEUE_1   INDEX 29 12.34
SUNISCO SUNISCO_DATA1 IND_EDI_CUSTOMER_TYPE_CODE   INDEX 15 6.38
SUNISCO SUNISCO_DATA1 IDX_EDI_MESSAGE_1   INDEX 14 5.96
SUNISCO FDN_BASE_T01 BSE_NUM_LIST   TABLE 6 2.55



看到row lock waits发生在一个索引上。

3 那么,究竟是什么操作导致了这个enq: TX – row lock contention等待事件呢? 查看系统中,当前有哪些会话产生了enq: TX – row lock contention等待事件

SQL> select event,sid,p1,p2,p3 from v$session_wait where event='enq: TX - row lock contention';
 
EVENT                                                                   SID         P1         P2         P3
---------------------------------------------------------------- ---------- ---------- ---------- ----------
enq: TX - row lock contention                                           224 1415053316    1441815     144197
enq: TX - row lock contention                                           238 1415053316    1441815     144197
enq: TX - row lock contention                                           247 1415053316    1441815     144197
enq: TX - row lock contention                                           248 1415053316    1441815     144197
enq: TX - row lock contention                                           253 1415053316    1441815     144197
SQL> 

看到SID为224,238,247,248,253的会话产生enq: TX – row lock contention等待事件。

4 查看系统中的当前会话,是在哪个对象上产生了产生了enq: TX – row lock contention等待事件

SQL> select ROW_WAIT_OBJ#,ROW_WAIT_FILE#,ROW_WAIT_BLOCK#,ROW_WAIT_ROW# from v$session where event='enq: TX - row lock contention';
 
ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
------------- -------------- --------------- -------------
       369195              0               0             0
       369195              0               0             0
       369195              0               0             0
       369195              0               0             0
       369195              0               0             0
       369195              0               0             0
 
6 rows selected
SQL>


5 那么这个数据库对象为369195的对象究竟是什么呢?

SQL> select object_name,object_id from dba_objects where object_id=369195;
 
OBJECT_NAME                          OBJECT_ID
----------------------------------- ----------
BIND_PROCESS_LOG_REFNO                  369195
SQL> select OWNER,OBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID, OBJECT_TYPE from dba_objects where object_name='BIND_PROCESS_LOG_REFNO';
 
OWNER                          OBJECT_NAME                    OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE
------------------------------ ----------------------------- ---------- -------------- -------------------
SUNISCO                        BIND_PROCESS_LOG_REFNO            369195         369195 INDEX
SQL>

可以看到,定位到的结果同上述AWR报告中段统计信息吻合,是SUNISCO这个用户下的一个索引。

6 接下来,继续看看SID为224,238,247,248,253的会话到底在执行哪些操作导致enq: TX – row lock contention等待事件

SQL> select sid,sql_text from v$session a,v$sql b where sid in(224,238,247,248,253) and (b.sql_id=a.sql_id or b.sql_id=a.prev_sql_id);
 
       SID SQL_TEXT
---------- --------------------------------------------------------------------------------
       224 select count(1)  from EDI_MESSAGE_PROCESS_LOG where   (LOG_ID =  :P_0_0  )
       224 INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE, SERVICE_STATUS, INFO_C
       238 select count(1)  from EDI_MESSAGE_PROCESS_LOG where   (LOG_ID =  :P_0_0  )
       238 INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE, SERVICE_STATUS, INFO_C
       247 INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE, SERVICE_STATUS, REFNO,
       247 INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE, SERVICE_STATUS, REFNO,
       248 INSERT INTO EDI_MESSAGE_PROCESS_LOG (LOG_ID, SERVICE_TYPE, SERVICE_STATUS, REFNO
       248 INSERT INTO EDI_MESSAGE_PROCESS_LOG (LOG_ID, SERVICE_TYPE, SERVICE_STATUS, REFNO
       248 SELECT SEQ_NEWID.NEXTVAL FROM DUAL
       253 SELECT SEQ_NEWID.NEXTVAL FROM DUAL
       253 INSERT INTO EDI_MESSAGE_PROCESS_LOG (LOG_ID, SERVICE_TYPE, SERVICE_STATUS, REFNO
 
11 rows selected
 
SQL>

看到有SQL_ID不同的SQL在同时向EDI_MESSAGE_PROCESS_LOG这张表执行INSERT操作

7 接下去看看EDI_MESSAGE_PROCESS_LOG这张表和索引BIND_PROCESS_LOG_REFNO之间有没有什么关系?

SQL> select index_name,table_name,index_type from user_indexes where table_name='EDI_MESSAGE_PROCESS_LOG';

INDEX_NAME                     TABLE_NAME                     INDEX_TYPE
------------------------------ ------------------------------ ----------
PK_EDI_MESSAGE_PROCESS_LOG     EDI_MESSAGE_PROCESS_LOG        NORMAL
ID_EDI_LOG_INPUT_DATE          EDI_MESSAGE_PROCESS_LOG        NORMAL
BIND_PROCESS_LOG_REFNO         EDI_MESSAGE_PROCESS_LOG        BITMAP

SQL> select index_name,table_name,column_name from user_ind_columns where table_name='EDI_MESSAGE_PROCESS_LOG';

INDEX_NAME                     TABLE_NAME                     COLUMN_NAM
------------------------------ ------------------------------ ----------
PK_EDI_MESSAGE_PROCESS_LOG     EDI_MESSAGE_PROCESS_LOG        LOG_ID
ID_EDI_LOG_INPUT_DATE          EDI_MESSAGE_PROCESS_LOG        INPUT_DATE
BIND_PROCESS_LOG_REFNO         EDI_MESSAGE_PROCESS_LOG        REFNO

SQL> select object_name,object_id,object_type,created from user_objects where object_name='BIND_PROCESS_LOG_REFNO';

OBJECT_NAME                     OBJECT_ID OBJECT_TYPE     CREATED
------------------------------ ---------- --------------- -------------------
BIND_PROCESS_LOG_REFNO             369195 INDEX           2012/11/05 10:18:28

SQL> select index_name,index_type from user_indexes where index_name='BIND_PROCESS_LOG_REFNO';

INDEX_NAME                      INDEX_TYPE
------------------------------- -----------
BIND_PROCESS_LOG_REFNO          BITMAP

SQL>

发现,这个索引BIND_PROCESS_LOG_REFNO是位于EDI_MESSAGE_PROCESS_LOG这张表的REFNO字段上的一个位图索引,而且是2012/11/05 10:18:28创建的,也就是说是近期才创建的1个位图索引

问题定位到这一步基本比较清晰了,产生enq: TX – row lock contention事件的原因就是上述的第2个可能原因:位图索引同时被更新或同时并发的向位图索引字段上插入相同字段值

8 那么,解决的办法也比较简单了,就是干掉这个位图索引,因为这个位图索引在这种应用场景下确实不太适合。事后,经过同客户方沟通确认,该索引是他们的一个DBA当初看到系统比较慢,而加上去的一个位图索引。

9 补充,从当时的ADDM报告中,也可以看到数据库给我们的建议:

FINDING 4: 20% impact (6013 seconds)
------------------------------------
发现 SQL 语句正处于行锁定等待。

   RECOMMENDATION 1: Application Analysis, 17% benefit (5131 seconds)
      ACTION: 在 INDEX "SUNISCO.BIND_PROCESS_LOG_REFNO" (对象 ID 为 369195)
         中检测到了严重的行争用。使用指定的阻塞 SQL 语句在应用程序逻辑中跟踪行争
用的起因。
         RELEVANT OBJECT: database object with id 369195
      RATIONALE: SQL_ID 为 "dr4uxu769tmmb" 的 SQL 语句在行锁上被阻塞。
         RELEVANT OBJECT: SQL statement with SQL_ID dr4uxu769tmmb
         INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE,
         SERVICE_STATUS, LOG_DATETIME, REFNO, REF_TYPE, MSG_ID, BL_NO, BL_ID,
         VOYAGE_ID, VESSEL_NAME, IMO_NO, VOYAGE_NO, FUNCTION_TYPE, INPUT_DATE,
         IN_STATUS, SYSTEM_TYPE, ERROR_LOG, FILE_NAME) VALUES ( :B1 , :B2 ,
         :B3 , :B4 , :B5 , :B6 , :B7 , :B8 , :B9 , :B10 , :B11 , :B12 , :B13 ,
         :B14 , :B15 , :B16 , :B17 , :B18 , :B19 )
      RATIONALE: SQL_ID 为 "dxsbgubsb6r4n" 的 SQL 语句在行锁上被阻塞。
         RELEVANT OBJECT: SQL statement with SQL_ID dxsbgubsb6r4n
         INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE,
         SERVICE_STATUS, INFO_CODE, INFORMATION, INFO_LEVEL, LOG_DATETIME,
         REFNO, REF_TYPE, MSG_ID, BL_NO, VOYAGE_ID, VESSEL_NAME, IMO_NO,
         VOYAGE_NO, FUNCTION_TYPE, INPUT_DATE, IN_STATUS, SYSTEM_TYPE,
         ERROR_LOG, FILE_NAME) VALUES ( :B1 , :B2 , :B3 , :B4 , :B5 , :B6 ,
         :B7 , :B8 , :B9 , :B10 , :B11 , :B12 , :B13 , :B14 , :B15 , :B16 ,
         :B17 , :B18 , :B19 , :B20 , :B21 )
      RATIONALE: SQL_ID 为 "b38qhyzvn5bdd" 的 SQL 语句在行锁上被阻塞。
         RELEVANT OBJECT: SQL statement with SQL_ID b38qhyzvn5bdd
         INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE,
         SERVICE_STATUS, LOG_DATETIME, REFNO, REF_TYPE, MSG_ID, BL_NO,
         VOYAGE_ID, VESSEL_NAME, IMO_NO, VOYAGE_NO, FUNCTION_TYPE, INPUT_DATE,
         IN_STATUS, SYSTEM_TYPE, ERROR_LOG, FILE_NAME) VALUES ( :B1 , :B2 ,
         :B3 , :B4 , :B5 , :B6 , :B7 , :B8 , :B9 , :B10 , :B11 , :B12 , :B13 ,
         :B14 , :B15 , :B16 , :B17 , :B18 )
      RATIONALE: SQL_ID 为 "36k2xpx3c6wr5" 的 SQL 语句在行锁上被阻塞。
         RELEVANT OBJECT: SQL statement with SQL_ID 36k2xpx3c6wr5
         INSERT INTO EDI_MESSAGE_PROCESS_LOG(LOG_ID, SERVICE_TYPE,
         SERVICE_STATUS, REFNO, REF_TYPE, MSG_ID, BL_NO, VOYAGE_ID,
         VESSEL_NAME, IMO_NO, VOYAGE_NO, FUNCTION_TYPE, INPUT_DATE, IN_STATUS,
         SYSTEM_TYPE, ERROR_LOG, FILE_NAME) VALUES ( :B1 , :B2 , :B3 , :B4 ,
         :B5 , :B6 , :B7 , :B8 , :B9 , :B10 , :B11 , :B12 , :B13 , :B14 , :B15
         , :B16 , :B17 )
      RATIONALE: 具有 ID "268", 用户 ID "31", 程序 "FC.EdiService.Import.exe" 和
模块
         "FC.EdiService.Import.exe" 的会话是构成此建议案中的优化建议的 51% 的阻
塞会话。
      RATIONALE: 具有 ID "307", 用户 ID "31", 程序 "FC.EdiService.Import.exe" 和
模块
         "FC.EdiService.Import.exe" 的会话是构成此建议案中的优化建议的 11% 的阻
塞会话。
      RATIONALE: 具有 ID "227", 用户 ID "31", 程序 "FC.EdiService.Import.exe" 和
模块
         "FC.EdiService.Import.exe" 的会话是构成此建议案中的优化建议的 11% 的阻
塞会话。
      RATIONALE: 具有 ID "273", 用户 ID "31", 程序 "FC.EdiService.Import.exe" 和
模块
         "FC.EdiService.Import.exe" 的会话是构成此建议案中的优化建议的 9% 的阻塞
会话。


10 最后,从本案例中,可以看到在日常的数据库维护中,添加或修改一些对象信息时,务必要经过严格的测试,尤其是在生产系统上做调整更应如此。同样,可以看出,数据库的一些理论基础知识对于DBA还是蛮重要的。

 

解决VARCHAR2和NVARCHAR2隐式数据类型转换导致的性能问题

           前段时间,接到公司一项目组的数据库优化需求:一个计算费用的存储过程执行起来非常之慢,需要优化。

           拿到存储过程之后,快速看了下代码,然后通过PL/SQL Developer去查看SQL代码的执行计划。其中,看到下面的一条简单的SQL语句的执行计划有点儿问题,导致主表CA_SE_MANIFEST走的全表扫描,而该表的数据量很大,而TMP_COM_ID2又是一张临时表。可是主表CA_SE_MANIFEST的BL_NO_ID字段是主键字段,有唯一索引,却没有走索引,这样的话,执行效率应该不会好到哪儿去!

           SQL语句:

DELETE FROM TMP_COM_ID2 WHERE EXISTS(SELECT 1 FROM CA_SE_MANIFEST WHERE BL_NO_ID = C_ID AND DOCUMENT_TYPE IN ('1','2'))

           执行计划:

           经过分析,发现表TMP_COM_ID2的结构信息如下,只有1个字段,其中C_ID字段的数据类型是NVARCHAR2

SQL> desc tmp_com_id2
Name Type          Nullable Default Comments 
---- ------------- -------- ------- -------- 
C_ID NVARCHAR2(20)                           
 
SQL> 

           而主表CA_SE_MANIFEST的BL_NO_ID字段数据类型却是VARCHAR2

SQL> desc ca_se_manifest
Name                      Type         Nullable Default Comments                                                                                                                                                                                                           
------------------ ------------------- -------- ------- -------------------------------- 
BL_NO_ID           VARCHAR2(20 BYTE)                    出口提单序号                                                                                                                                                                                                       
BL_NO              VARCHAR2(40 BYTE)                    同一船名、航次下、分公司的提单号不重复                                                                                                                                             
CARRIER_BL_NO      VARCHAR2(20 BYTE)   Y                船公司提单号 
...
...

           基本上定位到了原因,是由于VARCHAR2和NVARCHAR2数据类型不一致导致的隐式数据类型转换,进而导致主表CA_SE_MANIFEST的主键字段BL_NO_ID上的索引失效,最终产生了全表扫描

           经过分析,可以将临时表TMP_COM_ID2的C_ID字段数据类型改成VARCHAR2,而主表CA_SE_MANIFEST的定义信息显然不可轻易修改。修改完临时表的字段数据类型之后,该条SQL的执行计划已经不再是CA_SE_MANIFEST的全表扫描了:

          最终,这个小问题得以解决。

          补充:① VARCHAR2 可以认为是VARCHAR的变种,它是一个变长的字符串数据类型;

                    ②NVARCHAR2 是包含UNICODE编码格式数据的变长字符串。