PostgreSQL数据库SQL性能优化案例二则

零 背景与说明

数据库的性能会影响整个业务系统的性能,而数据库的性能问题,90%以上的几率都是由性能低下的SQL导致的。

一个即将上线的新系统,开发同事发过来2个SQL优化的请求。

PostgreSQL数据库SQL性能优化案例二则

一 SQL一优化分析处理思路

1 格式化原始SQL:

SELECT COUNT(b.code) AS totalCount, b.status AS status	
FROM t_ai_prd_bill b 
WHERE b.status != '900' AND b.is_deleted = FALSE 
GROUP BY b.status

2  原始SQL执行计划

ai=> explain analyze SELECT COUNT(b.code) AS totalCount, b.status AS status 
ai-> FROM t_ai_prd_bill b 
ai-> WHERE b.status != '900' AND b.is_deleted = FALSE 
ai-> GROUP BY b.status;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=3870.50..3870.51 rows=1 width=17) (actual time=12.660..12.661 rows=5 loops=1)
   Group Key: status
   ->  Seq Scan on t_ai_prd_bill b  (cost=0.00..3839.75 rows=6150 width=17) (actual time=0.013..11.139 rows=6153 loops=1)
         Filter: ((NOT is_deleted) AND (status <> '900'::bpchar))
         Rows Removed by Filter: 32227
 Planning time: 0.165 ms
 Execution time: 12.719 ms
(7 rows)

ai=>

3 获取t_ai_prd_bill表结构,数据量和数据分布

 
ai=> \d t_ai_prd_bill                                      
                         Table "ai.t_ai_prd_bill"
       Column       |              Type              |      Modifiers      
--------------------+--------------------------------+---------------------
 id                 | character varying(32)          | not null
 code               | character varying(12)          | 
 packet_code        | character varying(10)          | 
 status             | character(3)                   | 
 is_deleted         | boolean                        | default false
 classify_type      | character varying(1)           | 
 last_oper_time     | timestamp(6) without time zone | 
 template_no        | character varying(32)          | 
 oss_key            | character varying(255)         | 
 receive_time       | timestamp(6) without time zone | 
 is_question        | boolean                        | default false
 ocr_type           | character(1)                   | 
 cust_id            | character varying(32)          | 
 exception_category | character(1)                   | default '0'::bpchar
 exception_detail   | character varying(400)         | 
 is_suspended       | boolean                        | 
 is_urgent          | boolean                        | 
 raw_deadline       | timestamp(6) without time zone | 
 priority           | smallint                       | 
 deadline           | timestamp(6) without time zone | 
 company_id         | character varying(32)          | 
 company_name       | character varying(255)         | 
 cust_category      | character varying(32)          | 
 source             | character varying(32)          | 
 mode_code          | character varying(32)          | 
 deadline_category  | character varying(3)           | 
 location_id        | character varying(32)          | 
 cust_name          | character varying(128)         | 
 template_category  | character varying(32)          | 
 platform_id        | character varying(32)          | 
 country_name       | character varying(64)          | 
 is_sended          | boolean                        | default false
Indexes:
    "t_ai_prd_bill_pkey" PRIMARY KEY, btree (id)
    "idx_bill_code" btree (code)
    "idx_bill_create_time_status_deleted" btree (receive_time, status, is_deleted)
    "idx_bill_cust_id" btree (cust_id)
    "idx_bill_packet_code" btree (packet_code)
    "idx_bill_status" btree (status)

ai=> select status,count(*) from t_ai_prd_bill group by status;
 status | count 
--------+-------
 400    |  2511
 401    |   183
 500    |  3174
 600    |     1
 701    |   284
 900    | 32227
(6 rows)

ai=>

4 改写等价SQL

结合上述的t_ai_prd_bill表结构、index信息、以及数据量来分析其执行计划。

t_ai_prd_bill表中的status字段,表示业务状态。status=900表示已完成业务操作的票据,status=901表示理票失败的票据,且随着业务正常流转,该表中status=900的记录会越来越多,占据数据量的绝大多数,同时status=901的记录应该非常少。也就是说,这条SQL原本是想查询在业务上除理票已完成的所有其它状态票据信息,即(status != 900)。

这样的话,如果从业务端着手,考虑将status=901表示理票失败的票据,如改为status=899表示理票失败。则,SQL可以等价改造为:

SELECT COUNT(b.code) AS totalCount, b.status AS status	
FROM t_ai_prd_bill b 
WHERE b.status < '900' AND b.is_deleted = FALSE 
GROUP BY b.status

5 新SQL执行计划

ai=> explain analyze SELECT COUNT(b.code) AS totalCount, b.status AS status 
ai-> FROM t_ai_prd_bill b 
ai-> WHERE b.status < '900' AND b.is_deleted = FALSE ai-> GROUP BY b.status;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=3782.19..3782.20 rows=1 width=17) (actual time=5.423..5.424 rows=5 loops=1)
   Group Key: status
   ->  Bitmap Heap Scan on t_ai_prd_bill b  (cost=247.95..3751.44 rows=6150 width=17) (actual time=0.996..3.811 rows=6153 loops=1)
         Recheck Cond: (status < '900'::bpchar) Filter: (NOT is_deleted) Heap Blocks: exact=1213 ->  Bitmap Index Scan on idx_bill_status  (cost=0.00..246.41 rows=6150 width=0) (actual time=0.825..0.825 rows=6153 loops=1)
               Index Cond: (status < '900'::bpchar) 
Planning time: 0.156 ms 
Execution time: 5.480 ms 
(10 rows) 
ai=>

 

6 小结

通过改写SQL前后的执行计划对比,执行时间从12ms降低到5ms。且,随着业务的推进,t_ai_prd_bill表中数据量越来越多,优化带来的效果也会越来越好。

这里,由于业务上将票据状态定义为’100,200,300,400,401,500,600,402,700,701,800,900,901’,每个代码表示每个不同的业务中间状态。该业务需求是想查询统计业务状态不成功的所有票据,程序员自然想到的是通过status !=’900’来过滤数据。却忽视了数据库index里,优化器对于不等于 !=的where条件,走不到index。

 

二 SQL二优化分析处理思路

1 原始SQL

select a.status, sum(case when b.status is null then 0 else 1 end),
        sum(case when b.cust_category = 'DZ001' then 1 else 0 end) as "zyd"
        , sum(case when b.cust_category = 'ZEJ001' then 1 else 0 end) as "zyj"
        , sum(case when b.cust_category = 'MY001' then 1 else 0 end) as "my"
        from (select regexp_split_to_table('100,200,300,400,401,500,600,402,700,701,800,901', ',') as status) a
        left join t_ai_prd_bill b on (b.status = a.status)
        WHERE b.is_deleted = FALSE
        group by a.status
        order by a.status;

2 原始SQL执行计划

ai=> explain analyze select a.status, sum(case when b.status is null then 0 else 1 end),
ai->         sum(case when b.cust_category = 'DZ001' then 1 else 0 end) as "zyd"
ai->         , sum(case when b.cust_category = 'ZEJ001' then 1 else 0 end) as "zyj"
ai->         , sum(case when b.cust_category = 'MY001' then 1 else 0 end) as "my"
ai->         from (select regexp_split_to_table('100,200,300,400,401,500,600,402,700,701,800,901', ',') as status) a
ai->         left join t_ai_prd_bill b on (b.status = a.status)
ai->         WHERE b.is_deleted = FALSE
ai->         group by a.status
ai->         order by a.status;
                                                                 QUERY PLAN                                                                 
--------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=6328.61..15390.61 rows=200 width=42) (actual time=70.104..75.647 rows=5 loops=1)
   Group Key: (regexp_split_to_table('100,200,300,400,401,500,600,402,700,701,800,901'::text, ','::text))
   ->  Merge Join  (cost=6328.61..10560.61 rows=241400 width=42) (actual time=43.422..57.495 rows=35869 loops=1)
         Merge Cond: ((regexp_split_to_table('100,200,300,400,401,500,600,402,700,701,800,901'::text, ','::text)) = ((b.status)::text))
         ->  Sort  (cost=64.84..67.34 rows=1000 width=32) (actual time=0.117..0.119 rows=12 loops=1)
               Sort Key: (regexp_split_to_table('100,200,300,400,401,500,600,402,700,701,800,901'::text, ','::text))
               Sort Method: quicksort  Memory: 25kB
               ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.094..0.102 rows=12 loops=1)
         ->  Sort  (cost=6263.78..6384.48 rows=48280 width=10) (actual time=43.293..46.852 rows=48280 loops=1)
               Sort Key: ((b.status)::text)
               Sort Method: quicksort  Memory: 3629kB
               ->  Seq Scan on t_ai_prd_bill b  (cost=0.00..2507.80 rows=48280 width=10) (actual time=0.012..30.322 rows=48280 loops=1)
                     Filter: (NOT is_deleted)
 Planning time: 0.367 ms
 Execution time: 75.737 ms
(15 rows)

ai=>

3 分析SQL及其执行计划

原始SQL是想查询统计t_ai_prd_bill表,按照status字段来分组排序。这里,绕了一个弯路,通过引入函数regexp_split_to_table来构造一个单列的虚拟表a,再使其与t_ai_prd_bill表做一次left join,没有必要。同时,从执行计划里看到有个较大的内存排序,Sort Method: quicksort Memory: 3629kB。

4 尝试改写等价SQL

select  b.status,sum(case when b.status is null then 0 else 1 end),
        sum(case when b.cust_category = 'DZ001' then 1 else 0 end) as "zyd"
        , sum(case when b.cust_category = 'ZEJ001' then 1 else 0 end) as "zyj"
        , sum(case when b.cust_category = 'MY001' then 1 else 0 end) as "my"
        from t_ai_prd_bill b 
        WHERE b.is_deleted = FALSE
        group by b.status
        order by b.status;

5 改写后的SQL执行计划

ai=> explain analyze select  b.status,sum(case when b.status is null then 0 else 1 end),
ai->         sum(case when b.cust_category = 'DZ001' then 1 else 0 end) as "zyd"
ai->         , sum(case when b.cust_category = 'ZEJ001' then 1 else 0 end) as "zyj"
ai->         , sum(case when b.cust_category = 'MY001' then 1 else 0 end) as "my"
ai->         from t_ai_prd_bill b 
ai->         WHERE b.is_deleted = FALSE
ai->         group by b.status
ai->         order by b.status;
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3473.54..3473.55 rows=6 width=10) (actual time=49.986..49.987 rows=6 loops=1)
   Sort Key: status
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate  (cost=3473.40..3473.46 rows=6 width=10) (actual time=49.932..49.934 rows=6 loops=1)
         Group Key: status
         ->  Seq Scan on t_ai_prd_bill b  (cost=0.00..2507.80 rows=48280 width=10) (actual time=0.008..11.934 rows=48280 loops=1)
               Filter: (NOT is_deleted)
 Planning time: 0.109 ms
 Execution time: 50.060 ms
(9 rows)

ai=>

6 小结

通过分析,改写SQL,前后对比执行计划,发现减少了不必要的内存排序操作,进而让数据库执行SQL响应更快,提升效率。

 

配置scp,ssh,rsync到另外一台主机无需密码的另外一种方式

配置scp,ssh,rsync到另外一台主机无需密码的另外一种方式

一 场景需求:

一台AIX主机,一台Linux主机。
需用从AIX主机上通过oracle用户SCP传输文件到Linux主机,且要求SCP传输文件的过程中,要脚本自动化处理,无需人工干预,即免去手动键入密码的环节。

可是AIX上没有ssh-copy-id的可用命令



# ssh-copy-id   
ksh: ssh-copy-id:  not found
# uname -M
IBM,8202-E4B
# uname -n
usp720
# uname -a
AIX usp720 1 6 00F67A854C00
# oslevel 
6.1.0.0
# 

二 解决方法:

1 AIX上,Oracle用户执行,ssh-keygen -t rsa:

ssh-keygen -t rsa
执行过程中,一路Enter回车即可。

2 AIX上,Oracle用户执行,cat ~/.ssh/id_rsa.pub文件内容:

3 Linux上,Oracle用户执行,把步骤2中的文件内容,写入~/.ssh/authorized_keys:

同时,修改~/.ssh/authorized_keys的文件权限为700。

 
chmod 700~/.ssh/authorized_keys

4 AIX上,执行:

 
bash-4.3$ hostname
usp720
bash-4.3$ ssh oracle@172.18.1.12
Last login: Wed Mar 14 16:41:42 2018 from 172.18.1.1
localhost-> hostname
localhost.localdomain
localhost-> 

 

三 小记:

早在2008年的时候,在学习Oracle 10gR2 RAC在CentOS 4.8上安装部署时,配置RAC双节点,Oracle用户的互信,就采用的这种方法。10年前学的知识点被唤醒了。