您好,欢迎光临! 请 |

PostgreSQL统计信息不准导致错误的执行计划

作者: | 分类:PostgreSQL | Tag: | 评论:0 
字号:T|T

接项目组同事反馈,一条SQL,在beta环境足够快,生产环境就慢很多。对于这种问题,我通常都需要额外的向他们解释一会儿,比如数据量分布,统计信息,不同的执行计划等.... 总之,这种现象是一个完全正常的现象。

一 首先,那么,就直接从生产环境上看一下SQL的执行计划:

cattle=> explain analyze SELECT
cattle-> ci.*, fa.farm_name AS farmName,
cattle-> cs.cowshed_no AS cowshedNo
cattle-> FROM
cattle-> t_cattle_ca_cattleinfo AS ci
cattle-> LEFT JOIN t_cattle_crm_cust_info AS cci ON ci.owner_id = cci. ID
cattle-> LEFT JOIN t_cattle_bd_farm AS fa ON ci.farm_id = fa. ID
cattle-> LEFT JOIN t_cattle_bd_cowshed AS cs ON ci.dorm_id = cs. ID
cattle-> LEFT JOIN t_cattle_sys_company t1 ON ci.owner_id = t1. ID
cattle-> WHERE
cattle-> ci. FLAG = 1
cattle-> AND ci.corp_id = 'OAO_SYSTEMINTIAL_000000000000000'
cattle-> ORDER BY
cattle-> ci.create_time DESC
cattle-> LIMIT 10 OFFSET (1- 1);
                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=305987.09..305987.12 rows=10 width=766) (actual time=1527.412..1527.415 rows=10 loops=1)
   ->  Sort  (cost=305987.09..306338.21 rows=140447 width=766) (actual time=1527.411..1527.414 rows=10 loops=1)
         Sort Key: ci.create_time
         Sort Method: top-N heapsort  Memory: 35kB
         ->  Hash Left Join  (cost=82.84..302952.08 rows=140447 width=766) (actual time=1313.868..1504.013 rows=24673 loops=1)
               Hash Cond: ((ci.dorm_id)::text = (cs.id)::text)
               ->  Hash Left Join  (cost=3.81..301095.66 rows=140447 width=759) (actual time=1313.053..1492.219 rows=24673 loops=1)
                     Hash Cond: ((ci.farm_id)::text = (fa.id)::text)
                     ->  Seq Scan on t_cattle_ca_cattleinfo ci  (cost=0.00..299160.71 rows=140447 width=738) (actual time=1312.957..1474.099 rows=24673 loops=1)
                           Filter: (((flag)::text = '1'::text) AND ((corp_id)::text = 'OAO_SYSTEMINTIAL_000000000000000'::text))
                     ->  Hash  (cost=3.36..3.36 rows=36 width=54) (actual time=0.043..0.043 rows=36 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 4kB
                           ->  Seq Scan on t_cattle_bd_farm fa  (cost=0.00..3.36 rows=36 width=54) (actual time=0.018..0.027 rows=36 loops=1)
               ->  Hash  (cost=63.46..63.46 rows=1246 width=40) (actual time=0.796..0.796 rows=1246 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 89kB
                     ->  Seq Scan on t_cattle_bd_cowshed cs  (cost=0.00..63.46 rows=1246 width=40) (actual time=0.003..0.438 rows=1246 loops=1)
 Total runtime: 1527.576 ms
(17 rows)

cattle=>

从上,可以看到,优化器在此处对于t_cattle_ca_cattleinfo表选择的是全表扫描。且耗时大概1474毫秒左右,而整个SQL的总耗时为1527.576 ms,说明绝大部分的时间开销都花费在对于t_cattle_ca_cattleinfo表的访问上。

二  接下来,分析t_cattle_ca_cattleinfo表的数据分布信息:

cattle=> select count(*) from t_cattle_ca_cattleinfo;                                                   
 count 
-------
 24673
(1 row)

cattle=> select count(*) from t_cattle_ca_cattleinfo where corp_id = 'OAO_SYSTEMINTIAL_000000000000000';
 count 
-------
 24673
(1 row)

cattle=> select count(*) from t_cattle_ca_cattleinfo where corp_id = 'OAO_SYSTEMINTIAL_000000000000000' and FLAG = 1;
 count 
-------
 24673
(1 row)

cattle=>

结合上述,看到t_cattle_ca_cattleinfo表总计24673条记录,而且满足corp_id = 'OAO_SYSTEMINTIAL_000000000000000' and FLAG = 1条件的数据,就是全表的数据总量。

但是,结合上述执行计划来看,优化器认为该表有rows=140447,可是手工查询到该表当前只有24673条记录。说明,统计信息出了问题。

三 然后,手工在线收集表统计信息:

cattle=> vacuum FREEZE ANALYZE verbose t_cattle_ca_cattleinfo;
INFO:  vacuuming "cattle.t_cattle_ca_cattleinfo"
INFO:  scanned index "t_cattle_ca_cattleinfo_pkey" to remove 4763 row versions
DETAIL:  CPU 0.08s/0.12u sec elapsed 0.24 sec.
INFO:  scanned index "cattleCode_index" to remove 4763 row versions
DETAIL:  CPU 0.02s/0.04u sec elapsed 0.07 sec.
INFO:  scanned index "corpId_index" to remove 4763 row versions
DETAIL:  CPU 0.08s/0.13u sec elapsed 0.22 sec.
INFO:  scanned index "farmId_index" to remove 4763 row versions
DETAIL:  CPU 0.07s/0.12u sec elapsed 0.20 sec.
INFO:  scanned index "feedId_index" to remove 4763 row versions
DETAIL:  CPU 0.09s/0.10u sec elapsed 0.21 sec.
INFO:  scanned index "manageCode_index" to remove 4763 row versions
DETAIL:  CPU 0.00s/0.03u sec elapsed 0.04 sec.
INFO:  scanned index "rfid_index" to remove 4763 row versions
DETAIL:  CPU 0.13s/0.12u sec elapsed 0.26 sec.
INFO:  "t_cattle_ca_cattleinfo": removed 4763 row versions in 1052 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "t_cattle_ca_cattleinfo_pkey" now contains 24673 row versions in 31871 pages
DETAIL:  4763 index row versions were removed.
14490 index pages have been deleted, 14490 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "cattleCode_index" now contains 24673 row versions in 17799 pages
DETAIL:  4763 index row versions were removed.
2795 index pages have been deleted, 2795 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "corpId_index" now contains 24673 row versions in 37233 pages
DETAIL:  4763 index row versions were removed.
36745 index pages have been deleted, 36742 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "farmId_index" now contains 24673 row versions in 37134 pages
DETAIL:  4763 index row versions were removed.
36654 index pages have been deleted, 36652 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "feedId_index" now contains 24673 row versions in 37155 pages
DETAIL:  4763 index row versions were removed.
36595 index pages have been deleted, 36595 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "manageCode_index" now contains 24673 row versions in 17764 pages
DETAIL:  4762 index row versions were removed.
2744 index pages have been deleted, 2744 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "rfid_index" now contains 24673 row versions in 34752 pages
DETAIL:  4763 index row versions were removed.
19034 index pages have been deleted, 19034 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "t_cattle_ca_cattleinfo": found 753 removable, 24673 nonremovable row versions in 297054 out of 297054 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 4399722 unused item pointers.
0 pages are entirely empty.
CPU 1.17s/1.69u sec elapsed 2.92 sec.
INFO:  vacuuming "pg_toast.pg_toast_1962065"
INFO:  index "pg_toast_1962065_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_1962065": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "cattle.t_cattle_ca_cattleinfo"
INFO:  "t_cattle_ca_cattleinfo": scanned 30000 of 297054 pages, containing 2630 live rows and 0 dead rows; 2630 rows in sample, 24811 estimated total rows
VACUUM

四 重新查看该SQL执行计划:

cattle=> explain analyze SELECT
cattle-> ci.*, fa.farm_name AS farmName,
cattle-> cs.cowshed_no AS cowshedNo
cattle-> FROM
cattle-> t_cattle_ca_cattleinfo AS ci
cattle-> LEFT JOIN t_cattle_crm_cust_info AS cci ON ci.owner_id = cci. ID
cattle-> LEFT JOIN t_cattle_bd_farm AS fa ON ci.farm_id = fa. ID
cattle-> LEFT JOIN t_cattle_bd_cowshed AS cs ON ci.dorm_id = cs. ID
cattle-> LEFT JOIN t_cattle_sys_company t1 ON ci.owner_id = t1. ID
cattle-> WHERE
cattle-> ci. FLAG = 1
cattle-> AND ci.corp_id = 'OAO_SYSTEMINTIAL_000000000000000'
cattle-> ORDER BY
cattle-> ci.create_time DESC
cattle-> LIMIT 10 OFFSET (1- 1);  
                                                                               QUERY PLAN                                                                               
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=225802.81..225802.84 rows=10 width=766) (actual time=79.316..79.319 rows=10 loops=1)
   ->  Sort  (cost=225802.81..225864.84 rows=24811 width=766) (actual time=79.314..79.315 rows=10 loops=1)
         Sort Key: ci.create_time
         Sort Method: top-N heapsort  Memory: 35kB
         ->  Hash Left Join  (cost=149207.67..225266.65 rows=24811 width=766) (actual time=9.327..54.520 rows=24673 loops=1)
               Hash Cond: ((ci.dorm_id)::text = (cs.id)::text)
               ->  Hash Left Join  (cost=149128.63..224878.17 rows=24811 width=759) (actual time=8.693..42.217 rows=24673 loops=1)
                     Hash Cond: ((ci.farm_id)::text = (fa.id)::text)
                     ->  Bitmap Heap Scan on t_cattle_ca_cattleinfo ci  (cost=149124.82..224533.21 rows=24811 width=738) (actual time=8.649..22.418 rows=24673 loops=1)
                           Recheck Cond: ((corp_id)::text = 'OAO_SYSTEMINTIAL_000000000000000'::text)
                           Filter: ((flag)::text = '1'::text)
                           ->  Bitmap Index Scan on "corpId_index"  (cost=0.00..149118.62 rows=24811 width=0) (actual time=8.182..8.182 rows=24673 loops=1)
                                 Index Cond: ((corp_id)::text = 'OAO_SYSTEMINTIAL_000000000000000'::text)
                     ->  Hash  (cost=3.36..3.36 rows=36 width=54) (actual time=0.025..0.025 rows=36 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 4kB
                           ->  Seq Scan on t_cattle_bd_farm fa  (cost=0.00..3.36 rows=36 width=54) (actual time=0.005..0.011 rows=36 loops=1)
               ->  Hash  (cost=63.46..63.46 rows=1246 width=40) (actual time=0.617..0.617 rows=1246 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 89kB
                     ->  Seq Scan on t_cattle_bd_cowshed cs  (cost=0.00..63.46 rows=1246 width=40) (actual time=0.001..0.251 rows=1246 loops=1)
 Total runtime: 79.503 ms
(20 rows)

cattle=>

对比,收集统计信息之前,SQL执行耗时1500毫秒,更新统计信息之后,耗时79毫秒。
五  小结:

一则典型的数据库因为表的统计信息不准,导致优化器选择错误的执行计划的一个例子。

顶一下
(0)
100%
踩一下
(0)
100%

原创内容,转载时请保留原文信息:
文章标题:PostgreSQL统计信息不准导致错误的执行计划
↑上一篇↑:
↓下一篇↓:

发表评论

插入图片

NOTICE1:请申请gravatar头像,没有头像的评论可能不会被回复|头像相关帮助!