优化GreenplumDB查询

时间:2017-01-06 15:50:29

标签: query-optimization analytics greenplum bigdata

我在Greenplum数据库中运行一个包含4个段服务器的查询,每个服务器64GB,并且具有很强的处理能力。当我执行查询时,主表¨pic_isup_new¨有500个MM行,响应时间非常慢(100000行为300秒)。有没有办法优化查询或我应该寻找Greenplum配置?。

SELECT pic_isup_new.begin_time::date, date_part('hour', pic_isup_new.begin_time) AS begin_hour, pic_isup_new.cause_value, pic_isup_new.conversation, pic_isup_new.a_number, pic_isup_new.b_number, causas_liberacion.id_grupo, "substring"(pic_isup_new.b_number, '......$'::text) AS serie, 
    CASE
        WHEN length(pic_isup_new.b_number) = 7 THEN "substring"(pic_isup_new.b_number, 1, 1)
        WHEN length(pic_isup_new.b_number) = 8 THEN "substring"(pic_isup_new.b_number, 2, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND "substring"(pic_isup_new.b_number, 1, 1) = '0'::text THEN "substring"(pic_isup_new.b_number, 4, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND ("substring"(pic_isup_new.b_number, 1, 3) = '906'::text OR "substring"(pic_isup_new.b_number, 1, 3) = '903'::text) AND length(pic_isup_new.a_number) = 8 THEN "substring"(pic_isup_new.b_number, 4, 1)
        ELSE ''::text
    END AS prefijo, 
    CASE
        WHEN length(pic_isup_new.b_number) = 7 THEN centrales.codigo_area
        WHEN length(pic_isup_new.b_number) = 8 THEN "substring"(pic_isup_new.b_number, 1, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND "substring"(pic_isup_new.b_number, 1, 1) = '0'::text THEN "substring"(pic_isup_new.b_number, 3, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND ("substring"(pic_isup_new.b_number, 1, 3) = '906'::text OR "substring"(pic_isup_new.b_number, 1, 3) = '903'::text) AND length(pic_isup_new.a_number) = 8 THEN centrales.codigo_area
        ELSE ''::text
    END AS codigo_area, series.serie AS serie_id, departamento.departamento FROM ndm_comovi.pic_isup_new JOIN ndm_comovi.causas_liberacion ON causas_liberacion.id::text = pic_isup_new.cause_value JOIN ndm_comovi.centrales centrales ON pic_isup_new.dpc_decimal = centrales.pointcode_decimal JOIN ndm_comovi.series ON ((CASE
        WHEN length(pic_isup_new.b_number) = 7 THEN centrales.codigo_area
        WHEN length(pic_isup_new.b_number) = 8 THEN "substring"(pic_isup_new.b_number, 1, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND "substring"(pic_isup_new.b_number, 1, 1) = '0'::text THEN "substring"(pic_isup_new.b_number, 3, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND ("substring"(pic_isup_new.b_number, 1, 3) = '906'::text OR "substring"(pic_isup_new.b_number, 1, 3) = '903'::text) AND length(pic_isup_new.a_number) = 8 THEN centrales.codigo_area
        ELSE ''::text
    END) = series.codigo_area AND (CASE
        WHEN length(pic_isup_new.b_number) = 7 THEN "substring"(pic_isup_new.b_number, 1, 1)
        WHEN length(pic_isup_new.b_number) = 8 THEN "substring"(pic_isup_new.b_number, 2, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND "substring"(pic_isup_new.b_number, 1, 1) = '0'::text THEN "substring"(pic_isup_new.b_number, 4, 1)
        WHEN length(pic_isup_new.b_number) = 10 AND ("substring"(pic_isup_new.b_number, 1, 3) = '906'::text OR "substring"(pic_isup_new.b_number, 1, 3) = '903'::text) AND length(pic_isup_new.a_number) = 8 THEN "substring"(pic_isup_new.b_number, 4, 1)
        ELSE ''::text
    END) = series.prefijo AND "substring"(pic_isup_new.b_number, '......$'::text) >= series.serie_inicial AND "substring"(pic_isup_new.b_number, '......$'::text) <= series.serie_final)
JOIN ndm_comovi.departamento ON series.serie = departamento.serie

我在这里添加解释计划

QUERY PLAN  Gather Motion 8:1  (slice5; segments: 8)  (cost=825.59..11841.02 rows=5 width=157)  ->  Hash Join  (cost=825.59..11841.02 rows=1 width=157) 
    Hash Cond: ndm_comovi.pic_isup_new.dpc_decimal = centrales.pointcode_decimal    
    Join Filter: CASE WHEN length(ndm_comovi.pic_isup_new.b_number) = 7 THEN centrales.codigo_area WHEN length(ndm_comovi.pic_isup_new.b_number) = 8 THEN "substring"(ndm_comovi.pic_isup_new.b_number, 1, 1) WHEN length(ndm_comovi.pic_isup_new.b_number) = 10 AND "substring"(ndm_comovi.pic_isup_new.b_number, 1, 1) = '0'::text THEN "substring"(ndm_comovi.pic_isup_new.b_number, 3, 1) WHEN length(ndm_comovi.pic_isup_new.b_number) = 10 AND ("substring"(ndm_comovi.pic_isup_new.b_number, 1, 3) = '906'::text OR "substring"(ndm_comovi.pic_isup_new.b_number, 1, 3) = '903'::text) AND length(ndm_comovi.pic_isup_new.a_number) = 8 THEN centrales.codigo_area ELSE ''::text END = series.codigo_area    
    ->  Redistribute Motion 8:8  (slice4; segments: 8)  (cost=601.34..11615.38 rows=2 width=161)    
          Hash Key: ndm_comovi.pic_isup_new.dpc_decimal 
          ->  Hash Join  (cost=601.34..11615.06 rows=2 width=161)   
                Hash Cond: CASE WHEN length(ndm_comovi.pic_isup_new.b_number) = 7 THEN "substring"(ndm_comovi.pic_isup_new.b_number, 1, 1) WHEN length(ndm_comovi.pic_isup_new.b_number) = 8 THEN "substring"(ndm_comovi.pic_isup_new.b_number, 2, 1) WHEN length(ndm_comovi.pic_isup_new.b_number) = 10 AND "substring"(ndm_comovi.pic_isup_new.b_number, 1, 1) = '0'::text THEN "substring"(ndm_comovi.pic_isup_new.b_number, 4, 1) WHEN length(ndm_comovi.pic_isup_new.b_number) = 10 AND ("substring"(ndm_comovi.pic_isup_new.b_number, 1, 3) = '906'::text OR "substring"(ndm_comovi.pic_isup_new.b_number, 1, 3) = '903'::text) AND length(ndm_comovi.pic_isup_new.a_number) = 8 THEN "substring"(ndm_comovi.pic_isup_new.b_number, 4, 1) ELSE ''::text END = series.prefijo  
                Join Filter: "substring"(ndm_comovi.pic_isup_new.b_number, '......$'::text) >= series.serie_inicial AND "substring"(ndm_comovi.pic_isup_new.b_number, '......$'::text) <= series.serie_final    
                ->  Hash Join  (cost=14.20..10945.25 rows=558 width=144)    
                      Hash Cond: ndm_comovi.pic_isup_new.cause_value = causas_liberacion.id::text   
                      ->  Append  (cost=0.00..10348.36 rows=17985 width=140)    
                            ->  Seq Scan on pic_isup_new_1_prt_1_2_prt_day1 pic_isup_new  (cost=0.00..0.00 rows=1 width=140)    
                            ->  Seq Scan on pic_isup_new_1_prt_1_2_prt_day2 pic_isup_new  (cost=0.00..0.00 rows=1 width=140)    
                            ->  Seq Scan on pic_isup_new_1_prt_1_2_prt_day3 pic_isup_new  (cost=0.00..0.00 rows=1 width=140)    
                            ->  Seq Scan on pic_isup_new_1_prt_1_2_prt_day4 pic_isup_new  (cost=0.00..0.00 rows=1 width=140)    
                            ->  Seq Scan on pic_isup_new_1_prt_1_2_prt_day5 pic_isup_new  (cost=0.00..0.00 rows=1 width=140)    
…
                      ->  Hash  (cost=11.10..11.10 rows=31 width=8) 
                            ->  Broadcast Motion 8:8  (slice1; segments: 8)  (cost=0.00..11.10 rows=31 width=8) 
                                  ->  Seq Scan on causas_liberacion  (cost=0.00..8.31 rows=4 width=8)   
                ->  Hash  (cost=583.94..583.94 rows=32 width=31)    
                      ->  Broadcast Motion 8:8  (slice3; segments: 8)  (cost=9.36..583.94 rows=32 width=31) 
                            ->  Hash Join  (cost=9.36..581.06 rows=4 width=31)  
                                  Hash Cond: series.serie = departamento.serie  
                                  ->  Seq Scan on series  (cost=0.00..488.44 rows=4143 width=22)    
                                  ->  Hash  (cost=8.96..8.96 rows=4 width=41)   
                                        ->  Redistribute Motion 8:8  (slice2; segments: 8)  (cost=0.00..8.96 rows=4 width=41)   
                                              Hash Key: departamento.serie  
                                              ->  Seq Scan on departamento  (cost=0.00..8.32 rows=4 width=41)   
    ->  Hash  (cost=121.89..121.89 rows=1024 width=6)   
          ->  Seq Scan on centrales  (cost=0.00..121.89 rows=1024 width=6)  

设置:effective_cache_size = 48GB; optimizer = on

2 个答案:

答案 0 :(得分:0)

您能加入解释计划吗?这是遗留规划器还是关键查询优化器?感谢。

V

答案 1 :(得分:0)

字符串函数可能会对某些查询造成拖累 - 但是我对你的陈述记忆更加好奇。 解释计划将显示您正在使用的规划器,以及分配和想要的内存量。

如果您使用的是默认的128mb statement_mem,那么您可能会在磁盘上做很多工作,这是数据库性能杀手。