Multiple left joins with aggregation on the same table cause a huge performance hit in SAP HANA

Time: 2017-04-06 06:31:45

Tags: sap hana

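I am joining two tables on HANA. To get some statistics, I LEFT JOIN the items table three times: once for the total item count, once for the number of processed items, and once for the number of items in error, as shown below.

This is a development system, and the items table holds only 1,500 items. Even so, the query below takes 17 seconds to run.

When I remove any one of the three aggregate terms (but keep the corresponding JOIN), the query executes almost instantly.

I also tried adding indexes on the fields used in each JOIN, but that made no difference.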

select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by, 
count( distinct rp.guid ), 
count( distinct rp2.guid ), 
count( distinct rp3.guid )
    from zbsbpi_rk as rk
    left join zbsbpi_rp as rp
      on rp.header = rk.guid
    left join zbsbpi_rp as rp2
      on rp2.header = rk.guid
     and rp2.processed = 'X'
    left join zbsbpi_rp as rp3
      on rp3.header = rk.guid
     and rp3.result_status = 'E'
    where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by

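2 Answers:

Answer 0: (score: 0)

I think you can rewrite the query to improve performance by joining the items table only once and moving the filters into the aggregates: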

select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by, 
count( distinct rp.guid ), 
count( distinct (CASE WHEN rp.processed = 'X' then rp.guid else null end) ), 
count( distinct (CASE WHEN rp.result_status = 'E' then rp.guid else null end))
    from zbsbpi_rk as rk
    left join zbsbpi_rp as rp
      on rp.header = rk.guid
    where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by

I am not entirely sure whether a CASE expression inside COUNT(DISTINCT ...) works on HANA, but you could try it.
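One way to check that the single-join rewrite returns the same numbers as the three-join original is to run both against a tiny data set. The sketch below does that with SQLite through Python's `sqlite3` module, which is only a stand-in: it says nothing about whether HANA accepts the same syntax or how it plans the query. The table and column names come from the question, the sample rows are made up, and the tables are reduced to the columns the joins need.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table zbsbpi_rk (guid text primary key, run_id text);
    create table zbsbpi_rp (guid text primary key, header text,
                            processed text, result_status text);
    insert into zbsbpi_rk values ('H1', '0000000010'), ('H2', '0000000010');
    -- Hypothetical data: H1 has four items (three processed, one in error),
    -- H2 has no items at all.
    insert into zbsbpi_rp values
        ('I1', 'H1', 'X', 'S'),
        ('I2', 'H1', 'X', 'E'),
        ('I3', 'H1', 'X', 'S'),
        ('I4', 'H1', '',  '');
""")

# Shape of the query from the question: one join per statistic.
three_joins = """
    select rk.guid,
           count(distinct rp.guid),
           count(distinct rp2.guid),
           count(distinct rp3.guid)
      from zbsbpi_rk as rk
      left join zbsbpi_rp as rp  on rp.header = rk.guid
      left join zbsbpi_rp as rp2 on rp2.header = rk.guid and rp2.processed = 'X'
      left join zbsbpi_rp as rp3 on rp3.header = rk.guid and rp3.result_status = 'E'
     where rk.run_id = '0000000010'
     group by rk.guid
     order by rk.guid
"""

# Shape of the rewrite: one join, filters moved into CASE expressions.
# A CASE without ELSE yields NULL, which COUNT ignores.
single_join = """
    select rk.guid,
           count(distinct rp.guid),
           count(distinct case when rp.processed = 'X' then rp.guid end),
           count(distinct case when rp.result_status = 'E' then rp.guid end)
      from zbsbpi_rk as rk
      left join zbsbpi_rp as rp on rp.header = rk.guid
     where rk.run_id = '0000000010'
     group by rk.guid
     order by rk.guid
"""

rows_three = con.execute(three_joins).fetchall()
rows_single = con.execute(single_join).fetchall()
print(rows_three)   # [('H1', 4, 3, 1), ('H2', 0, 0, 0)]
print(rows_single)  # [('H1', 4, 3, 1), ('H2', 0, 0, 0)]
```

Both forms agree on this data, including the header with no items, which keeps its row with three zero counts because the single remaining join is still a LEFT JOIN.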

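Answer 1: (score: 0)

Apologies, I forgot that I had posted this question here. I posted the same question on answers.sap.com but didn't get any joy: https://answers.sap.com/questions/172096/multiple-left-joins-with-aggregation-on-same-table.html

I eventually came up with the solution, which was a bit of a "doh!" moment: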

  select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by,
    count( distinct rp.guid ), 
    count( distinct rp2.guid ), 
    count( distinct rp3.guid )
    from zbsbpi_rk as rk
    -- inner join: headers with no items drop out of the result
    join zbsbpi_rp as rp
      on rp.header = rk.guid
    left join zbsbpi_rp as rp2
      on rp2.guid = rp.guid
     and rp2.processed = 'X'
    left join zbsbpi_rp as rp3
      on rp3.guid = rp.guid
     and rp3.result_status = 'E'
    where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by

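The subsequent left joins only need to join to the first join of the same table, because the first join already contains a superset of all the records. Joining rp2 and rp3 on the item guid instead of the header matches each item at most once, rather than pairing every item of a header with every processed item and every error item of that header.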
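The effect on the intermediate result can be seen by counting the joined rows before aggregation. The sketch below is a minimal illustration using SQLite through Python's `sqlite3` module, not a HANA measurement. It assumes `guid` is the unique key of the items table, and it uses 30 made-up items under a single header, all processed and every third one in error.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table zbsbpi_rk (guid text primary key, run_id text)")
con.execute(
    "create table zbsbpi_rp (guid text primary key, header text, "
    "processed text, result_status text)"
)
con.execute("insert into zbsbpi_rk values ('H1', '0000000010')")

# Hypothetical data: 30 items, all processed, 10 of them in error.
items = [(f"I{i:02d}", "H1", "X", "E" if i % 3 == 0 else "S") for i in range(30)]
con.executemany("insert into zbsbpi_rp values (?, ?, ?, ?)", items)

# Original shape: every join goes back to the header.
FROM_ORIGINAL = """
      from zbsbpi_rk as rk
      left join zbsbpi_rp as rp  on rp.header = rk.guid
      left join zbsbpi_rp as rp2 on rp2.header = rk.guid and rp2.processed = 'X'
      left join zbsbpi_rp as rp3 on rp3.header = rk.guid and rp3.result_status = 'E'
     where rk.run_id = '0000000010'
"""

# Fixed shape: rp2 and rp3 join to the item already found by rp.
FROM_FIXED = """
      from zbsbpi_rk as rk
      join zbsbpi_rp as rp on rp.header = rk.guid
      left join zbsbpi_rp as rp2 on rp2.guid = rp.guid and rp2.processed = 'X'
      left join zbsbpi_rp as rp3 on rp3.guid = rp.guid and rp3.result_status = 'E'
     where rk.run_id = '0000000010'
"""

COUNTS = (
    "select count(distinct rp.guid), count(distinct rp2.guid), "
    "count(distinct rp3.guid) "
)


def joined_rows(from_clause):
    """Number of rows the joins produce before any aggregation."""
    return con.execute("select count(*) " + from_clause).fetchone()[0]


def counts(from_clause):
    """The three distinct counts over the whole join result (one header here)."""
    return con.execute(COUNTS + from_clause).fetchone()


print(joined_rows(FROM_ORIGINAL), counts(FROM_ORIGINAL))  # 9000 (30, 30, 10)
print(joined_rows(FROM_FIXED), counts(FROM_FIXED))        # 30 (30, 30, 10)
```

The counts are identical, but the original shape builds 30 × 30 × 10 = 9,000 rows for the aggregates to deduplicate, while the fixed shape builds 30. The same multiplication applied to a header with many more items grows very quickly: if, hypothetically, all 1,500 items belonged to one header and were all processed, the first two joins alone would pair 1,500 × 1,500 = 2.25 million rows.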