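Joining aggregated results from different tables in a single SQL query

Asked: 2014-11-14 20:16:29

Tags: sql postgresql join

I need to determine the best way to perform several aggregations over different tables that are joined in a single SQL query.

Consider the following tables for managing a hotel: rooms, housekeepers, room service planning, customer tips, and customer evaluations: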

Room
 id | name 
----+------
  1 | 101
  2 | 102
  3 | 103
  4 | 201
  5 | 202
  6 | 203

housekeeper
 id |   name   | age 
----+----------+-----
  1 | John     |  48
  2 | Veronica |  25
  3 | Bob      |  19

room_service_planning
 id |    date    | room_id | housekeeper_id 
----+------------+---------+----------------
  1 | 2014-11-01 |       3 |              2
  2 | 2014-11-01 |       1 |              2
  3 | 2014-11-02 |       5 |              1

tips
 id | amount | housekeeper_id 
----+--------+----------------
  1 | 5,00 € |              1
  2 | 2,00 € |              3
  3 | 2,00 € |              1
  4 | 3,00 € |              3


client_eval
 id | good_eval | housekeeper_id 
----+-----------+----------------
  1 | t         |              1
  2 | f         |              1
  3 | t         |              2
  4 | t         |              2

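The manager wants to know, for each planned room service, who is assigned to it, the sum of that employee's tips, the number of client evaluations, and the number of positive client evaluations the employee has received over their career.

The expected result when looking for room services between 2014-11-01 and 2014-11-02 is: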

 id |    date    | room_id | housekeeper_id | hk_name  | hk_tips_sum | hk_tot_eval | hk_pos_eval 
----+------------+---------+----------------+----------+-------------+-------------+-------------
  1 | 2014-11-01 |       3 |              2 | Veronica |             |           2 |           2
  2 | 2014-11-01 |       1 |              2 | Veronica |             |           2 |           2
  3 | 2014-11-02 |       5 |              1 | John     |      7,00 € |           2 |           1

These are the solutions I explored to get this result:

Solution 1:

SELECT temp2.id as id, temp2.date as date, temp2.room_id as room_id, 
temp2.housekeeper_id as housekeeper_id, temp2.hk_name as hk_name, 
temp2.hk_tips_sum as hk_tips_sum, temp2.hk_tot_eval as hk_tot_eval, 
count(ce_pos.id) as hk_pos_eval  -- counts matched rows only, so no match gives 0
FROM

    (
    SELECT temp.id as id, temp.date as date, temp.room_id as room_id, 
    temp.housekeeper_id as housekeeper_id, temp.hk_name as hk_name, 
    temp.hk_tips_sum as hk_tips_sum, count(ce_tot.id) as hk_tot_eval

    FROM

        (SELECT rsp.id as id, rsp.date as date, rsp.room_id as room_id, 
        rsp.housekeeper_id as housekeeper_id, hk.name as hk_name, 
        sum(t.amount) as hk_tips_sum
        FROM room_service_planning rsp
        INNER JOIN housekeeper hk 
            ON rsp.date>='2014-11-01' 
            AND rsp.date<='2014-11-02' 
            AND hk.id=rsp.housekeeper_id
        LEFT JOIN tips t
            ON t.housekeeper_id=hk.id
        GROUP BY rsp.id, rsp.date, rsp.room_id, rsp.housekeeper_id, hk_name
        ) temp

    LEFT JOIN client_eval ce_tot
        ON ce_tot.housekeeper_id=temp.housekeeper_id

    GROUP BY temp.id, temp.date, temp.room_id, temp.housekeeper_id, 
        temp.hk_name, temp.hk_tips_sum

    ) temp2

LEFT JOIN client_eval ce_pos
    ON ce_pos.housekeeper_id=temp2.housekeeper_id
    AND ce_pos.good_eval='t'

GROUP BY temp2.id, temp2.date, temp2.room_id, temp2.housekeeper_id, 
temp2.hk_name, temp2.hk_tips_sum, temp2.hk_tot_eval; 

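Note: this is based on "group by to aggregate", then "join the next table", then "group by to aggregate", then "join the next table", and so on. It works, but it is heavy to write and hard to read. I am not satisfied with this solution.

Solution 2: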

SELECT rsp.id as id, rsp.date as date, rsp.room_id as room_id,
rsp.housekeeper_id as housekeeper_id, hk.name as hk_name, 
t.amount as hk_tips_sum, ce_tot.hk_tot_eval as hk_tot_eval, 
ce_pos.hk_pos_eval as hk_pos_eval
FROM room_service_planning rsp
INNER JOIN housekeeper hk 
    ON rsp.date>='2014-11-01' 
    AND rsp.date<='2014-11-02' 
    AND hk.id=rsp.housekeeper_id
LEFT JOIN 
    (SELECT housekeeper_id, sum(amount) as amount 
     FROM tips
     GROUP BY housekeeper_id) t
    ON t.housekeeper_id=hk.id
LEFT JOIN
    (SELECT housekeeper_id, count(1) as hk_tot_eval
     FROM client_eval
     GROUP BY housekeeper_id) ce_tot
    ON ce_tot.housekeeper_id=hk.id
LEFT JOIN
    (SELECT housekeeper_id, count(good_eval) as hk_pos_eval
     FROM client_eval
     WHERE good_eval='t'
     GROUP BY housekeeper_id) ce_pos
    ON ce_pos.housekeeper_id=hk.id;

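Note: this solution is more readable, but I wonder what happens as the number of records in 'tips' or 'client_eval' grows. Imagine a hotel with millions of tips and millions of client evaluations. We would compute sums and counts over millions of rows and then keep only the few we need. That wastes resources and could lead to very long delays.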
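Whether the grouped subqueries really aggregate every row depends on the planner and on the available indexes, so the concern can be checked rather than assumed. A minimal sketch, assuming PostgreSQL and the tables above, that asks the server for the actual plan and timings of the tips part of Solution 2:

```sql
-- Show the executed plan, row counts and timings for the tips aggregation.
-- Note: EXPLAIN ANALYZE actually runs the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT rsp.id, t.amount AS hk_tips_sum
FROM room_service_planning rsp
LEFT JOIN (SELECT housekeeper_id, sum(amount) AS amount
           FROM tips
           GROUP BY housekeeper_id) t
       ON t.housekeeper_id = rsp.housekeeper_id
WHERE rsp.date >= '2014-11-01'
  AND rsp.date <= '2014-11-02';
```

If the plan shows a full scan and aggregate over tips for a handful of planning rows, the worry above is justified for that data set.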

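Conclusion: although I found two different ways to reach my goal, I am not satisfied with either of them.

What smarter, more efficient solution would you suggest for this problem?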

2 Answers:

Answer 0: (score: 1)

I see a couple of things here. I haven't used PostgreSQL seriously in a while, so I hope I'm not completely off base.

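If you just want the aggregate information associated with each housekeeper so you can join it to the daily planning, you may want to consider creating a view that encapsulates the housekeeper ID, total tips, total evals, and total positive evals. This should take advantage of any caching at the server level and reduce the number of function calls needed.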
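A minimal sketch of such a view, assuming the tables from the question and that good_eval is a boolean column. The view name housekeeper_stats is hypothetical, and the COALESCE to 0 for housekeepers without evaluations is my own choice rather than something the question asks for:

```sql
-- Hypothetical view: one row per housekeeper with career-wide totals.
CREATE VIEW housekeeper_stats AS
SELECT hk.id AS housekeeper_id,
       t.hk_tips_sum,                              -- NULL when no tips exist
       COALESCE(ce.hk_tot_eval, 0) AS hk_tot_eval,
       COALESCE(ce.hk_pos_eval, 0) AS hk_pos_eval
FROM housekeeper hk
LEFT JOIN (SELECT housekeeper_id, sum(amount) AS hk_tips_sum
           FROM tips
           GROUP BY housekeeper_id) t
       ON t.housekeeper_id = hk.id
LEFT JOIN (SELECT housekeeper_id,
                  count(*) AS hk_tot_eval,
                  -- CASE yields NULL for negative evals, which count() skips
                  count(CASE WHEN good_eval THEN 1 END) AS hk_pos_eval
           FROM client_eval
           GROUP BY housekeeper_id) ce
       ON ce.housekeeper_id = hk.id;

-- Joining the view to the daily planning:
SELECT rsp.id, rsp.date, rsp.room_id, rsp.housekeeper_id,
       hk.name AS hk_name, s.hk_tips_sum, s.hk_tot_eval, s.hk_pos_eval
FROM room_service_planning rsp
INNER JOIN housekeeper hk ON hk.id = rsp.housekeeper_id
INNER JOIN housekeeper_stats s ON s.housekeeper_id = rsp.housekeeper_id
WHERE rsp.date >= '2014-11-01'
  AND rsp.date <= '2014-11-02';
```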

If you only want to fetch the information for the housekeepers you actually need, you can use subselects in the query:

SELECT 
    rsp.id as id, rsp.date as date, rsp.room_id as room_id,
    rsp.housekeeper_id as housekeeper_id, hk.name as hk_name, 
        (SELECT SUM(t.amount) from tips t where t.housekeeper_id = rsp.housekeeper_id) as hk_tips_sum,
        (SELECT COUNT(1) from client_eval where housekeeper_id = rsp.housekeeper_id) as hk_eval_count,
        (SELECT COUNT(1) from client_eval where housekeeper_id = rsp.housekeeper_id and good_eval='t') as hk_positive_eval_count
   FROM room_service_planning rsp
       INNER JOIN housekeeper hk 
            ON rsp.date>='2014-11-01' 
            AND rsp.date<='2014-11-02' 
            AND hk.id=rsp.housekeeper_id

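If a view is overkill, this computes only the aggregates that are needed.

Finally, does the shift a tip or evaluation belongs to matter? If you don't know which guest gave the tip or review, you could foreign-key them to the room_service_planning table instead of, or in addition to, the housekeeper table.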
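A minimal sketch of that schema change, assuming room_service_planning.id is the primary key (or at least unique). The column name room_service_planning_id is hypothetical, and the column is left nullable so existing rows remain valid:

```sql
-- Hypothetical column linking each tip to the planned service it was given for.
ALTER TABLE tips
    ADD COLUMN room_service_planning_id integer
    REFERENCES room_service_planning (id);

-- Same link for client evaluations.
ALTER TABLE client_eval
    ADD COLUMN room_service_planning_id integer
    REFERENCES room_service_planning (id);
```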

Answer 1: (score: 0)

Try this:

select * FROM (
      room_service_planning rsp
      INNER JOIN housekeeper hk on hk.id=rsp.housekeeper_id
      ) 
left join (SELECT housekeeper_id,SUM(amount) hk_tips_sum from 
               tips group by 1) tips using (housekeeper_id)
left join (SELECT housekeeper_id,COUNT(*) hk_eval_count,
                  count(NULLIF(good_eval,false)) hk_positive_eval_count 
           from client_eval group by 1) evals  using (housekeeper_id)
where rsp.date>='2014-11-01' and rsp.date<='2014-11-02'