Joining multiple tables produces incorrect results

Date: 2016-06-14 19:29:25

Tags: sql postgresql

I'm trying to pull some data grouped by the markets we operate in. The table structure is as follows:

bks:
opportunity_id 

bks_opps:
opportunity_id | trip_start | state

bts:
boat_id | package_id

pckgs:
package_id | boat_id

addresses:
addressable_id | district_id

districts:
district_id

What I want to do is count the won, lost, and total opportunities, plus the win rate, for each district.

SELECT          d.name AS "District",
                SUM(CASE WHEN bo.state IN ('won') THEN 1 ELSE 0 END) AS "Won",
                SUM(CASE WHEN bo.state IN ('lost') THEN 1 ELSE 0 END) AS "Lost",
                Count(bo.state) AS "Total",
                Round(100 * SUM(CASE WHEN bo.state IN ('won') THEN 1 ELSE 0 END) / Count(bo.state)) AS "% Won"              
FROM bks b
INNER JOIN bks_opps bo ON bo.id = b.opportunity_id
INNER JOIN pckgs p ON p.id = b.package_id
INNER JOIN bts bt ON bt.id = p.boat_id
INNER JOIN addresses a ON a.addressable_type = 'Boat' AND a.addressable_id = bt.id
INNER JOIN districts d ON d.id = a.district_id
WHERE bo.trip_start BETWEEN '2016-05-12' AND '2016-06-12'
GROUP BY d.name;

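This returns incorrect data (the values are higher than expected). However, when I get rid of all the joins and stop grouping by district, the numbers are correct (it counts the number of opportunities). Can anyone spot what I'm doing wrong? The most relevant question I found here is this one.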

Sample data:

 District | won | lost | total
----------+-----+------+-------
        1 |  42 |  212 |   254

Expected data:

 District | won | lost | total
----------+-----+------+-------
        1 |  22 |  155 |   177

1 answer:

Answer 0: (score: 2)

Formatted comment:

My guess is that one of your join conditions is wrong, but it's impossible to tell from the structure you've provided.

For example, you have this join, INNER JOIN pckgs p ON p.id = b.package_id, but package_id isn't listed as a column in bks.
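
If it helps to post the real structure, something like the following should list the actual columns. This is only a sketch: it assumes PostgreSQL's standard information_schema and that the tables are really named bks, bks_opps, pckgs, bts, addresses and districts as in the question.

```sql
-- List the real columns of the tables used in the query.
-- Table names are taken from the question; adjust them if yours differ.
SELECT table_name,
       column_name,
       data_type
FROM   information_schema.columns
WHERE  table_name IN ('bks', 'bks_opps', 'pckgs', 'bts', 'addresses', 'districts')
ORDER  BY table_name, ordinal_position;
```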

These joins look particularly suspicious:

INNER JOIN pckgs p ON p.id = b.package_id
INNER JOIN bts bt ON bt.id = p.boat_id

If more than one package can exist for a boat, that will be a problem.
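
To illustrate why that matters, here is a small self-contained sketch using made-up inline data. The names bookings and packages, and their columns, are hypothetical and are not your real tables. It shows how a join that matches one row to several rows multiplies the counts:

```sql
-- Toy data only: two bookings on boat 10, and two packages for that same boat.
WITH bookings (booking_id, boat_id, state) AS (
    VALUES (1, 10, 'won'),
           (2, 10, 'lost')
),
packages (package_id, boat_id) AS (
    VALUES (100, 10),
           (101, 10)
)
SELECT b.boat_id,
       COUNT(*)                                          AS joined_rows,
       COUNT(DISTINCT b.booking_id)                      AS distinct_bookings,
       SUM(CASE WHEN b.state = 'won' THEN 1 ELSE 0 END)  AS won_after_join
FROM   bookings b
INNER JOIN packages p ON p.boat_id = b.boat_id   -- each booking matches 2 packages
GROUP  BY b.boat_id;
```

With this toy data, joined_rows is 4 while distinct_bookings is 2, and won_after_join is 2 even though only one booking was won. Every aggregate computed over the joined rows is inflated in the same way.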

To troubleshoot, start with the simplest possible query:

SELECT b.opportunity_id
FROM   bks b

Run that select on its own, then keep adding the joins one at a time:

SELECT b.opportunity_id
FROM   bks b
INNER JOIN pckgs p ON p.id = b.package_id

At some point you will probably see the number of rows returned jump. The JOIN you added last is your problem.
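
One way to make the jump easier to spot is to compare the total row count with the number of distinct opportunities at each step. This is a sketch only: the join conditions are copied from your query, so they assume columns such as p.id and b.package_id really exist.

```sql
-- Step 0: baseline, before any join.
-- If row_count already exceeds opportunities here, bks itself has
-- several rows per opportunity.
SELECT COUNT(*)                         AS row_count,
       COUNT(DISTINCT b.opportunity_id) AS opportunities
FROM   bks b;

-- Step 1: add one join and rerun. Repeat, adding one more join each time.
SELECT COUNT(*)                         AS row_count,
       COUNT(DISTINCT b.opportunity_id) AS opportunities
FROM   bks b
INNER JOIN pckgs p ON p.id = b.package_id;
```

If row_count grows after a join while opportunities stays the same or shrinks, that join is matching more than one row per booking.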