I'm trying to pull some data grouped by the markets we operate in. The table structure looks like this:
bks:
opportunity_id
bks_opps:
opportunity_id | trip_start | state
bts:
boat_id | package_id
pckgs:
package_id | boat_id
addresses:
addressable_id | district_id
districts:
district_id
What I want to do is calculate the won count, lost count, total, and win rate for each district.
SELECT d.name AS "District",
SUM(CASE WHEN bo.state IN ('won') THEN 1 ELSE 0 END) AS "Won",
SUM(CASE WHEN bo.state IN ('lost') THEN 1 ELSE 0 END) AS "Lost",
Count(bo.state) AS "Total",
Round(100 * SUM(CASE WHEN bo.state IN ('won') THEN 1 ELSE 0 END) / Count(bo.state)) AS "% Won"
FROM bks b
INNER JOIN bks_opps bo ON bo.id = b.opportunity_id
INNER JOIN pckgs p ON p.id = b.package_id
INNER JOIN bts bt ON bt.id = p.boat_id
INNER JOIN addresses a ON a.addressable_type = 'Boat' AND a.addressable_id = bt.id
INNER JOIN districts d ON d.id = a.district_id
WHERE bo.trip_start BETWEEN '2016-05-12' AND '2016-06-12'
GROUP BY d.name;
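This returns incorrect data (the values are higher than expected). However, when I remove all the joins and stop grouping by district, the numbers are correct (it counts the number of opportunities). Can anyone spot what I'm doing wrong? The most relevant question I found here is this one.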
Sample data:
District | won | lost | total
---------+-----+------+------
1        | 42  | 212  | 254
Expected data:
District | won | lost | total
---------+-----+------+------
1        | 22  | 155  | 177
Answer 0 (score: 2)
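Formatted comment:
My guess is that one of your join conditions is wrong, but it is impossible to tell from the structure you provided.
For example, you have the join INNER JOIN pckgs p ON p.id = b.package_id, but package_id is not listed as a column of bks.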
These joins look particularly suspicious:
INNER JOIN pckgs p ON p.id = b.package_id
INNER JOIN bts bt ON bt.id = p.boat_id
If more than one package can exist for a boat, that would be a problem.
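One way to test that premise is to look for boats that are referenced by more than one package. This is only a minimal sketch: it assumes pckgs has the boat_id column shown in the question's table listing, and the package_count alias is made up for illustration. Any rows it returns are boats with multiple packages.

```sql
-- Boats that appear in more than one package row.
-- Assumes pckgs.boat_id exists as listed in the question.
SELECT p.boat_id,
       COUNT(*) AS package_count   -- illustrative alias
FROM pckgs p
GROUP BY p.boat_id
HAVING COUNT(*) > 1
ORDER BY package_count DESC;
```

An empty result would suggest this particular relationship is not one-to-many in the data, so the extra rows would have to come from another join.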
To troubleshoot, start with the simplest query:
SELECT b.opportunity_id
FROM bks b
Run that on its own, then keep adding the joins one at a time:
SELECT b.opportunity_id
FROM bks b
INNER JOIN pckgs p ON p.id = b.package_id
At some point you will probably see the number of rows returned jump. The JOIN you added last is your problem.
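To make the jump easier to see, each step can return counts instead of raw rows. The sketch below reuses the join condition from the question as written (p.id and b.package_id, which do not match the posted table listing, so adjust them to the real columns); the row_count and distinct_opportunities aliases are made up for illustration.

```sql
-- Step 0: baseline, no joins.
SELECT COUNT(*)                         AS row_count,
       COUNT(DISTINCT b.opportunity_id) AS distinct_opportunities
FROM bks b;

-- Step 1: add one join and compare against the baseline.
-- Join columns are copied from the question's query and may need adjusting.
SELECT COUNT(*)                         AS row_count,
       COUNT(DISTINCT b.opportunity_id) AS distinct_opportunities
FROM bks b
INNER JOIN pckgs p ON p.id = b.package_id;
```

If row_count grows after a join while distinct_opportunities stays the same, that join is matching more than one row per booking and is inflating the totals. Repeat the same pattern for each remaining join.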