PostgreSQL result set size varies depending on the join

Time: 2014-12-09 09:29:14

Tags: sql postgresql outer-join

Environment: PostgreSQL 9.3

I have a query:

SELECT
  DISTINCT
  orders.id                      AS count,
  orders.state                   AS state,
  coalesce(discounts.amount, 0)  AS discount,
  CASE WHEN (order_logs.id IS NULL) THEN 0
  ELSE 1 END                     AS attempted,
  CASE WHEN (order_logs_1.id IS NULL) THEN 0
  ELSE 1 END                     AS processed,

-- the following 2-by-2 lines are comment-swapped
--   users.name                     AS "group",
--   order_logs_2.operator_id       AS group_raw
  goods.name                     AS "group",
  goods.id                       AS group_raw
FROM
  orders
  LEFT OUTER JOIN
  order_logs
    ON
      order_logs.order_id = orders.id
      AND
      order_logs.state <= 1999
      AND
      order_logs.state != 1000
  LEFT OUTER JOIN
  order_logs AS order_logs_1
    ON
      order_logs_1.order_id = orders.id
      AND
      order_logs_1.state <= 1999
      AND
      order_logs_1.state != 1000
      AND
      (order_logs_1.flags & 1) > 0
  LEFT OUTER JOIN
  discounts
    ON
      discounts.user_id = orders.user_id
      AND
      discounts.goods_id = orders.goods_id

--the following 2 joins refer to the first commented-out lines in SELECT expression list
--   LEFT OUTER JOIN
--   order_logs AS order_logs_2
--     ON
--       order_logs_2.order_id = orders.id
--       AND
--       order_logs_2.state IN (
--         1999,
--         1100,
--         1150,
--         1151,
--         1003,
--         1202,
--         1203,
--         1200,
--         1201,
--         1002
--       )
--       AND
--       (order_logs_2.flags & 1 > 0)
--   LEFT OUTER JOIN
--   users
--     ON
--       users.id = order_logs_2.operator_id
  -- this join refers to the last 2 lines in expression list
  LEFT OUTER JOIN
  goods
      ON
          goods.id = orders.goods_id
WHERE
  orders.ts_spawn >= 1414789200
  AND
  orders.ts_spawn < 1417381200
ORDER BY "count"

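The problem: when I use the goods-based grouping, the query returns a result set of 20k rows (the same as SELECT count(*) FROM orders WHERE ts_spawn >= 1414789200 AND ts_spawn < 1417381200). But when I use the operator_id-based grouping, the result set contains more rows (about 4k more). How is this possible? Why does the result set grow when I change only the joined tables and columns, not the base table of the query?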
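One possible mechanism, sketched on hypothetical toy data rather than the real schema: a LEFT OUTER JOIN returns one row per matching row in the joined table, and DISTINCT only collapses rows that are identical in every selected column. If an order has log rows with different operator_id values, selecting operator_id keeps those rows apart. The two inline tables and their values below are assumptions made up for illustration.

```sql
-- Toy data (hypothetical): two orders; order 1 was handled by two operators.
WITH orders(id) AS (
  VALUES (1), (2)
),
order_logs(order_id, operator_id) AS (
  VALUES (1, 10), (1, 11), (2, 10)
)
SELECT DISTINCT
  o.id          AS "count",
  l.operator_id AS group_raw
FROM
  orders AS o
  LEFT OUTER JOIN
  order_logs AS l
    ON
      l.order_id = o.id
ORDER BY "count", group_raw;
-- Returns 3 rows: (1, 10), (1, 11), (2, 10), although orders has only 2 rows.
-- DISTINCT cannot merge (1, 10) and (1, 11) because group_raw differs.
```

In the goods variant each order matches at most one goods row (assuming goods.id is unique), so DISTINCT can still collapse the duplicates produced by the order_logs joins down to one row per order.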

In addition, I tried to eliminate the use of the users table by selecting

order_logs_2.operator_id       AS "group",
order_logs_2.operator_id       AS group_raw

and then removing the corresponding JOIN clause (the last commented-out one), with no luck.
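A diagnostic sketch that may help narrow this down, assuming the extra rows come from orders that match several order_logs_2 rows with different operator_id values. It reuses the join conditions and the date range from the query above; the alias operators is introduced here only for illustration.

```sql
-- List orders in the date range whose matching log rows carry more than
-- one distinct operator_id. Each such order would produce several rows
-- in the DISTINCT result of the operator-based variant.
-- Note: count(DISTINCT ...) ignores NULL operator_id values, so an order
-- with one NULL and one non-NULL operator is not reported here.
SELECT
  order_logs_2.order_id,
  count(DISTINCT order_logs_2.operator_id) AS operators
FROM
  orders
  JOIN
  order_logs AS order_logs_2
    ON
      order_logs_2.order_id = orders.id
      AND
      order_logs_2.state IN (
        1999, 1100, 1150, 1151, 1003,
        1202, 1203, 1200, 1201, 1002
      )
      AND
      (order_logs_2.flags & 1) > 0
WHERE
  orders.ts_spawn >= 1414789200
  AND
  orders.ts_spawn < 1417381200
GROUP BY
  order_logs_2.order_id
HAVING
  count(DISTINCT order_logs_2.operator_id) > 1
ORDER BY operators DESC;
```

If this returns rows, the sum of (operators - 1) over them should roughly account for the surplus of about 4k rows; if it returns nothing, the cause lies elsewhere.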

0 answers:

No answers yet