Postgres distinct union on only specific columns

Time: 2018-04-18 13:26:50

Tags: postgresql union postgresql-9.4

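I have two sets of data, one of which is generated dynamically.

If I leave out the column state, the query below works perfectly, since that column does not really exist in the generated data. My question is: how can I make the UNION ignore a column, so that it actually combines the two datasets? As written, every generated row differs from the logged row in state, so the UNION behaves the same as a UNION ALL. I prefer the first table, and I want any row in the second dataset to be ignored if it already exists in the first.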

SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events

Update: I also tried:

WITH future_logs AS (
 SELECT id event_id,
 GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int -  1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
 'draft' AS state
 FROM events)

SELECT future_logs.event_id, future_logs.start_at, future_logs.state
FROM future_logs
LEFT JOIN event_logs ON future_logs.event_id = event_logs.event_id AND future_logs.start_at = event_logs.start_at
WHERE event_logs.start_at BETWEEN current_date AND current_date + interval '3 weeks'

But this returns far too few rows: 77 versus the expected 1000.

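2 answers:

Answer 0: (score: 1)

Just add a NOT EXISTS() to the second leg; you can then use UNION ALL and avoid the sort/merge step that UNION needs to remove duplicates.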

SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'

UNION ALL

SELECT id AS event_id
        , generate_series(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time
                , current_date + interval '3 weeks'
                , '1 week'::INTERVAL) AS start_at
        , 'draft' AS state
FROM events ev
WHERE NOT EXISTS ( SELECT *
        FROM event_logs nx
        WHERE nx.event_id = ev.id
        AND nx.start_at BETWEEN current_date AND current_date + interval '3 weeks'
        );
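
Note that the NOT EXISTS above compares only event_id, so an event with any log row in the window loses all of its generated occurrences. If the goal is to drop only the generated rows whose (event_id, start_at) pair is already logged, one possible variant is sketched below. It assumes the tables event_logs(event_id, start_at, state) and events(id, start_at) from the question; the CTE name future_logs is borrowed from the question's second attempt. The series is expanded in a CTE first, so the generated start_at is an ordinary column that the outer WHERE can reference.

```sql
-- Sketch only: assumes event_logs(event_id, start_at, state) and
-- events(id, start_at) as described in the question.
WITH future_logs AS (
    -- Expand every event into its weekly occurrences up front.
    SELECT id AS event_id,
           generate_series(
               date_trunc('week', current_date)::date
                   + (extract(isodow FROM start_at)::int - 1)
                   + start_at::time,
               current_date + interval '3 weeks',
               interval '1 week') AS start_at
    FROM events
)
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'

UNION ALL

-- Keep a generated row only when no log row has the same event and start time.
SELECT fl.event_id, fl.start_at, 'draft' AS state
FROM future_logs fl
WHERE NOT EXISTS (
    SELECT 1
    FROM event_logs nx
    WHERE nx.event_id = fl.event_id
      AND nx.start_at = fl.start_at
);
```

The 'draft' literal is kept in the outer SELECT rather than in the CTE so that the UNION ALL can resolve its type against event_logs.state.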

Answer 1: (score: 0)

I would add another column, taborder, to your UNION query to give the rows a simple ordering, and then use the window function row_number() over(...) as follows:

select event_id, start_at, state
from (
    select event_id, start_at, state,
           row_number() over (partition by event_id, start_at order by taborder) as rownum
    from (
        select event_id, start_at, state, 1 as taborder from original_table
        union
        select event_id, start_at, state, 2 as taborder from draft_table
    ) src0
) src1
where rownum = 1
order by 1, 2, 3
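
The same "first table wins" rule can probably also be expressed with PostgreSQL's DISTINCT ON, which keeps the first row of each (event_id, start_at) group according to the ORDER BY. The sketch below is self-contained: original_table and draft_table are the placeholder names from the query above, and the sample rows are made up purely for illustration.

```sql
-- Hypothetical sample rows standing in for the two datasets.
WITH original_table (event_id, start_at, state) AS (
    VALUES (1, timestamp '2018-04-23 10:00', 'published'::text)
),
draft_table (event_id, start_at, state) AS (
    VALUES (1, timestamp '2018-04-23 10:00', 'draft'::text),
           (2, timestamp '2018-04-24 15:00', 'draft'::text)
)
SELECT DISTINCT ON (event_id, start_at)
       event_id, start_at, state
FROM (
    -- taborder = 1 marks the preferred source, 2 the fallback.
    SELECT event_id, start_at, state, 1 AS taborder FROM original_table
    UNION ALL
    SELECT event_id, start_at, state, 2 AS taborder FROM draft_table
) src
-- Within each (event_id, start_at) group the lowest taborder sorts first,
-- and DISTINCT ON keeps only that row.
ORDER BY event_id, start_at, taborder;
-- Returns two rows:
--   1 | 2018-04-23 10:00:00 | published
--   2 | 2018-04-24 15:00:00 | draft
```

UNION ALL is enough in the inner query because the taborder column already makes rows from the two sources distinct, so a plain UNION would not remove anything across them.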