PostgreSQL: join two tables by closest date

Asked: 2016-03-03 15:47:59

Tags: postgresql

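I have one large table of sent emails, with the send date and outcome of each, and I want to match every row to the most recent earlier email for the same id that had a particular outcome (here, open = 1). This needs to be done in PostgreSQL. For example:

Initial table: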

id  | sent_dt       | bounced   | open      | clicked | unsubscribe
1   | 2015-01-01    | 1         |      0    | 0       | 0
1   | 2015-01-02    | 0         |      1    | 1       | 0
1   | 2015-01-03    | 0         |      1    | 1       | 0
2   | 2015-01-01    | 0         |      1    | 0       | 0
2   | 2015-01-02    | 1         |      0    | 0       | 0
2   | 2015-01-03    | 0         |      1    | 0       | 0
2   | 2015-01-04    | 0         |      1    | 0       | 1

Result table:

id  | sent_dt       | bounced| open | clicked   | unsubscribe| previous_time
1   | 2015-01-01    | 1      | 0    | 0         | 0          | NULL
1   | 2015-01-02    | 0      | 1    | 1         | 0          | NULL
1   | 2015-01-03    | 0      | 1    | 1         | 0          | 2015-01-02
2   | 2015-01-01    | 0      | 1    | 0         | 0          | NULL
2   | 2015-01-02    | 1      | 0    | 0         | 0          | 2015-01-01
2   | 2015-01-03    | 0      | 1    | 0         | 0          | 2015-01-01
2   | 2015-01-04    | 0      | 1    | 0         | 1          | 2015-01-03

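I tried LAG, but I don't know how to apply the condition that open must equal 1 while still returning every row. I also tried a many-to-many join on id and then taking the minimum date difference, but that basically squares the size of my table and takes far too long to compute (> 7 hours). There are several answers that work for other SQL dialects, but I don't see one that works in PostgreSQL.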

Thanks for your help!

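2 answers:

Answer 0 (score: 0)

You can get the desired result with ROW_NUMBER(): join each row to the row sent immediately before it, and keep the match only if that earlier row has open = 1.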

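-- Number each id's rows from newest (rnk = 1) to oldest.
SELECT t.*, s.sent_dt AS previous_time
FROM
      (SELECT p.*,
              ROW_NUMBER() OVER(PARTITION BY id ORDER BY sent_dt DESC) rnk
      FROM YourTable p) t
LEFT OUTER JOIN
     (SELECT p.*,
             ROW_NUMBER() OVER(PARTITION BY id ORDER BY sent_dt DESC) rnk
      FROM YourTable p) s
-- s is the row sent immediately before t for the same id;
-- it is kept only when it was opened, otherwise previous_time is NULL.
ON (t.id = s.id AND t.rnk = s.rnk - 1 AND s.open = 1)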
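
The join above looks only one row back, so a row whose immediate predecessor was not opened gets NULL even when an older opened email exists (id 2 on 2015-01-03 in the sample data). A possible variation, offered as a sketch rather than a tested solution, replaces the self-join with a window aggregate over all earlier rows of the same id. It assumes the same placeholder table name YourTable and at most one row per (id, sent_dt); the previous_time alias is taken from the question's result table.

```sql
-- Sketch: for every row, take the latest sent_dt among strictly earlier rows
-- of the same id that have open = 1. The CASE yields NULL for unopened rows,
-- and MAX ignores NULLs.
SELECT p.*,
       MAX(CASE WHEN p."open" = 1 THEN p.sent_dt END)
           OVER (PARTITION BY p.id
                 ORDER BY p.sent_dt
                 -- frame = all rows before the current one, excluding it
                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS previous_time
FROM YourTable p
ORDER BY p.id, p.sent_dt;
```

Because this reads the table once per partition instead of joining it to itself, it avoids the many-to-many blow-up described in the question, though actual run time on a large table would need to be measured.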

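Answer 1 (score: 0)

First, I create a CTE, openFilter, holding the dates on which a mail was opened.

Then I join the mail table to that CTE to get the open dates that precede each email. Finally, I filter so that each row keeps only the most recent of those opens.

SQL Fiddle Demo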
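
The fiddle's schema is not reproduced in this thread. To try the query locally, a minimal setup along the following lines should be enough. The table name mail comes from the query itself; the column types are assumptions inferred from the question's sample data, not taken from the original fiddle.

```sql
-- Assumed schema: types are guesses based on the sample rows in the question.
CREATE TABLE mail (
    id          integer NOT NULL,
    sent_dt     date    NOT NULL,
    bounced     integer NOT NULL,
    "open"      integer NOT NULL,
    clicked     integer NOT NULL,
    unsubscribe integer NOT NULL
);

-- The seven sample rows from the question's initial table.
INSERT INTO mail (id, sent_dt, bounced, "open", clicked, unsubscribe) VALUES
    (1, '2015-01-01', 1, 0, 0, 0),
    (1, '2015-01-02', 0, 1, 1, 0),
    (1, '2015-01-03', 0, 1, 1, 0),
    (2, '2015-01-01', 0, 1, 0, 0),
    (2, '2015-01-02', 1, 0, 0, 0),
    (2, '2015-01-03', 0, 1, 0, 0),
    (2, '2015-01-04', 0, 1, 0, 1);
```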

WITH openFilter as (
    SELECT m."id", m."sent_dt"
    FROM mail m
    WHERE "open" = 1
)
SELECT m."id",
       to_char(m."sent_dt", 'YYYY-MM-DD'),
       "bounced", "open", "clicked", "unsubscribe",
       to_char(o."sent_dt", 'YYYY-MM-DD') previous_time
FROM mail m
LEFT JOIN openFilter o
       ON m."id" = o."id"
      AND m."sent_dt" >  o."sent_dt"     
WHERE o."sent_dt" = (SELECT MAX(t."sent_dt")
                     FROM openFilter t
                     WHERE t."id" = m."id"
                       AND t."sent_dt" < m."sent_dt")
   OR o."sent_dt" IS NULL

Output

| id |    to_char | bounced | open | clicked | unsubscribe | previous_time |
|----|------------|---------|------|---------|-------------|---------------|
|  1 | 2015-01-01 |       1 |    0 |       0 |           0 |        (null) |
|  1 | 2015-01-02 |       0 |    1 |       1 |           0 |        (null) |
|  1 | 2015-01-03 |       0 |    1 |       1 |           0 |    2015-01-02 |
|  2 | 2015-01-01 |       0 |    1 |       0 |           0 |        (null) |
|  2 | 2015-01-02 |       1 |    0 |       0 |           0 |    2015-01-01 |
|  2 | 2015-01-03 |       0 |    1 |       0 |           0 |    2015-01-01 |
|  2 | 2015-01-04 |       0 |    1 |       0 |           1 |    2015-01-03 |
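
The join-then-filter logic of this answer can also be written as a single correlated subquery, which some readers may find easier to follow. The sketch below assumes the same mail table. The partial index is an optional suggestion aimed at the run time concern in the question, and its name is hypothetical; whether it helps depends on the data and should be checked with EXPLAIN ANALYZE.

```sql
-- Optional, hypothetical index: covers only opened rows, keyed for the
-- per-id "latest earlier date" lookup done by the subquery below.
CREATE INDEX mail_open_id_sent_dt_idx
    ON mail (id, sent_dt)
    WHERE "open" = 1;

-- For each row, look up the latest earlier opened email of the same id.
-- MAX over no matching rows returns NULL, so every row of mail is kept.
SELECT m.id, m.sent_dt, m.bounced, m."open", m.clicked, m.unsubscribe,
       (SELECT MAX(o.sent_dt)
        FROM mail o
        WHERE o.id = m.id
          AND o."open" = 1
          AND o.sent_dt < m.sent_dt) AS previous_time
FROM mail m
ORDER BY m.id, m.sent_dt;
```

On the seven sample rows this should produce the same previous_time values as the output table above.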