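SQL shift operation grouped by a specific column

Time: 2017-10-11 14:21:19

Tags: sql greenplum

I have a large table named purchases in Greenplum that contains more than 4 million rows. Here is a sample of the table: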

userId |        purchaseTime      | timeDiff
------------------------------------------
 17    |   2016-02-01 11:01:02    |
 17    |   2016-02-01 13:24:58    |
 17    |   2016-02-01 21:12:36    |
 67    |   2016-02-01 17:04:49    |
 84    |   2016-02-01 16:13:20    |
 94    |   2016-02-01 05:46:13    |
 94    |   2016-02-01 21:33:19    |

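The sample is sorted by userId and purchaseTime to make my goal easier to see.

My goal is to update this table so that each row holds the time difference between its own purchase time and the same user's previous purchase time.

The result would look like this: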

userId |        purchaseTime      | timeDiff
------------------------------------------
 17    |   2016-02-01 11:01:02    | NULL
 17    |   2016-02-01 13:24:58    | 2:23:56
 17    |   2016-02-01 21:12:36    | 7:47:38
 67    |   2016-02-01 17:04:49    | NULL
 84    |   2016-02-01 16:13:20    | NULL
 94    |   2016-02-01 05:46:13    | NULL
 94    |   2016-02-01 21:33:19    | 15:47:06

The SELECT from one of the answers helped me. Now I need to turn it into an update, but I get a syntax error near UPDATE:

WITH tmp_table AS
(
    SELECT userId ,  
       purchaseTime ,
       purchaseTime - LAG(purchaseTime )
       OVER (PARTITION BY userId  ORDER BY purchaseTime) AS timeDiff
    FROM   purchases
)

UPDATE purchases SET timeDiff = tmp_table.timeDiff
FROM tmp_table
WHERE userId   = tmp_table.userId  
AND purchaseTime = tmp_table.purchaseTime;

Can anyone help me update my table?

2 answers:

Answer 0: (score: 1)

You can use the lag window function to find the same user's previous purchase time and simply subtract the two.
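As a minimal sketch of that approach, the query below applies LAG to a scratch copy of the sample rows. The table name purchases_sample and the column types are assumptions made for illustration; the SELECT itself mirrors the one quoted in the question.

```sql
-- Hypothetical scratch table holding the sample rows from the question.
CREATE TEMP TABLE purchases_sample (
    userId       integer,
    purchaseTime timestamp,
    timeDiff     interval   -- assumed type; the question does not state it
);

INSERT INTO purchases_sample (userId, purchaseTime) VALUES
    (17, '2016-02-01 11:01:02'),
    (17, '2016-02-01 13:24:58'),
    (17, '2016-02-01 21:12:36'),
    (67, '2016-02-01 17:04:49'),
    (84, '2016-02-01 16:13:20'),
    (94, '2016-02-01 05:46:13'),
    (94, '2016-02-01 21:33:19');

-- LAG returns the previous purchaseTime within each user's partition,
-- ordered by time; subtracting two timestamps yields an interval.
SELECT userId,
       purchaseTime,
       purchaseTime - LAG(purchaseTime)
           OVER (PARTITION BY userId ORDER BY purchaseTime) AS timeDiff
FROM   purchases_sample
ORDER  BY userId, purchaseTime;
```

Each user's first purchase has no earlier row in its partition, so LAG returns NULL there and the difference is NULL as well.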

Answer 1: (score: 0)

Building on @mureinik's query, you can perform the update like this:

UPDATE purchases
SET timeDiff = tmp_table.timeDiff
FROM (SELECT userId, purchaseTime,
             -- minutes since the same user's previous purchase (NULL for the first one)
             (EXTRACT(epoch FROM purchaseTime - LAG(purchaseTime) OVER
                 (PARTITION BY userId ORDER BY purchaseTime)) / 60)::integer AS timeDiff
      FROM   purchases) AS tmp_table
-- match each row to its own computed difference
WHERE purchases.userId = tmp_table.userId
AND purchases.purchaseTime = tmp_table.purchaseTime;

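The update uses EXTRACT(epoch FROM ...) so that it returns the number of seconds in the interval. If you want minutes, divide by 60 (/60), and if you want a whole number, cast the result to integer, which rounds it.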
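To make the conversion concrete, here is a small worked example using the first gap in the sample data (2:23:56). The interval literal is only an illustration; in the update above the interval comes from the LAG subtraction.

```sql
-- 2:23:56 is 2*3600 + 23*60 + 56 = 8636 seconds.
SELECT EXTRACT(epoch FROM INTERVAL '02:23:56')                 AS seconds,         -- 8636
       EXTRACT(epoch FROM INTERVAL '02:23:56') / 60            AS minutes,         -- about 143.93
       (EXTRACT(epoch FROM INTERVAL '02:23:56') / 60)::integer AS minutes_rounded; -- 144
```

Note that this stores a plain number of minutes, so it assumes timeDiff is a numeric column. If timeDiff is an interval column, as the expected output in the question suggests, assigning the raw difference without EXTRACT would likely be the better fit.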