I have the following table, which contains GA data in BigQuery:
userid visitid purchase_date
GH8932 12345 2017-04-09
GH8932 12346 null
GH8932 12347 null
GH8932 12348 null
GH8932 12349 2017-05-30
GH8932 12350 null
GH8932 12351 null
GH8932 12352 2017-06-07
GH8932 12353 null
GH8932 12354 2017-06-30
GH8932 12355 null
GH8932 12356 null
I want to fill all the null values with a purchase_date.
The query I'm currently using (shown below):
SELECT
  userid,
  visitid,
  FIRST_VALUE(purchase_date IGNORE NULLS) OVER (
    PARTITION BY userid ORDER BY visitid
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ) AS purchase_date
FROM x;
gives me something like this:
userid visitid purchase_date
GH8932 12345 2017-04-09
GH8932 12346 2017-05-30
GH8932 12347 2017-05-30
GH8932 12348 2017-05-30
GH8932 12349 2017-05-30
GH8932 12350 2017-06-07
GH8932 12351 2017-06-07
GH8932 12352 2017-06-07
GH8932 12353 2017-06-30
GH8932 12354 2017-06-30
GH8932 12355 null
GH8932 12356 null
Any suggestions on how to fill the last 2 nulls with the final purchase_date?
Answer 0 (score: 1)
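Below is for BigQuery Standard SQL. The first FIRST_VALUE takes the next non-null purchase_date at or after the current row; when there is none, IFNULL falls back to the second FIRST_VALUE, which orders by visitid DESC and so picks the most recent non-null purchase_date at or before the current row.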
#standardSQL
SELECT
  userid,
  visitid,
  IFNULL(
    FIRST_VALUE(purchase_date IGNORE NULLS)
      OVER (PARTITION BY userid ORDER BY visitid
            ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING),
    FIRST_VALUE(purchase_date IGNORE NULLS)
      OVER (PARTITION BY userid ORDER BY visitid DESC
            ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  ) AS purchase_date
FROM `project.dataset.table`
You can test / play with the above using the dummy data from your question:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'GH8932' userid, 12345 visitid, '2017-04-09' purchase_date UNION ALL
  SELECT 'GH8932', 12346, NULL UNION ALL
  SELECT 'GH8932', 12347, NULL UNION ALL
  SELECT 'GH8932', 12348, NULL UNION ALL
  SELECT 'GH8932', 12349, '2017-05-30' UNION ALL
  SELECT 'GH8932', 12350, NULL UNION ALL
  SELECT 'GH8932', 12351, NULL UNION ALL
  SELECT 'GH8932', 12352, '2017-06-07' UNION ALL
  SELECT 'GH8932', 12353, NULL UNION ALL
  SELECT 'GH8932', 12354, '2017-06-30' UNION ALL
  SELECT 'GH8932', 12355, NULL UNION ALL
  SELECT 'GH8932', 12356, NULL
)
SELECT
  userid,
  visitid,
  IFNULL(
    FIRST_VALUE(purchase_date IGNORE NULLS)
      OVER (PARTITION BY userid ORDER BY visitid
            ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING),
    FIRST_VALUE(purchase_date IGNORE NULLS)
      OVER (PARTITION BY userid ORDER BY visitid DESC
            ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  ) AS purchase_date
FROM `project.dataset.table`
ORDER BY userid, visitid
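
If you want to sanity-check the expected result without running BigQuery, here is a rough plain-Python sketch of the same fill logic for a single user. It is only an illustration, not part of the query: the function name fill_purchase_dates and the list-of-values input are made up for this example, and it assumes the rows are already sorted by visitid with None standing in for null.

```python
def fill_purchase_dates(dates):
    """Fill None entries for one user's rows, already ordered by visitid.

    Hypothetical helper mirroring the query's logic:
    IFNULL(next non-null at or after the row,
           most recent non-null at or before the row).
    """
    n = len(dates)

    # Next non-null value at or after each position
    # (the FIRST_VALUE ... ORDER BY visitid part).
    next_seen = [None] * n
    nxt = None
    for i in range(n - 1, -1, -1):
        if dates[i] is not None:
            nxt = dates[i]
        next_seen[i] = nxt

    # Most recent non-null value at or before each position
    # (the FIRST_VALUE ... ORDER BY visitid DESC part).
    prev_seen = [None] * n
    prev = None
    for i in range(n):
        if dates[i] is not None:
            prev = dates[i]
        prev_seen[i] = prev

    # IFNULL(a, b): use the forward-looking value, else fall back.
    return [a if a is not None else b for a, b in zip(next_seen, prev_seen)]


# Sample purchase_date column from the question, visitid 12345..12356.
rows = [
    "2017-04-09", None, None, None, "2017-05-30", None,
    None, "2017-06-07", None, "2017-06-30", None, None,
]
filled = fill_purchase_dates(rows)
```

With the sample data, the last two entries of filled come back as 2017-06-30 instead of null, which is what the IFNULL fallback in the query is for.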