Custom coding with row number in SQL

Time: 2017-12-04 08:13:28

Tags: sql google-bigquery window-functions standard-sql

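I am using BigQuery #standardSQL to work with a table. The table should record a conversion (1) for users who purchased items in both month 9 and month 10. For users who did not purchase in month 10, every row should contain only 0.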

This is my query for custom_coded so far:

(case when row_number() 
  over (partition by customer_id order by purchase_date asc) =
                  count(*) over (partition by customer_id)
             then 1 else 0 END) AS custom_coded

This is the result so far: [image: result]

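My expectation is that customer_id = 288 has only 0 in custom_coded, because he did not purchase in the following month (month 10). customer_id = 879 is expected to have 1 on his latest purchase_date, because he has a purchase record in month 10.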

This is the expected result: [image: expected result]

I asked about this before in another thread (Decode maximum number in rows for sql), but the dataset there did not fit the analysis I want to perform.

1 answer:

Answer 0: (score: 1)

Below is for BigQuery Standard SQL

  
#standardSQL
SELECT customer_id, item_purchased, purchase_date, 
  (CASE WHEN 
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date ASC) =
        COUNT(*) OVER (PARTITION BY customer_id)
    AND SUM(DISTINCT (CASE FORMAT_DATE('%Y%m', purchase_date) 
        WHEN '201709' THEN 1 WHEN '201710' THEN 2 ELSE 0 END)) 
        OVER(PARTITION BY customer_id) = 3
    THEN 1 ELSE 0 
  END) AS custom_coded
FROM `project.dataset.table`

You can test / play with the above using the dummy data from your question:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 288 customer_id, 'Rice' item_purchased, DATE '2017-09-02' purchase_date UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-02' UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-06' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-01' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-27' 
)
SELECT customer_id, item_purchased, purchase_date, 
  (CASE WHEN 
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date ASC) =
        COUNT(*) OVER (PARTITION BY customer_id)
    AND SUM(DISTINCT (CASE FORMAT_DATE('%Y%m', purchase_date) 
        WHEN '201709' THEN 1 WHEN '201710' THEN 2 ELSE 0 END)) 
        OVER(PARTITION BY customer_id) = 3
    THEN 1 ELSE 0 
  END) AS custom_coded
FROM `project.dataset.table`
ORDER BY customer_id, purchase_date   

The result is

customer_id item_purchased  purchase_date   custom_coded     
288         Rice            2017-09-02      0    
288         Rice            2017-09-02      0    
288         Rice            2017-09-06      0    
879         Plate           2017-09-01      0    
879         Plate           2017-09-25      0    
879         Plate           2017-10-25      0    
879         Plate           2017-10-27      1
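
A note on the second condition in the query, since the answer does not spell it out: the inner CASE maps a September 2017 purchase to 1, an October 2017 purchase to 2, and anything else to 0. SUM(DISTINCT ...) over the customer's partition can therefore equal 3 only when both 1 and 2 occur, i.e. when the customer bought in both months. The sketch below expresses the same check with two COUNTIF window calls, which some readers may find easier to follow. It is an alternative formulation, not the answer's own query; the CTE name `purchases` is a hypothetical stand-in for the real table, filled with the dummy rows from the question.

```sql
#standardSQL
-- Sketch only: `purchases` is a hypothetical stand-in for the real table,
-- populated with the dummy rows from the question.
WITH purchases AS (
  SELECT 288 AS customer_id, 'Rice' AS item_purchased, DATE '2017-09-02' AS purchase_date UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-02' UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-06' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-01' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-27'
)
SELECT customer_id, item_purchased, purchase_date,
  IF(
    -- this row is the customer's latest purchase ...
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date DESC) = 1
    -- ... and the customer bought at least once in September 2017 ...
    AND COUNTIF(FORMAT_DATE('%Y%m', purchase_date) = '201709')
        OVER (PARTITION BY customer_id) > 0
    -- ... and at least once in October 2017
    AND COUNTIF(FORMAT_DATE('%Y%m', purchase_date) = '201710')
        OVER (PARTITION BY customer_id) > 0,
    1, 0) AS custom_coded
FROM purchases
ORDER BY customer_id, purchase_date
```

On the dummy data this should flag only the 2017-10-27 row of customer 879, matching the result table above. As with the original ROW_NUMBER() = COUNT(*) comparison, if a customer has several rows on the same latest purchase_date, which of them receives the 1 is not deterministic.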