BigQuery structure for tracking orders over time

Time: 2016-05-12 13:00:00

Tags: google-bigquery

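I'm just getting started with BigQuery (and big data/BI in general), and I'm stuck on how to track orders over time rather than just appending them.

Let's say I have an online store/e-commerce site and the associated data. We sell widgets.

As my order goes from "created" to "shopping" to "completed" to "confirmed" (by the supplier) to "fulfilled"/"shipped", and is sometimes "canceled"/"declined" (by the supplier), how do I account for this in BigQuery so that I can build visualizations for "orders created but not completed" and "orders fulfilled within a date range" (somehow being able to account for those that were completed but then canceled/declined by the supplier after completion)?

Would I populate different tables, such as "created orders", "fulfilled orders", and "canceled orders"? Or is there some other mechanism to account for this, given that I can't update a row (to change it from completed to canceled)?

Thanks in advance.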

2 answers:

Answer 0 (score: 2)

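Here is an example of how to parse an events table to get meaningful information.
This example uses BigQuery Standard SQL, so you need to uncheck the Use Legacy SQL checkbox under Show Options.
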
WITH order_events AS (
  SELECT 1 AS orderID, '2015-01-01' AS ts, 'created' AS event UNION ALL
  SELECT 1 AS orderID, '2015-01-01' AS ts, 'shopping' AS event UNION ALL
  SELECT 1 AS orderID, '2015-01-02' AS ts, 'completed' AS event UNION ALL
  SELECT 1 AS orderID, '2015-01-03' AS ts, 'confirmed' AS event UNION ALL
  SELECT 1 AS orderID, '2015-01-04' AS ts, 'shipped' AS event UNION ALL
  SELECT 2 AS orderID, '2015-01-01' AS ts, 'created' AS event UNION ALL
  SELECT 2 AS orderID, '2015-01-01' AS ts, 'shopping' AS event UNION ALL
  SELECT 2 AS orderID, '2015-01-02' AS ts, 'completed' AS event UNION ALL
  SELECT 2 AS orderID, '2015-01-03' AS ts, 'declined' AS event UNION ALL
  SELECT 3 AS orderID, '2015-01-01' AS ts, 'created' AS event UNION ALL
  SELECT 3 AS orderID, '2015-01-01' AS ts, 'shopping' AS event UNION ALL
  SELECT 3 AS orderID, '2015-01-02' AS ts, 'completed' AS event UNION ALL
  SELECT 3 AS orderID, '2015-01-03' AS ts, 'confirmed' AS event UNION ALL
  SELECT 3 AS orderID, '2015-01-04' AS ts, 'shipped' AS event UNION ALL
  SELECT 4 AS orderID, '2015-01-01' AS ts, 'created' AS event UNION ALL
  SELECT 4 AS orderID, '2015-01-01' AS ts, 'shopping' AS event UNION ALL
  SELECT 4 AS orderID, '2015-01-02' AS ts, 'completed' AS event UNION ALL
  SELECT 4 AS orderID, '2015-01-03' AS ts, 'confirmed' AS event UNION ALL
  SELECT 4 AS orderID, '2015-01-05' AS ts, 'canceled' AS event UNION ALL
  SELECT 5 AS orderID, '2015-01-01' AS ts, 'created' AS event UNION ALL
  SELECT 5 AS orderID, '2015-01-01' AS ts, 'shopping' AS event 
),
order_history AS (
  SELECT 
    orderID, 
    (SELECT STRING_AGG(events, ' > ') FROM t.events) AS history
  FROM (
    SELECT 
      orderID, 
      ARRAY(SELECT event FROM t.events ORDER BY ts ASC) events 
    FROM (
      SELECT 
        orderID, 
        ARRAY_AGG(STRUCT(event, ts)) events
      FROM order_events 
      GROUP BY orderID
    ) t
  ) t
)
SELECT *
FROM order_history
#WHERE REGEXP_EXTRACT(history, r'((?:created).*(?:canceled))') IS NOT NULL

The query above presents each order's history as:

orderID     history  
1           created > shopping > completed > confirmed > shipped     
2           created > shopping > completed > declined    
3           created > shopping > completed > confirmed > shipped     
4           created > shopping > completed > confirmed > canceled    
5           created > shopping

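Now, if you uncomment the last line with the WHERE clause, you will get only the orders that match the given pattern; in this case, orders that were created and later canceled.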

orderID     history  
4           created > shopping > completed > confirmed > canceled

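Setting the right regexp gives you the flexibility to build whatever analytic filters you need.

Hope this gives you an idea that you can extend to your own specific needs!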
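
As a rough illustration of the pattern idea outside BigQuery, the sketch below rebuilds the same history strings in plain Python and filters them with regular expressions. The helper names (`build_histories`, `matching_orders`) and the two extra filters are assumptions made for illustration; they are not part of the query above. Note that "created" and "shopping" share a date in the sample data, so the sketch relies on Python's stable sort to keep them in input order.

```python
import re
from collections import defaultdict

# Same sample rows as the query above: (orderID, ts, event).
ORDER_EVENTS = [
    (1, "2015-01-01", "created"), (1, "2015-01-01", "shopping"),
    (1, "2015-01-02", "completed"), (1, "2015-01-03", "confirmed"),
    (1, "2015-01-04", "shipped"),
    (2, "2015-01-01", "created"), (2, "2015-01-01", "shopping"),
    (2, "2015-01-02", "completed"), (2, "2015-01-03", "declined"),
    (3, "2015-01-01", "created"), (3, "2015-01-01", "shopping"),
    (3, "2015-01-02", "completed"), (3, "2015-01-03", "confirmed"),
    (3, "2015-01-04", "shipped"),
    (4, "2015-01-01", "created"), (4, "2015-01-01", "shopping"),
    (4, "2015-01-02", "completed"), (4, "2015-01-03", "confirmed"),
    (4, "2015-01-05", "canceled"),
    (5, "2015-01-01", "created"), (5, "2015-01-01", "shopping"),
]


def build_histories(events):
    """Return {orderID: 'created > shopping > ...'} with events sorted by ts."""
    by_order = defaultdict(list)
    for order_id, ts, event in events:
        by_order[order_id].append((ts, event))
    # sorted() is stable, so events sharing a ts keep their input order.
    return {
        order_id: " > ".join(ev for _, ev in sorted(rows, key=lambda r: r[0]))
        for order_id, rows in by_order.items()
    }


def matching_orders(histories, pattern):
    """Return the sorted orderIDs whose history matches the regular expression."""
    regex = re.compile(pattern)
    return sorted(oid for oid, hist in histories.items() if regex.search(hist))


histories = build_histories(ORDER_EVENTS)

# Pattern from the answer: created and later canceled.
canceled = matching_orders(histories, r"created.*canceled")
# Hypothetical pattern: completed, then declined or canceled afterwards.
lost_after_completion = matching_orders(
    histories, r"completed.*(?:declined|canceled)"
)
# Hypothetical filter: created but never completed (a plain substring test).
not_completed = sorted(
    oid for oid, hist in histories.items() if "completed" not in hist
)
```

The "never completed" case is written as a negated test rather than a single pattern; in BigQuery the equivalent would presumably be a negated match such as `NOT REGEXP_CONTAINS(history, r'completed')`.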

Answer 1 (score: 1)

Two tables:

order
order_events

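In order you store the order data as it was when the order was created; in the events table you store whatever happens to the order afterwards. The events table will have an event column that describes the action that occurred.

As for querying, you just join the two tables and return the orders you are interested in.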
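
A minimal sketch of this two-table layout, using Python's built-in `sqlite3` module as a local stand-in for BigQuery. The SQL dialects differ, so treat it as an illustration of the join rather than as BigQuery syntax. The table is named `orders` here because `order` is a reserved word in SQL, and the `customer` and `ts` columns, the sample rows, and the helper function names are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_events (order_id INTEGER, ts TEXT, event TEXT);
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO order_events VALUES
        (1, '2015-01-01', 'created'),
        (1, '2015-01-02', 'completed'),
        (1, '2015-01-04', 'shipped'),
        (2, '2015-01-01', 'created'),
        (2, '2015-01-02', 'completed'),
        (2, '2015-01-03', 'declined'),
        (3, '2015-01-01', 'created');
""")


def orders_with_event(conn, event, start, end):
    """Orders that logged `event` between start and end (inclusive ISO dates)."""
    rows = conn.execute(
        """
        SELECT o.order_id, o.customer
        FROM orders AS o
        JOIN order_events AS e ON e.order_id = o.order_id
        WHERE e.event = ? AND e.ts BETWEEN ? AND ?
        ORDER BY o.order_id
        """,
        (event, start, end),
    )
    return rows.fetchall()


def completed_and_not_dropped(conn, start, end):
    """Orders completed in the range with no later declined/canceled event."""
    rows = conn.execute(
        """
        SELECT o.order_id, o.customer
        FROM orders AS o
        JOIN order_events AS c
          ON c.order_id = o.order_id AND c.event = 'completed'
        WHERE c.ts BETWEEN ? AND ?
          AND NOT EXISTS (
            SELECT 1 FROM order_events AS x
            WHERE x.order_id = o.order_id
              AND x.event IN ('declined', 'canceled')
              AND x.ts >= c.ts
          )
        ORDER BY o.order_id
        """,
        (start, end),
    )
    return rows.fetchall()


shipped = orders_with_event(conn, "shipped", "2015-01-01", "2015-01-31")
kept = completed_and_not_dropped(conn, "2015-01-01", "2015-01-31")
```

Because events are only ever appended, a cancellation is just one more row; nothing in `orders` has to be updated, which fits the append-only constraint raised in the question.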