How to get combinations of values from a single column?

Time: 2019-01-17 05:22:59

Tags: sql google-bigquery

I am trying to get the different possible combinations of values from a single column in BigQuery.

Suppose I have this table:

+---------------------------------------------+
|   date   |type  |payment |customer_no|status|
+---------------------------------------------+
|2019-01-02|Shirt |Cashless|        101|Cancel|
|2019-01-02|Jeans |Cashless|        133|OK    |
|2019-01-02|Jeans |Cash    |        102|OK    |
|2019-01-02|Cap   |Cash    |        144|OK    |
|2019-01-02|Shirt |Cash    |        132|OK    |
|2019-01-01|Jeans |Cash    |        111|Cancel|
|2019-01-01|Cap   |Cash    |        141|OK    |
|2019-01-01|Shirt |Cash    |        101|OK    |
|2019-01-01|Jeans |Cash    |        105|OK    |
+---------------------------------------------+
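  • I want the result to follow these rules:
    • Only rows with status = 'OK'
    • No duplicate combinations, e.g. Shirt,Jeans and Jeans,Shirt count as the same combination
    • Grouped by each payment and its combinations (Cash; Cashless; Cash and Cashless)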
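The "no duplicate combinations" rule means order must not matter. A minimal JavaScript sketch of that idea, assuming a hypothetical helper named `combinationKey` (it is not part of the question): sort the items before joining them, so both orderings produce the same key.

```javascript
// Hypothetical helper (not from the original post): builds an
// order-independent key by sorting a copy of the items before joining.
function combinationKey(items) {
  return items.slice().sort().join(',');
}

const keyA = combinationKey(['Shirt', 'Jeans']);
const keyB = combinationKey(['Jeans', 'Shirt']);

// Both orderings collapse to the same key.
console.log(keyA, keyB, keyA === keyB);
```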

Using this code:

#standardSQL
SELECT  date,
        type,
        COUNT(customer_no) as total_customer_per_order_type,
        payment
FROM `blabla.order`
WHERE status = 'OK'
GROUP BY date, type , payment
ORDER BY date DESC, payment ASC

I only get the total number of customers for each single type.
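Grouping by date, type and payment puts every type in its own group, which is why no combinations appear. A rough JavaScript sketch of the same grouping over the sample rows (the array layout and variable names are assumptions for illustration only):

```javascript
// Sample rows from the table above: [date, type, payment, customer_no, status].
const orders = [
  ['2019-01-02', 'Shirt', 'Cashless', 101, 'Cancel'],
  ['2019-01-02', 'Jeans', 'Cashless', 133, 'OK'],
  ['2019-01-02', 'Jeans', 'Cash', 102, 'OK'],
  ['2019-01-02', 'Cap', 'Cash', 144, 'OK'],
  ['2019-01-02', 'Shirt', 'Cash', 132, 'OK'],
  ['2019-01-01', 'Jeans', 'Cash', 111, 'Cancel'],
  ['2019-01-01', 'Cap', 'Cash', 141, 'OK'],
  ['2019-01-01', 'Shirt', 'Cash', 101, 'OK'],
  ['2019-01-01', 'Jeans', 'Cash', 105, 'OK'],
];

// Mimics WHERE status = 'OK' GROUP BY date, type, payment with a COUNT.
const counts = new Map();
for (const [date, type, payment, , status] of orders) {
  if (status !== 'OK') continue;
  const key = `${date}|${type}|${payment}`;
  counts.set(key, (counts.get(key) || 0) + 1);
}

// Every key names exactly one type, so no key describes a combination.
for (const [key, total] of counts) {
  console.log(key, total);
}
```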

How can I get a table like this:

http://imgur.com/7aECjpSl.png

1 Answer:

Answer 0: (score: 1)

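Below is an answer for BigQuery Standard SQL. It addresses only the exact question in the title of your post:

How to get combinations of values from a single column?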

#standardSQL
CREATE TEMP FUNCTION test(a ARRAY<INT64>) 
RETURNS ARRAY<STRING>
LANGUAGE js AS '''
  var combine = function(a) {
    var fn = function(n, src, got, all) {
      if (n == 0) {
        if (got.length > 0) {
          all[all.length] = got;
        } return;
      }
      for (var j = 0; j < src.length; j++) {
        fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
      } return;
    }
    var all = [];
    for (var i = 1; i < a.length; i++) {
      fn(i, a, [], all);
    }
    all.push(a);
    return all;
  } 
  return combine(a)
''';
WITH types AS (
  SELECT DISTINCT type, CAST(DENSE_RANK() OVER(ORDER BY type) AS STRING) type_num
  FROM `project.dataset.order`
  WHERE status = 'OK'
)
SELECT items, STRING_AGG(type ORDER BY type_num) types
FROM UNNEST(test(GENERATE_ARRAY(1,(SELECT COUNT(1) FROM types)))) AS items, 
UNNEST(SPLIT(items)) AS pos
JOIN types ON pos = type_num
GROUP BY items  
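To see what the JavaScript inside the UDF produces, the same recursive idea can be run on its own in Node.js. This is a sketch, not the answer's exact code: the names `combinations` and `pick` are illustrative, and the explicit `join(',')` stands in for the string conversion that happens when the UDF returns its nested arrays as ARRAY<STRING>.

```javascript
// Standalone sketch of the recursive combination logic.
// `combinations` and `pick` are illustrative names, not from the answer.
function combinations(values) {
  const all = [];

  // Choose `n` more elements from `src`, extending the partial pick `got`.
  function pick(n, src, got) {
    if (n === 0) {
      all.push(got);
      return;
    }
    for (let j = 0; j < src.length; j++) {
      // Recurse only on elements to the right of j,
      // so each subset is produced exactly once.
      pick(n - 1, src.slice(j + 1), got.concat([src[j]]));
    }
  }

  // Collect subsets of size 1, then 2, ... up to the full set.
  for (let size = 1; size <= values.length; size++) {
    pick(size, values, []);
  }
  return all;
}

// Joining each subset gives comma-separated strings like the `items` column.
const items = combinations([1, 2, 3]).map((c) => c.join(','));
console.log(items);
```

In the query, each of these strings is split again with SPLIT(items) and joined back to the `types` CTE on the rank number to recover the type names.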

You can test and play with the query above using the sample data from your question:

#standardSQL
CREATE TEMP FUNCTION test(a ARRAY<INT64>) 
RETURNS ARRAY<STRING>
LANGUAGE js AS '''
  var combine = function(a) {
    var fn = function(n, src, got, all) {
      if (n == 0) {
        if (got.length > 0) {
          all[all.length] = got;
        } return;
      }
      for (var j = 0; j < src.length; j++) {
        fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
      } return;
    }
    var all = [];
    for (var i = 1; i < a.length; i++) {
      fn(i, a, [], all);
    }
    all.push(a);
    return all;
  } 
  return combine(a)
''';
WITH `project.dataset.order` AS (
  SELECT '2019-01-02' dt, 'Shirt' type, 'Cashless' payment, 101 customer_no, 'Cancel' status UNION ALL
  SELECT '2019-01-02', 'Jeans', 'Cashless', 133, 'OK' UNION ALL
  SELECT '2019-01-02', 'Jeans', 'Cash', 102, 'OK' UNION ALL
  SELECT '2019-01-02', 'Cap', 'Cash', 144, 'OK' UNION ALL
  SELECT '2019-01-02', 'Shirt', 'Cash', 132, 'OK' UNION ALL
  SELECT '2019-01-01', 'Jeans', 'Cash', 111, 'Cancel' UNION ALL
  SELECT '2019-01-01', 'Cap', 'Cash', 141, 'OK' UNION ALL
  SELECT '2019-01-01', 'Shirt', 'Cash', 101, 'OK' UNION ALL
  SELECT '2019-01-01', 'Jeans', 'Cash', 105, 'OK' 
), types AS (
  SELECT DISTINCT type, CAST(DENSE_RANK() OVER(ORDER BY type) AS STRING) type_num
  FROM `project.dataset.order`
  WHERE status = 'OK'
)
SELECT items, STRING_AGG(type ORDER BY type_num) types
FROM UNNEST(test(GENERATE_ARRAY(1,(SELECT COUNT(1) FROM types)))) AS items, 
UNNEST(SPLIT(items)) AS pos
JOIN types ON pos = type_num
GROUP BY items

with the result:

Row items   types    
1   1       Cap  
2   2       Jeans    
3   3       Shirt    
4   1,2     Cap,Jeans    
5   1,3     Cap,Shirt    
6   2,3     Jeans,Shirt  
7   1,2,3   Cap,Jeans,Shirt
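The row count follows from the number of distinct types: n values have 2^n - 1 non-empty subsets, so 3 types give 7 rows. As a cross-check, here is a sketch that enumerates the same subsets with a bitmask loop, which is a different technique from the recursive UDF above. The `types` array is assumed from the sample data and `subsetsByBitmask` is a hypothetical name. Because it relies on 32-bit integer operations, it only suits short lists (fewer than about 31 values), and it emits the rows in a different order than the query.

```javascript
// Assumption: the distinct types with status 'OK' in the sample data, sorted by name.
const types = ['Cap', 'Jeans', 'Shirt'];

// Bitmask enumeration: bit i of `mask` decides whether values[i] is included.
// Starting at 1 skips the empty subset.
function subsetsByBitmask(values) {
  const result = [];
  for (let mask = 1; mask < (1 << values.length); mask++) {
    const subset = values.filter((_, i) => (mask & (1 << i)) !== 0);
    result.push(subset.join(','));
  }
  return result;
}

const rows = subsetsByBitmask(types);
console.log(rows.length);
console.log(rows);
```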