Idiomatic equivalent of a map structure

Time: 2017-06-21 08:54:48

Tags: google-bigquery

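My analysis needs to aggregate rows and store, for each aggregated row, the number of occurrences of each distinct value of the field someField.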

Sample data structure: [someField, someKey]

I am trying to GROUP BY someKey and then be able to tell, for each resulting row, how many times each someField value occurred.

Example:

[someField: a, someKey: 1],
[someField: a, someKey: 1],
[someField: b, someKey: 1],
[someField: c, someKey: 2],
[someField: d, someKey: 2]

What I want to achieve:

[someKey: 1, fields: {a: 2, b: 1}],
[someKey: 2, fields: {c: 1, d: 1}],

3 answers:

Answer 0: (score: 2)

Does this work for you?

WITH data AS (
  SELECT 'a' someField, 1 someKey UNION ALL
  SELECT 'a', 1 UNION ALL
  SELECT 'b', 1 UNION ALL
  SELECT 'c', 2 UNION ALL
  SELECT 'd', 2)

SELECT
  someKey,
  ARRAY_AGG(STRUCT(someField, freq)) fields
FROM(
  SELECT
    someField,
    someKey,
    COUNT(someField) freq
  FROM data
  GROUP BY 1, 2
)
GROUP BY 1

Result: [image of the query output]

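It does not give exactly the result you are looking for, but it may be able to serve the same queries your desired result would. As you described, for each key you can retrieve how many times (the freq column) each someField value occurred.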

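I looked for a way to aggregate into a single struct and could not find one. Retrieving the results as an ARRAY of STRUCTs, however, turns out to be quite straightforward.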
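As a rough sketch of how the ARRAY of STRUCTs could then be used like a map, the query below looks up the count for one field value per key with a scalar subquery over UNNEST. The CTE name aggregated, the alias f, and the output column freq_a are made up for illustration, and the STRUCT fields are aliased explicitly so their names do not depend on inference.

```sql
#standardSQL
WITH data AS (
  SELECT 'a' AS someField, 1 AS someKey UNION ALL
  SELECT 'a', 1 UNION ALL
  SELECT 'b', 1 UNION ALL
  SELECT 'c', 2 UNION ALL
  SELECT 'd', 2
),
-- Same aggregation as above: one row per someKey holding (someField, freq) pairs.
aggregated AS (
  SELECT
    someKey,
    ARRAY_AGG(STRUCT(someField AS someField, freq AS freq)) AS fields
  FROM (
    SELECT
      someField,
      someKey,
      COUNT(someField) AS freq
    FROM data
    GROUP BY someField, someKey
  )
  GROUP BY someKey
)
SELECT
  someKey,
  -- "Map lookup": at most one element matches, because (someKey, someField)
  -- pairs are unique after the inner GROUP BY. Yields NULL when absent.
  (SELECT f.freq FROM UNNEST(fields) AS f WHERE f.someField = 'a') AS freq_a
FROM aggregated
ORDER BY someKey
```

With the sample data, freq_a should be 2 for someKey 1 and NULL for someKey 2, since key 2 has no 'a' entry.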

Answer 1: (score: 1)

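There may be a smarter way (and one that gets it into the format you want, e.g. using an array for the second column), but this might be enough for you: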

with sample as (
select 'a' as someField, 1 as someKey UNION all
select 'a' as someField, 1 as someKey UNION ALL
select 'b' as someField, 1 as someKey UNION ALL
select 'c' as someField, 2 as someKey UNION ALL
select 'd' as someField, 2 as someKey)

SELECT
  someKey,
  SUM(IF(someField = 'a', 1, 0)) AS a,
  SUM(IF(someField = 'b', 1, 0)) AS b,
  SUM(IF(someField = 'c', 1, 0)) AS c,
  SUM(IF(someField = 'd', 1, 0)) AS d
FROM
  sample
GROUP BY
  someKey
ORDER BY
  someKey ASC

Result:

someKey a   b   c   d
---------------------    
  1     2   1   0   0    
  2     0   0   1   1

This is a widely used technique in BigQuery (see here).
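For what it is worth, the same conditional aggregation can probably be written a little more compactly in standard SQL with COUNTIF, which counts the rows where a condition is true. This is only a sketch of an alternative spelling, not a different technique, and it still requires the field values to be known in advance:

```sql
#standardSQL
WITH sample AS (
  SELECT 'a' AS someField, 1 AS someKey UNION ALL
  SELECT 'a', 1 UNION ALL
  SELECT 'b', 1 UNION ALL
  SELECT 'c', 2 UNION ALL
  SELECT 'd', 2
)
SELECT
  someKey,
  -- One output column per known field value.
  COUNTIF(someField = 'a') AS a,
  COUNTIF(someField = 'b') AS b,
  COUNTIF(someField = 'c') AS c,
  COUNTIF(someField = 'd') AS d
FROM sample
GROUP BY someKey
ORDER BY someKey
```

On the sample rows this should produce the same table as the SUM(IF(...)) version.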

Answer 2: (score: 0)

  

I am trying to GROUP BY someKey and then be able to know, for each of the results, how many times each someField value occurred

  
#standardSQL
SELECT
  someKey,
  someField,
  COUNT(someField) freq
FROM yourTable
GROUP BY 1, 2
-- ORDER BY someKey, someField  
  

What I want to achieve:
     [someKey: 1, fields: {a: 2, b: 1}],
     [someKey: 2, fields: {c: 1, d: 1}],

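This is different from what you expressed in words. It is called pivoting, and based on your comment ("The a, b, c, and d keys are potentially infinite") it is most likely not what you need. That said, pivoting is also easy if you have a finite number of field values, and you can find plenty of related posts.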