ClickHouse: GROUP BY on nested columns

Time: 2019-05-27 10:56:55

Tags: grouping clickhouse

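How can I do a GROUP BY on a nested column?

I have a nested column with items.productName and items.amount.

I want to get the sum of amount grouped by each value of productName.

I was able to do this with ARRAY JOIN: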

SELECT items.productName as name, sum(items.amount) as amt from test 
array join items
group by items.productName

But ARRAY JOIN is slow, so we cannot use it.

So I tried sumForEach(), but I am not sure how to group the result by each individual productName:

SELECT items.productName as name, sumForEach(items.amount) as amt from test
group by name

Is it possible to achieve this without ARRAY JOIN?

Thanks.

1 answer:

Answer 0 (score: 0)

You can also use the sumMap function:

select result.1 as name, result.2 as amt
from (
      select sumMap(items.productName, items.amount) sum_per_keys,
            arrayJoin(arrayZip(sum_per_keys.1, sum_per_keys.2)) result
      from nested_columns_test)
order by name;
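
To see why sumMap fits here and sumForEach does not, it can help to run both on a tiny inline data set. This is only an illustrative sketch: the two rows and the product names 'apple' and 'pear' are made up and are not part of the test table. sumMap adds values up per key, while sumForEach adds values up per array position.

```sql
-- Two hypothetical rows, each holding parallel arrays like a Nested column.
SELECT
    sumMap(productName, amount) AS sum_per_keys,    -- sums by key
    sumForEach(amount)          AS sum_per_position -- sums by array index
FROM
(
    SELECT ['apple', 'pear'] AS productName, [1, 2] AS amount
    UNION ALL
    SELECT ['pear'] AS productName, [10] AS amount
);

-- Expected result:
--   sum_per_keys     = (['apple','pear'],[1,12])
--   sum_per_position = [11,2]
```

sumMap returns a tuple of two arrays (sorted keys and their sums), which is why the query above zips them together and unfolds them with arrayJoin to get one row per product.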

The sumMap and array join queries were tested on CH 20.3.8.53.


Preparing the test environment:

create table nested_columns_test(
  id Int32,
  items Nested(productName String, amount Int32)
) Engine = MergeTree()
order by (id);

insert into nested_columns_test
select number as id,
      arrayMap(x -> concat('product_', toString(x)), range(number % 32)) as `items.productName`, 
      arrayMap(x -> number + x, range(number % 32)) as `items.amount` 
from numbers(100*1000*1000);

SELECT items.productName as name, sum(items.amount) as amt 
from nested_columns_test 
array join items
group by items.productName
order by name;
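
To compare the two queries on your own server, one option is to read their timings back from the query log. This is a sketch under assumptions: it requires query logging to be enabled (the log_queries setting) and both queries to have been run recently, and the LIKE filter is only an illustrative way to find them.

```sql
-- Flush buffered log entries so the most recent queries are visible.
SYSTEM FLUSH LOGS;

-- List recently finished queries that touched the test table.
SELECT
    query_duration_ms,
    read_rows,
    formatReadableSize(memory_usage) AS memory,
    substring(query, 1, 60) AS query_start
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query LIKE '%nested_columns_test%'
ORDER BY event_time DESC
LIMIT 10;
```

Running each query a few times before reading the log gives a fairer picture, since the first run may include disk reads that later runs serve from cache.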