How to sum arrays together with other data?

Time: 2019-05-04 08:25:25

Tags: sql clickhouse

How can I get the same result with simpler SQL?

I have two tables like this.

create table t1_before
(
  k1 String,
  ts DateTime,
  span  Int32,
  iserror  Int32
)
ENGINE = MergeTree()
ORDER BY (k1, ts)
;
insert into t1_before values('key1','2019-05-04 10:00:00',1,0);
insert into t1_before values('key1','2019-05-04 10:00:00',1,0);
insert into t1_before values('key1','2019-05-04 10:00:00',1,1);
insert into t1_before values('key1','2019-05-04 10:00:00',2,0);
insert into t1_before values('key1','2019-05-04 10:00:00',2,0);
insert into t1_before values('key1','2019-05-04 10:00:00',2,1);
insert into t1_before values('key1','2019-05-04 10:00:00',2,1);
insert into t1_before values('key1','2019-05-04 10:00:00',2,1);
create table t1
(
  k1 String,  
  ts DateTime, 
  totalspan  Int32,  
  maxspan  Int32, 
  totalcount  Int32,   
  errorcount Int32, 
  goal Nested    
    (
        m UInt32,  
        n UInt32
)
)
ENGINE = MergeTree()
ORDER BY (k1, ts)
;

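Table t1 is a summary of t1_before: goal.m is the span and goal.n is the count of rows with that span. The data in t1_before is rolled up before it is written to t1, like this: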

insert into t1 values('key1','2019-05-04 10:00:00', 13, 2, 7, 2, [1,2],[3,5]);

t1_before has too many rows, so in practice I only have table t1.

Suppose the data is:

insert into t1 values('key1','2019-05-04 10:00:00', 13, 2, 7, 2, [1,2],[3,5]);
insert into t1 values('key1','2019-05-04 10:00:20', 25, 4, 8, 3, [1,2,4],[1,2,5]);
insert into t1 values('key1','2019-05-04 11:02:30', 13, 2, 8, 1, [1,2],[3,5]);
insert into t1 values('key2','2019-05-04 10:00:00', 13, 2, 8, 3, [1,2],[3,5]);
insert into t1 values('key2','2019-05-04 10:02:00', 13, 2, 8, 0, [1,2],[3,5]);

I know how to get the result, but the query is complicated.

SELECT 
    d1.k1, d1.ts2, d1.a1, 
    d2.sumtotalspan, d2.maxtotalspan, d2.sumtotalcount, d2.sumerrorcount
FROM 
(
    SELECT 
        k1, ts2, quantilesExactWeighted(0.5, 0.9, 0.99)(m1, n1) AS a1
    FROM 
    (
        SELECT 
            k1, 
            toStartOfHour(ts) AS ts2, 
            goal.m AS m1, 
            sum(goal.n) AS n1
        FROM t1 
        ARRAY JOIN goal
        GROUP BY  k1, toStartOfHour(ts), goal.m
    ) 
    GROUP BY k1, ts2
) AS d1 
INNER JOIN 
(
    SELECT 
        k1, 
        toStartOfHour(ts) AS ts2, 
        sum(totalspan) AS sumtotalspan, 
        max(totalspan) AS maxtotalspan, 
        sum(totalcount) AS sumtotalcount, 
        sum(errorcount) AS sumerrorcount
    FROM t1 
    GROUP BY k1, toStartOfHour(ts)
) AS d2 ON (d1.k1 = d2.k1) AND (d1.ts2 = d2.ts2)

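┌─k1───┬─ts2─────────────────┬─a1──────┬─sumtotalspan─┬─maxtotalspan─┬─sumtotalcount─┬─sumerrorcount─┐
│ key1 │ 2019-05-04 10:00:00 │ [2,4,4] │           38 │           25 │            15 │             5 │
│ key2 │ 2019-05-04 10:00:00 │ [2,2,2] │           26 │           13 │            16 │             3 │
│ key1 │ 2019-05-04 11:00:00 │ [2,2,2] │           13 │           13 │             8 │             1 │
└──────┴─────────────────────┴─────────┴──────────────┴──────────────┴───────────────┴───────────────┘

3 rows in set.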

Is there a simpler SQL query (without the join) that gives the same result? Something like this, except that it fails with an error:

SELECT 
    k1, 
    toStartOfHour(ts) AS ts2, 
    sum(totalspan) AS sumtotalspan, 
    max(totalspan) AS maxtotalspan, 
    sum(totalcount) AS sumtotalcount, 
    sum(errorcount) AS sumerrorcount,
    quantilesExactWeighted(0.5, 0.9, 0.99)(sumMap(goal.m, goal.n))
FROM t1 
GROUP BY k1, toStartOfHour(ts)

2 answers:

Answer 0: (score: 0)

You can try something like this with goal.m / goal.n and arrayReduce:

SELECT arrayReduce('sumMap', [[1, 2, 3, 3]], [[4, 5, 6, 7]])
FORMAT TSV

([1,2,3],[4,5,13])
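
One possible way to carry this idea into the grouped query is sketched below. It has not been run here and rests on two assumptions: that arrayReduce accepts a parametric aggregate function name with two array arguments, and that the alias `merged` (a name introduced only for this illustration) can be reused inside the same SELECT. The failing query in the question passes a single tuple to quantilesExactWeighted, which expects a value and a weight as two separate arguments, so the tuple returned by sumMap is unpacked first.

```sql
-- Sketch only: single pass over t1, no join.
SELECT
    k1,
    toStartOfHour(ts) AS ts2,
    -- sumMap merges the goal.m / goal.n arrays of all rows in the group
    -- into one tuple (distinct spans, summed counts).
    sumMap(goal.m, goal.n) AS merged,
    -- Assumption: arrayReduce takes the parametric function name as a string
    -- and the two arrays as (value, weight).
    arrayReduce(
        'quantilesExactWeighted(0.5, 0.9, 0.99)',
        tupleElement(merged, 1),
        tupleElement(merged, 2)
    ) AS a1,
    sum(totalspan)  AS sumtotalspan,
    max(totalspan)  AS maxtotalspan,
    sum(totalcount) AS sumtotalcount,
    sum(errorcount) AS sumerrorcount
FROM t1
GROUP BY k1, ts2
ORDER BY k1, ts2
```

The extra `merged` column appears in the output; wrap the query in an outer SELECT if it should be hidden.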

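Answer 1: (score: 0)

ClickHouse is great: its functions can be combined. In "Aggregate function combinators" (https://clickhouse.yandex/docs/en/query_language/agg_functions/combinators/) I saw the "-Array" combinator, and found that this function seems to give the same result: quantilesExactWeightedArray(0.5, 0.9, 0.99)(goal.m, goal.n)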
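
For completeness, here is a minimal sketch of how the -Array combinator might slot into the single-pass query from the question. It assumes the t1 schema shown above and that goal.m and goal.n have equal length in every row, which a Nested column requires. The output aliases are copied from the original join query.

```sql
-- Sketch only: the -Array combinator feeds each (goal.m[i], goal.n[i]) pair
-- to quantilesExactWeighted as (value, weight), so neither ARRAY JOIN
-- nor a self-join is needed.
SELECT
    k1,
    toStartOfHour(ts) AS ts2,
    quantilesExactWeightedArray(0.5, 0.9, 0.99)(goal.m, goal.n) AS a1,
    sum(totalspan)  AS sumtotalspan,
    max(totalspan)  AS maxtotalspan,
    sum(totalcount) AS sumtotalcount,
    sum(errorcount) AS sumerrorcount
FROM t1
GROUP BY k1, ts2
ORDER BY k1, ts2
```

The weights of a repeated span are added up across rows in the same group, which is what the inner `sum(goal.n) ... GROUP BY goal.m` subquery in the question did explicitly, so the result should match the join version on the sample data.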