A table consists of three columns (time, key, value). The task is to compute a running difference for each key. So, from the input
----------------------
| time | key | value |
----------------------
| 1    | A   | 4     |
| 2    | B   | 1     |
| 3    | A   | 6     |
| 4    | A   | 7     |
| 5    | B   | 3     |
| 6    | B   | 7     |
we would like to get
-----------------------
| key | value | delta |
-----------------------
| A   | 4     | 0     |
| B   | 1     | 0     |
| A   | 6     | 2     |
| A   | 7     | 1     |
| B   | 3     | 2     |
| B   | 7     | 4     |
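For reference, the expected output above can be reproduced outside the database with a few lines of Python. This is only a sketch of the intended semantics, not ClickHouse code; the function name `running_delta_per_key` and the tuple layout are assumptions made for illustration.

```python
def running_delta_per_key(rows):
    """rows: iterable of (time, key, value). Returns (key, value, delta) in time order."""
    last_seen = {}  # most recent value seen for each key
    result = []
    for _time, key, value in sorted(rows, key=lambda r: r[0]):
        # The first row of a key has no predecessor, so its delta is 0.
        delta = value - last_seen.get(key, value)
        last_seen[key] = value
        result.append((key, value, delta))
    return result


rows = [(1, "A", 4), (2, "B", 1), (3, "A", 6), (4, "A", 7), (5, "B", 3), (6, "B", 7)]
for row in running_delta_per_key(rows):
    print(row)
```

The question is how to get the same per-key behaviour inside a single ClickHouse query.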
There is the runningDifference function. It works if the key is fixed, so we can write
select *, runningDifference(value) from
(SELECT key, value from table where key = 'A' order by time)
Note that the subquery is required here. The difficulty arises when you want this for different keys. There is groupArray:
select key, groupArray(value) from
(SELECT key, value from table order by time)
group by key
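So now we get a key and an array of the elements with that key. Good.
But how do we compute the sliding difference? If we could do that, then ARRAY JOIN would lead us to the result.
Or we could even zip the array with itself and then apply a lambda (we have arrayMap for that), but... we don't have any zip alternative.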
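To make the zip idea concrete, here is what it would look like in Python, where `zip` exists. It is a sketch of the idea only, not something ClickHouse runs; `adjacent_diffs` is a hypothetical helper name.

```python
def adjacent_diffs(values):
    """Difference between each element and its predecessor; the first delta is 0."""
    if not values:
        return []
    # Pair every element with the one before it by zipping the list
    # with a copy of itself shifted by one position.
    pairs = zip(values[1:], values[:-1])
    return [0] + [current - previous for current, previous in pairs]


print(adjacent_diffs([4, 6, 7]))  # values of key A in time order
print(adjacent_diffs([1, 3, 7]))  # values of key B in time order
```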
Any ideas? Thanks in advance.
Answer 0 (score: 4)
Solution with arrays:
WITH
groupArray(value) as time_sorted_vals,
arrayEnumerate(time_sorted_vals) as indexes,
-- the first element has no predecessor, so its delta is set to 0
arrayMap( i -> if(i = 1, 0, time_sorted_vals[i] - time_sorted_vals[i-1]), indexes) as running_diffs
SELECT
key,
running_diffs
FROM
(SELECT key, value from table order by time)
GROUP by key
Another option (sorting inside each group separately, which is often the better choice):
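WITH
groupArray( tuple(value,time) ) as val_time_tuples,
-- sort the (value, time) tuples of each group by time
arraySort( x -> x.2, val_time_tuples ) as val_time_tuples_sorted,
-- keep only the values, now in time order
arrayMap( t -> t.1, val_time_tuples_sorted) as time_sorted_vals,
arrayEnumerate(time_sorted_vals) as indexes,
-- the first element has no predecessor, so its delta is set to 0
arrayMap( i -> if(i = 1, 0, time_sorted_vals[i] - time_sorted_vals[i-1]), indexes) as running_diffs
SELECT
key,
running_diffs
FROM
table
GROUP by key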
Afterwards, you can apply ARRAY JOIN to the result.
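The overall shape of the array approach (collect per key, sort each group by time, take differences, then flatten the arrays the way ARRAY JOIN would) can be modelled in Python as below. This is an illustrative sketch with hypothetical names (`grouped_running_diffs`, `flatten`), not a translation of the query that the answer's author provided.

```python
from collections import defaultdict


def grouped_running_diffs(rows):
    """rows: iterable of (time, key, value). Returns {key: (values, diffs)}, values sorted by time."""
    groups = defaultdict(list)
    for time, key, value in rows:
        groups[key].append((value, time))  # plays the role of groupArray(tuple(value, time))
    result = {}
    for key, pairs in groups.items():
        pairs.sort(key=lambda p: p[1])  # sort inside the group by time
        values = [v for v, _ in pairs]
        # The first element of each group gets delta 0.
        diffs = [0 if i == 0 else values[i] - values[i - 1] for i in range(len(values))]
        result[key] = (values, diffs)
    return result


def flatten(grouped):
    """One (key, value, delta) row per array element, roughly what ARRAY JOIN would produce."""
    return [(key, v, d) for key, (values, diffs) in grouped.items() for v, d in zip(values, diffs)]


# Input deliberately not in time order: the per-group sort takes care of it.
rows = [(6, "B", 7), (1, "A", 4), (3, "A", 6), (2, "B", 1), (5, "B", 3), (4, "A", 7)]
print(flatten(grouped_running_diffs(rows)))
```

Note that the flattened rows come out grouped by key rather than in global time order; if the original ordering matters, the time would have to be carried through alongside the value.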
Answer 1 (score: 0)
I ran into this problem recently as well. ClickHouse provides the arrayDifference function.
WITH
groupArray(value) as vals,
arrayDifference(vals) as running_diffs
SELECT
key,
running_diffs
FROM
(SELECT key, value from table order by time)
GROUP by key