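I have a time series dataset with 4 columns: a, b, c and timestamp. I want to bin all three value columns together to get an occurrence (frequency) table. Here is some sample data: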
a | b | c | timestamp
0.2 | 1.2 | 0.1 | 2018-01-01 00:00:00
0.3 | 2.2 | 0.2 | 2018-01-01 00:00:01
0.4 | 3.2 | 0.3 | 2018-01-01 00:00:02
0.2 | 0.2 | 0.4 | 2018-01-01 00:00:03
0.6 | 1.3 | 0.5 | 2018-01-01 00:00:04
0.7 | 2.4 | 0.1 | 2018-01-01 00:00:05
1.2 | 0.5 | 0.2 | 2018-01-01 00:00:06
3.2 | 1.7 | 0.3 | 2018-01-01 00:00:07
2.2 | 2.5 | 0.4 | 2018-01-01 00:00:08
1.3 | 3.7 | 0.5 | 2018-01-01 00:00:09
1.4 | 0.8 | 0.6 | 2018-01-01 00:00:10
etc.
Now I want to create bins like this:
a >= 0. and a < 0.25 and b >= 0. and b < 0.25 and c >= 0. and c < 0.25
a >= 0. and a < 0.25 and b >= 0. and b < 0.25 and c >= 0.25 and c < 0.5
a >= 0. and a < 0.25 and b >= 0. and b < 0.25 and c >= 0.5 and c < 0.75
a >= 0. and a < 0.25 and b >= 0.25 and b < 0.5 and c >= 0. and c < 0.25
a >= 0. and a < 0.25 and b >= 0.25 and b < 0.5 and c >= 0.25 and c < 0.5
etc.
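That way I would get an occurrence table in which the bins are applied to every dimension, i.e. one count per combination of an a-bin, a b-bin and a c-bin.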
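To make the intended result concrete, here is a minimal Python sketch of the three-dimensional count. It is only an illustration of the target output, not the SQL being asked for. The helper `width_bucket` is a hypothetical stand-in written to mirror the usual semantics of the SQL function of the same name (1-based index, 0 below the range, count + 1 at or above the upper bound), and the range 0 to 4 with 16 buckets is assumed for all three columns.

```python
from collections import Counter


def width_bucket(value, low, high, count):
    """Hypothetical stand-in for SQL width_bucket.

    Returns the 1-based index of the equal-width bucket holding `value`,
    0 if value < low, and count + 1 if value >= high.
    """
    if value < low:
        return 0
    if value >= high:
        return count + 1
    return int((value - low) / (high - low) * count) + 1


# The (a, b, c) values from the sample rows above; timestamps are not needed.
rows = [
    (0.2, 1.2, 0.1), (0.3, 2.2, 0.2), (0.4, 3.2, 0.3), (0.2, 0.2, 0.4),
    (0.6, 1.3, 0.5), (0.7, 2.4, 0.1), (1.2, 0.5, 0.2), (3.2, 1.7, 0.3),
    (2.2, 2.5, 0.4), (1.3, 3.7, 0.5), (1.4, 0.8, 0.6),
]

# One counter key per (bucket_a, bucket_b, bucket_c) combination.
counts = Counter(
    tuple(width_bucket(v, 0.0, 4.0, 16) for v in row) for row in rows
)

for (bucket_a, bucket_b, bucket_c), cnt in sorted(counts.items()):
    print(bucket_a, bucket_b, bucket_c, cnt)
```

Only combinations that actually occur show up in the result. Empty cells of the 16 x 16 x 16 grid are simply absent, which is also what a plain `group by` would give.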
I have the following SQL, which works for a single column:
select
    bucket,
    0.+(bucket-1)*0.25 as low_a,
    0.+bucket*0.25 as high_a,
    count(1) as cnt
from
(
    select
        width_bucket(a, 0., 4., 16) as bucket
    from
        table
)
group by bucket order by bucket;
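For reference, the `low_a` and `high_a` expressions in the query convert a bucket number back into the half-open interval it covers. A small Python sketch of that arithmetic follows, assuming the same range (0 to 4) and bucket count (16) as the query; `bucket_bounds` is a name made up for this illustration.

```python
def bucket_bounds(bucket, low=0.0, high=4.0, count=16):
    """Return (lower, upper) for a 1-based bucket index; upper is exclusive."""
    width = (high - low) / count  # 0.25 for the values used in the query
    return low + (bucket - 1) * width, low + bucket * width


# First, second and last regular bucket of the range.
bounds = [bucket_bounds(b) for b in (1, 2, 16)]
print(bounds)
```

The same conversion would apply per column once each of a, b and c has its own bucket number.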
How can I apply the buckets to multiple columns at once?