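How do I query time slices from data when the time slice stored in the database is larger than the time slice I need? The final result will be used to draw a stacked bar chart.
Sample data: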
START_TS (int)| END_TS (int) | DATA (int) | GROUP
-----------------------------------
0 | 179 | 2000 | G1
180 | 499 | 1000 | G2
500 | 699 | 1000 | G1
845 ...
Wanted output, using a time slice of 100 "units". END_TS is not needed in the output, but it helps in understanding the calculation.
START_TS | END_TS | DATA (equation = amount in that time slice) | GROUP
-------------------------------------------------------
0 | 99 | (2000 / 180) * 100 = 1111 | G1
100 | 199 | (2000 / 180) * 80 = 889 | G1
100 | 199 | (1000 / 320) * 20 = 63 | G2
200 | 299 | (1000 / 320) * 100 = 313 | G2
300 | 399 | (1000 / 320) * 100 = 313 | G2
400 | 499 | (1000 / 320) * 100 = 313 | G2
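The arithmetic in the wanted output can be sketched in a few lines of Python. This is only an illustration of the proportional split, not part of the original post; `split_interval` and its `width` parameter are hypothetical names, and timestamps are assumed to be inclusive integers as in the sample data.

```python
def split_interval(start_ts, end_ts, data, width=100):
    """Split one row into per-slice shares, proportional to the overlap length.

    Assumes inclusive integer timestamps, so 0..179 covers 180 units.
    """
    duration = end_ts - start_ts + 1
    first_bin = (start_ts // width) * width   # slice that contains start_ts
    parts = []
    for bin_start in range(first_bin, end_ts + 1, width):
        bin_end = bin_start + width - 1
        overlap_start = max(start_ts, bin_start)
        overlap_end = min(end_ts, bin_end)
        share = data * (overlap_end - overlap_start + 1) / duration
        parts.append((bin_start, bin_end, share))
    return parts


# The first two sample rows; shares are printed unrounded.
for start_ts, end_ts, data, group in [(0, 179, 2000, "G1"), (180, 499, 1000, "G2")]:
    for bin_start, bin_end, share in split_interval(start_ts, end_ts, data):
        print(bin_start, bin_end, share, group)
```

The shares are left unrounded on purpose: Python's built-in `round` rounds halves to the nearest even integer, so a share of exactly 62.5 would not come out as the 63 shown in the table.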
Getting the time series out of the table goes like this:
SELECT (startts/100)*100, ...
FROM TABLE
FULL JOIN
( SELECT startts from generate_series(0,700,100) startts ) s1
USING (startts)
GROUP BY startts/100
So the result would look like this (without the grouping):
STARTTS | ENDTS | DATA | GROUP
0 | 179 | 2000 | G1
100 |
180 | 499 | 1000 | G2
200 |
300 |
400 |
500 | 699 | 1000 | G1
600 |
700
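But how do I split DATA across two or more of the generated rows (the time-slice rows), in proportion to the time each slice covers?
This basically works, but it does not really hold up on large data sets, meaning tables of 1-100M rows.
Here is the query that does this, plus a bit more to aggregate the values that do not span a time-slice boundary: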
SELECT (start_ts/100)*100 as start_ts, sum(part) as data, cgroup
FROM (
SELECT *, ( data * (overlap_end-overlap_start + 1 ) / ( end_ts - tts + 1 ) ) as part
FROM
(
SELECT (case when s1.start_ts > t.start_ts then s1.start_ts else t.start_ts end) as overlap_start,
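-- <= rather than <: a row that ends exactly at the next slice's start must still be clipped to this slice
(case when s1.start_ts+100 <= t.end_ts then s1.start_ts+100-1 else t.end_ts end) as overlap_end,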
t.start_ts as tts, s1.start_ts as start_ts, t.end_ts, cgroup, data
FROM (SELECT start_ts from generate_series(0,800,100) start_ts ) s1
LEFT OUTER JOIN test t on t.start_ts < s1.start_ts+100 and t.end_ts >= s1.start_ts
) t
) t2
GROUP BY start_ts/100, cgroup
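Answer 0: (score: 1)
What you need is to split the different time slots into the bins defined by the series. The following query does this by changing the join condition and calculating the overlap between each row and each bin: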
SELECT (startts/100)*100, ...
FROM (SELECT (case when s1.startts > t.start_ts then s1.startts else t.start_ts end) as overlap_start,
-- <= so that a row ending exactly at the next bin's start is clipped to this bin
(case when s1.startts+100 <= t.end_ts then s1.startts+100-1 else t.end_ts end) as overlap_end,
s1.startts, t.*
FROM (SELECT startts from generate_series(0,700,100) startts ) s1 left outer join
TABLE t
on t.start_ts < s1.startts+100 and
t.end_ts >= s1.startts
) t
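To make the join condition and the clipping easier to check by hand, here is a small Python sketch of the same logic for a single bin and a single row. `clip_to_bin` is a hypothetical helper written for this illustration, with the bin width assumed to be 100 as in the query.

```python
def clip_to_bin(bin_start, start_ts, end_ts, width=100):
    """Return the inclusive overlap of [start_ts, end_ts] with one bin, or None."""
    bin_end = bin_start + width - 1
    # Same test as the join condition: the row starts before the bin ends
    # and ends at or after the bin starts.
    if not (start_ts < bin_start + width and end_ts >= bin_start):
        return None
    # The clipping that the two CASE expressions perform, written with max/min.
    return max(start_ts, bin_start), min(end_ts, bin_end)


print(clip_to_bin(100, 0, 179))    # row G1 against bin 100..199
print(clip_to_bin(100, 180, 499))  # row G2 against bin 100..199
print(clip_to_bin(200, 0, 179))    # no overlap
```

A row such as 0..100 is a useful edge case to try: against bin 0 it should be clipped to 0..99, and the remaining single unit belongs to bin 100.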
Answer 1: (score: 0)
SQL Fiddle. For clarity, it shows all of the calculated columns at each step.
with data_avg as (
select start_ts, end_ts, "data" * 1.0 / ((end_ts + 1) - start_ts) data_avg
from test
), gs as (
select start_ts, start_ts + 99 end_ts
from generate_series(
(select min(start_ts) from test),
(select max(end_ts) from test),
100
) gs(start_ts)
)
select
t_start, t_end,
gs_start, gs_end,
cgroup,
s."start", s."end",
da.start_ts da_start, da.end_ts da_end
,round((s."end" - s."start" + 1) * da.data_avg) "data"
from (
select
t.start_ts t_start, t.end_ts t_end,
gs.start_ts gs_start, gs.end_ts gs_end,
cgroup,
greatest(t.start_ts, gs.start_ts) "start", least(t.end_ts, gs.end_ts) "end"
from
test t
inner join
gs on
-- standard interval-overlap test; it also matches rows that lie entirely inside one slice
t.start_ts <= gs.end_ts
and
t.end_ts >= gs.start_ts
) s
inner join
data_avg da on
da.start_ts between t_start and t_end
and
da.end_ts between t_start and t_end
order by s."start"
Result:
t_start | t_end | gs_start | gs_end | cgroup | start | end | da_start | da_end | data
---------+-------+----------+--------+--------+-------+-----+----------+--------+------
0 | 179 | 0 | 99 | G1 | 0 | 99 | 0 | 179 | 1111
0 | 179 | 100 | 199 | G1 | 100 | 179 | 0 | 179 | 889
180 | 499 | 100 | 199 | G2 | 180 | 199 | 180 | 499 | 63
180 | 499 | 200 | 299 | G2 | 200 | 299 | 180 | 499 | 313
180 | 499 | 300 | 399 | G2 | 300 | 399 | 180 | 499 | 313
180 | 499 | 400 | 499 | G2 | 400 | 499 | 180 | 499 | 313
500 | 699 | 500 | 599 | G1 | 500 | 599 | 500 | 699 | 500
500 | 699 | 600 | 699 | G1 | 600 | 699 | 500 | 699 | 500
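Since the goal is a stacked bar chart, the per-slice rows still have to be summed per slice and group and laid out as one series per group. A minimal Python sketch of that last step follows, using the `gs_start`, `cgroup` and `data` values from the result above; `stack_series` is a hypothetical helper and no plotting library is assumed.

```python
from collections import defaultdict

# (slice start, group, data) copied from the result rows above
rows = [
    (0, "G1", 1111), (100, "G1", 889), (100, "G2", 63),
    (200, "G2", 313), (300, "G2", 313), (400, "G2", 313),
    (500, "G1", 500), (600, "G1", 500),
]


def stack_series(rows, width=100):
    """Sum data per (slice, group) and return the slices plus one series per group."""
    totals = defaultdict(int)
    for start, group, data in rows:
        totals[((start // width) * width, group)] += data
    bins = sorted({b for b, _ in totals})
    groups = sorted({g for _, g in totals})
    # A group with no data in a slice contributes 0 to that bar.
    series = {g: [totals.get((b, g), 0) for b in bins] for g in groups}
    return bins, series


bins, series = stack_series(rows)
print(bins)
for group, values in series.items():
    print(group, values)
```

Only slices that actually contain data appear in `bins`; if empty slices should show up as gaps in the chart, the list of slices would have to be generated separately, as the `generate_series` calls do in the queries.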