I have a table named consecutive in PostgreSQL.
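Each se_id has an idx from 0 to 100; in this example it runs from 0 to 9.
The search pattern:
SELECT *
FROM consecutive
WHERE val_3_bool = 1
AND val_1_dur > 4100 AND val_1_dur < 5900
Now I am looking for the longest run of consecutive rows matching this pattern for each p_id, and the {{1}}.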
Can this be computed in pure SQL?
Answer 0: (score: 2)
One approach is to use the difference of row numbers to identify each run of consecutive values:
-- Column names as spelled in this answer; the question writes p_id, val_1_dur, val_3_bool.
select pid, count(*) as in_a_row, sum(val1_dur) as dur
from (select t.*,
             row_number() over (partition by pid order by idx) as seqnum,
             row_number() over (partition by pid, val3_bool order by idx) as seqnum_d
      from consecutive t
     ) t
group by (seqnum - seqnum_d), pid, val3_bool;
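If you are only looking for the "1" values, add
where val3_bool = 1
to the outer query. To understand why this works, I suggest studying the results of the subquery; they show why the difference between the two row numbers is constant within a run of consecutive values.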
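To make that inspection concrete, here is a minimal sketch that runs the same two row_number() calls on a handful of made-up rows and prints their difference. It uses Python's built-in sqlite3 module only so that it can be run standalone (SQLite 3.25 or later is assumed for window functions); the sample rows are hypothetical and not taken from the question.

```python
import sqlite3

# Hypothetical sample rows (pid, idx, val3_bool); not data from the question.
rows = [
    (1, 0, 1), (1, 1, 1), (1, 2, 0), (1, 3, 1), (1, 4, 1), (1, 5, 1),
]

conn = sqlite3.connect(":memory:")  # assumes SQLite 3.25+ (window functions)
conn.execute("create table consecutive (pid int, idx int, val3_bool int)")
conn.executemany("insert into consecutive values (?, ?, ?)", rows)

# The same two row numbers as in the answer's subquery.
query = """
select idx, val3_bool,
       row_number() over (partition by pid order by idx) as seqnum,
       row_number() over (partition by pid, val3_bool order by idx) as seqnum_d
from consecutive
order by pid, idx
"""

# Append the difference; it stays constant within each run of equal val3_bool.
result = [(idx, flag, s, d, s - d) for idx, flag, s, d in conn.execute(query)]
for row in result:
    print(row)
```

Within each block of equal val3_bool values both counters advance together, so their difference does not change; when the value flips, only seqnum keeps advancing and the difference jumps. Grouping by that difference (together with pid and val3_bool) therefore isolates each run.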
You can then pick the longest run for each pid using distinct on:
select distinct on (pid) t.*
from (select pid, count(*) as in_a_row, sum(val1_dur) as dur
      from (select t.*,
                   row_number() over (partition by pid order by idx) as seqnum,
                   row_number() over (partition by pid, val3_bool order by idx) as seqnum_d
            from consecutive t
           ) t
      group by (seqnum - seqnum_d), pid, val3_bool
     ) t
order by pid, in_a_row desc;
The distinct on does not strictly need the additional level of subquery, but I think it makes the logic clearer.
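Note that distinct on is a PostgreSQL extension. As a rough sketch of the same idea in a form that also runs elsewhere, the example below swaps distinct on for a row_number() ranking over the run lengths and keeps only the "1" runs. It is run through Python's sqlite3 module so it can be tried standalone (SQLite 3.25 or later assumed); the rows and the CTE names numbered, runs, and ranked are made up for illustration.

```python
import sqlite3

# Hypothetical rows (pid, idx, val1_dur, val3_bool); not data from the question.
rows = [
    (1, 0, 5000, 1), (1, 1, 4500, 1), (1, 2, 4200, 0),
    (1, 3, 5100, 1), (1, 4, 5200, 1), (1, 5, 5300, 1),
    (2, 0, 4300, 0), (2, 1, 4400, 1), (2, 2, 4600, 1),
]

conn = sqlite3.connect(":memory:")  # assumes SQLite 3.25+ (window functions)
conn.execute(
    "create table consecutive (pid int, idx int, val1_dur int, val3_bool int)"
)
conn.executemany("insert into consecutive values (?, ?, ?, ?)", rows)

query = """
with numbered as (
    select t.*,
           row_number() over (partition by pid order by idx) as seqnum,
           row_number() over (partition by pid, val3_bool order by idx) as seqnum_d
    from consecutive t
),
runs as (
    -- filter after numbering, so the row numbers still reflect all rows
    select pid, count(*) as in_a_row, sum(val1_dur) as dur
    from numbered
    where val3_bool = 1
    group by pid, seqnum - seqnum_d
),
ranked as (
    -- rank runs per pid by length; this replaces distinct on
    select pid, in_a_row, dur,
           row_number() over (partition by pid order by in_a_row desc) as rnk
    from runs
)
select pid, in_a_row, dur
from ranked
where rnk = 1
order by pid
"""

longest = conn.execute(query).fetchall()
print(longest)
```

As with distinct on, ties between equally long runs are broken arbitrarily unless a further ordering column is added.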
Answer 1: (score: 0)