I have a table of log entries and timestamps, for example:
timestmp log_error
1507031197631 Er7
1507031197621 Er8
1507031197409 Er9
1506888444602 Er10
1506880074401 Er10
1506880047684 Er10
1506880030996 Er10
1506879980929 Er10
1506879977580 Er10
1506879974250 Er10
1506879970901 Er10
1506879964241 Er10
1506879954212 Er10
1506879900817 Er10
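I want to write a SQL query that ignores identical consecutive errors (Er10 in this example) that fall within a given time span (5 minutes). How can I do this? With a self join? The result I want looks like this: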
timestmp log_error
1507031197631 Er7
1507031197621 Er8
1507031197409 Er9
1506888444602 Er10 /* The last one from this example, based on the difference in timestmp */
1506879900817 Er10 /* The first Er10 registry */
Answer 0 (score: 1)
You can use row_number to build groups of consecutive log_error values. This approach is known as the "Tabibitosan method".
select log_error, min(timestmp), max(timestmp)
from (
  select t.*,
         row_number() over (order by timestmp)
         - row_number() over (partition by log_error order by timestmp) as grp
  from your_table t
) t
group by log_error, grp;
I admit the result format is not exactly what you asked for, but it contains the information you need.
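To see what the row_number() difference produces, here is a minimal sketch that runs the query above against the question's sample rows in an in-memory SQLite database from Python. The choice of SQLite, the helper name consecutive_groups, and the column aliases first_ts and last_ts are assumptions made for illustration, not part of the original answer; window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question: (timestmp, log_error).
ROWS = [
    (1507031197631, "Er7"), (1507031197621, "Er8"), (1507031197409, "Er9"),
    (1506888444602, "Er10"), (1506880074401, "Er10"), (1506880047684, "Er10"),
    (1506880030996, "Er10"), (1506879980929, "Er10"), (1506879977580, "Er10"),
    (1506879974250, "Er10"), (1506879970901, "Er10"), (1506879964241, "Er10"),
    (1506879954212, "Er10"), (1506879900817, "Er10"),
]

# The row_number() difference query; your_table is the answer's placeholder
# table name, and the aliases and final ordering are added for readability.
QUERY = """
select log_error, min(timestmp) as first_ts, max(timestmp) as last_ts
from (
  select t.*,
         row_number() over (order by timestmp)
         - row_number() over (partition by log_error order by timestmp) as grp
  from your_table t
) t
group by log_error, grp
order by first_ts
"""


def consecutive_groups(rows):
    """Load rows into an in-memory table and return one row per run of equal errors."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table your_table (timestmp integer, log_error text)")
    conn.executemany("insert into your_table values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in consecutive_groups(ROWS):
        print(row)
```

On this data the query merges all eleven consecutive Er10 rows into a single group, because it looks only at whether neighbouring rows share the same log_error and does not take the 5-minute gap into account.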
Answer 1 (score: 0)
You can use lag(), a cumulative sum, and group by:
select log_error, min(timestmp), max(timestmp)
from (select l.*,
             sum(case when prev_le = log_error and
                           prev_timestmp > timestmp - 5 * 60 * 1000 /* "5 minutes", assuming timestmp is in milliseconds */
                      then 0 else 1
                 end) over (order by timestmp) as grp
      from (select l.*,
                   lag(log_error) over (order by timestmp) as prev_le,
                   lag(timestmp) over (order by timestmp) as prev_timestmp
            from logs l
           ) l
     ) l
group by grp, log_error;
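Note: the "5 minutes" subtracted from the timestamp is meant as logic rather than literal syntax. Depending on the unit of timestmp, it is presumably 5 * 60 (seconds) or 5 * 60 * 1000 (milliseconds); the 13-digit sample values look like epoch milliseconds.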
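As a rough check of the lag() and cumulative-sum approach, the sketch below runs it on the question's sample rows in an in-memory SQLite database from Python. It assumes timestmp holds epoch milliseconds, so the window is 5 * 60 * 1000; the use of SQLite (3.25 or newer for window functions), the helper name burst_groups, the :window parameter, and the first_ts and last_ts aliases are illustrative choices rather than part of the original answer.

```python
import sqlite3

# Sample rows from the question: (timestmp, log_error).
ROWS = [
    (1507031197631, "Er7"), (1507031197621, "Er8"), (1507031197409, "Er9"),
    (1506888444602, "Er10"), (1506880074401, "Er10"), (1506880047684, "Er10"),
    (1506880030996, "Er10"), (1506879980929, "Er10"), (1506879977580, "Er10"),
    (1506879974250, "Er10"), (1506879970901, "Er10"), (1506879964241, "Er10"),
    (1506879954212, "Er10"), (1506879900817, "Er10"),
]

# Assumption: timestmp is in milliseconds, so five minutes is 300000.
WINDOW_MS = 5 * 60 * 1000

# A row starts a new group unless it repeats the previous error and the
# previous row is less than :window older; the running sum numbers the groups.
QUERY = """
select log_error, min(timestmp) as first_ts, max(timestmp) as last_ts
from (select l.*,
             sum(case when prev_le = log_error and
                           prev_timestmp > timestmp - :window
                      then 0 else 1
                 end) over (order by timestmp) as grp
      from (select l.*,
                   lag(log_error) over (order by timestmp) as prev_le,
                   lag(timestmp) over (order by timestmp) as prev_timestmp
            from logs l
           ) l
     ) l
group by grp, log_error
order by first_ts desc
"""


def burst_groups(rows, window=WINDOW_MS):
    """Return one (log_error, first_ts, last_ts) row per burst of repeated errors."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table logs (timestmp integer, log_error text)")
    conn.executemany("insert into logs values (?, ?)", rows)
    result = conn.execute(QUERY, {"window": window}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for log_error, first_ts, last_ts in burst_groups(ROWS):
        print(first_ts, log_error, last_ts)
```

Under the millisecond assumption, the first_ts column reproduces the five rows requested in the question: the Er10 row at 1506888444602 is more than five minutes after its predecessor and so starts its own group, while the ten earlier Er10 rows collapse into the group that begins at 1506879900817.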