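I need to get the values of a table by validity range. Currently I group by the table's value columns and take the minimum and maximum of the year, but for my example this gives the wrong result. I need to solve it with a SQL statement or in Pentaho.
Sample data: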
year validity  Value1  Value2
2004           A       B
2006           A       C
2007           A       B
2008           A       B
SQL:
SELECT
min(anio), max(anio), value1, value2
FROM tabla
GROUP BY
value1, value2
Wrong result:
year min  year max  Value1  Value2
2004      2008      A       B
2006      2006      A       C
Expected result:
year min  year max  Value1  Value2
2004      2004      A       B
2007      2008      A       B
2006      2006      A       C
Please help me solve this.
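Answer 0 (score: 3)
You can combine the LAG() and SUM() window functions to split the rows into groups. LAG() flags each row whose (value1, value2) pair differs from the previous row's when ordered by year, and a running SUM() of those flags gives every run of consecutive matching rows its own group number. Once the rows are separated this way, computing the minimum and maximum is straightforward.
For example: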
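select
  min(year_validity) as year_min,
  max(year_validity) as year_max,
  min(value1) as value1,
  min(value2) as value2
from (
  select
    *,
    -- running total of the flags: each run of equal rows gets one group number
    sum(init) over(order by year_validity) as grp
  from (
    select *,
      -- init = 1 when this row's pair differs from the previous row's pair;
      -- the first row compares against NULL and falls through to 0
      case when not ( (value1, value2) = (
        lag(value1) over(order by year_validity),
        lag(value2) over(order by year_validity)
      ) ) then 1 else 0 end as init
    from tabla
  ) x
) y
group by grp
-- year_min breaks ties between ranges that share the same values
order by value1, value2, year_min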
Result:
year_min  year_max  value1  value2
--------  --------  ------  ------
2004      2004      A       B
2007      2008      A       B
2006      2006      A       C
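To see how the grouping comes about, it can help to run only the two inner levels of the query on their own. The sketch below does that against the same `tabla`. It assumes a database that accepts the row-value comparison used above, such as PostgreSQL.

```sql
-- Inner two levels of the query above, run alone to expose init and grp.
select
  year_validity,
  value1,
  value2,
  init,
  -- running total of the change flags = group number of the row
  sum(init) over(order by year_validity) as grp
from (
  select *,
    -- 1 when the pair changed compared with the previous year, else 0
    case when not ( (value1, value2) = (
      lag(value1) over(order by year_validity),
      lag(value2) over(order by year_validity)
    ) ) then 1 else 0 end as init
  from tabla
) x
order by year_validity;

-- With the four sample rows this gives:
-- year_validity  value1  value2  init  grp
-- 2004           A       B       0     0
-- 2006           A       C       1     1
-- 2007           A       B       1     2
-- 2008           A       B       0     2
```

The 2004 row and the 2007/2008 rows hold the same values but end up in different groups (0 and 2), which is why the outer `group by grp` keeps them as separate ranges.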
For the record, the data script for this case is:
create table tabla (
year_validity int,
value1 varchar(10),
value2 varchar(10)
);
insert into tabla (year_validity, value1, value2) values (2004, 'A', 'B');
insert into tabla (year_validity, value1, value2) values (2006, 'A', 'C');
insert into tabla (year_validity, value1, value2) values (2007, 'A', 'B');
insert into tabla (year_validity, value1, value2) values (2008, 'A', 'B');
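As a possible alternative, the same ranges can be found with the difference of two ROW_NUMBER() values. This is a different technique from the LAG()/SUM() query above and is only a sketch under the same assumptions (table `tabla`, at most one row per year). The aliases `t`, `x` and `grp` are illustrative. It avoids the row-value comparison, so it may be easier to port to databases that do not support that syntax.

```sql
select
  min(year_validity) as year_min,
  max(year_validity) as year_max,
  value1,
  value2
from (
  select
    t.*,
    -- position among all rows minus position among rows with the same pair:
    -- the difference stays constant while the pair repeats on adjacent rows
    row_number() over(order by year_validity)
      - row_number() over(partition by value1, value2 order by year_validity) as grp
  from tabla t
) x
-- grp alone is not unique across different pairs, so group by the pair as well
group by value1, value2, grp
order by value1, value2, year_min;
```

On the sample data this should return the same three rows as the query above: 2004 to 2004 for A/B, 2007 to 2008 for A/B, and 2006 to 2006 for A/C. As with the LAG() version, "consecutive" means adjacent rows when ordered by year, not consecutive calendar years.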