LAG window function in Redshift: showing previous-year and current-year values in a single row

Asked: 2019-02-16 00:51:28

Tags: amazon-redshift analytics

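For each combination of a set of columns, I need to show the previous year's and the current year's values in a single row. The scenario is as follows. I have a dataset like this: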

Student City    Country Year Month Subject Marks
John    Boston  USA    2018  01    Maths   90
Mark    London  UK     2018  01    Maths   95
John    Boston  USA    2019  01    Maths   95
Mark    London  UK     2019  01    Maths   83
John    Boston  USA    2018  01    Arts    90
Mark    London  UK     2018  01    Arts    95
John    Boston  USA    2019  01    Arts    95
Mark    London  UK     2019  01    Arts    83

I want the output to be:

Student  City    Country  Year  Month  Maths_curr  Maths_prev  Arts_curr  Arts_prev
John     Boston  USA      2019  01     95          90          95         90
John     Boston  USA      2018  01     90          null        90         null
Mark     London  UK       2019  01     83          95          83         95
Mark     London  UK       2018  01     95          null        95         null

I think I need the LAG function to get this. I used this code:

select student,city,country,year,month,subject,marks as curr,
lag(marks,1)over(partition by student,city,country,subject order by year,month) as prev
from <table>
order by student,city,country,year,month

The output I get is:

Student City    Country Year Month Subject  Curr  Prev
John    Boston  USA    2019  01    Maths    95    90
John    Boston  USA    2018  01    Maths    90    null
John    Boston  USA    2019  01    Arts     95    90
John    Boston  USA    2018  01    Arts     90    null
Mark    London  UK     2019  01    Maths    83    95
Mark    London  UK     2018  01    Maths    95    null
Mark    London  UK     2019  01    Arts     83    95
Mark    London  UK     2018  01    Arts     95    null

Can you help me get the desired output? Is LEAD or LAG the right function to use in this case? Is there another way to achieve this in Redshift?

Any help is much appreciated.

I also tried this code:

select student,city,country,year,month,subject,
case when substring(curr,1,1) = 'M' then cast(split_part(curr,' ',2) as integer) end as maths_curr,
case when substring(prev,1,1) = 'M' then cast(split_part(prev,' ',2) as integer) end as maths_prev,
case when substring(curr,1,1) = 'A' then cast(split_part(curr,' ',2) as integer) end as arts_curr,
case when substring(prev,1,1) = 'A' then cast(split_part(prev,' ',2) as integer) end as arts_prev
from
(select student,city,country,year,month,subject,
case when subject = 'Maths' then 'M ' + cast(nvl(marks,0) as varchar)
     else 'A ' + cast(nvl(marks,0) as varchar)
     end as curr,
case when subject = 'Maths' then 'M ' + cast(nvl(lag(marks,1)over (partition by student,city,country,subject order by year,month),0) as varchar)
     else 'A ' + cast(nvl(lag(marks,1)over (partition by student,city,country,subject order by year,month),0) as varchar)
     end as prev
from <table>
order by student,city,country,year,month)

Here, the output I get is:

Student City    Country Year Month Subject  Maths_Curr  Maths_Prev   Arts_Curr   Arts_Prev
John    Boston  USA    2019  01    Maths    95          90           null        null
John    Boston  USA    2018  01    Maths    90          null         null        null
John    Boston  USA    2019  01    Arts     null        null         95          90
John    Boston  USA    2018  01    Arts     null        null         90          null
Mark    London  UK     2019  01    Maths    83          95           null        null
Mark    London  UK     2018  01    Maths    95          null         null        null
Mark    London  UK     2019  01    Arts     null        null         83          95
Mark    London  UK     2018  01    Arts     null        null         95          null

I'm not sure where I'm going wrong. I need some guidance here.

1 Answer:

Answer 0 (score: 1):

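This should do the trick. LAG is the right function; what your queries are missing is a step that collapses the per-subject rows into one row. The CTE puts each subject's current and previous marks in its own column, and the outer query merges the rows with MAX and GROUP BY: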

WITH base AS (
  SELECT *,
         CASE WHEN "Subject" = 'Maths' THEN "Marks" ELSE NULL END AS maths_current,
         CASE WHEN "Subject" = 'Arts' THEN "Marks" ELSE NULL END AS arts_current,
         CASE WHEN "Subject" = 'Maths' THEN LAG("Marks") OVER (PARTITION BY "Student","City","Country","Subject" ORDER BY "Year","Month") ELSE NULL END AS previous_math,
         CASE WHEN "Subject" = 'Arts' THEN LAG("Marks") OVER (PARTITION BY "Student","City","Country","Subject" ORDER BY "Year","Month") ELSE NULL END AS previous_arts
  FROM <table>
)

SELECT "Student",
       "City",
       "Country",
       "Year",
       "Month",
       MAX(maths_current) AS Maths_curr,
       MAX(previous_math) AS Maths_prev,
       MAX(arts_current) AS Arts_curr,
       MAX(previous_arts) AS Arts_prev
FROM base
GROUP BY 1,2,3,4,5
ORDER BY 1,2,3,4 DESC,5 DESC
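
An alternative worth considering, sketched here under some assumptions: pivot the subjects into columns first and apply LAG afterwards, so the window no longer needs the subject in its partition. The sketch runs the SQL through Python's sqlite3 module purely so that it can be executed locally. SQLite (3.25 or later) stands in for Redshift, and the table name student_marks is hypothetical. The SQL itself uses only CASE, MAX, GROUP BY and LAG, which Redshift also supports, but it has not been run on Redshift here.

```python
import sqlite3

# SQLite stands in for Redshift; "student_marks" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student_marks (
        student TEXT, city TEXT, country TEXT,
        year INTEGER, month TEXT, subject TEXT, marks INTEGER
    )
""")

# Sample rows copied from the question.
rows = [
    ("John", "Boston", "USA", 2018, "01", "Maths", 90),
    ("Mark", "London", "UK", 2018, "01", "Maths", 95),
    ("John", "Boston", "USA", 2019, "01", "Maths", 95),
    ("Mark", "London", "UK", 2019, "01", "Maths", 83),
    ("John", "Boston", "USA", 2018, "01", "Arts", 90),
    ("Mark", "London", "UK", 2018, "01", "Arts", 95),
    ("John", "Boston", "USA", 2019, "01", "Arts", 95),
    ("Mark", "London", "UK", 2019, "01", "Arts", 83),
]
conn.executemany("INSERT INTO student_marks VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

QUERY = """
WITH pivoted AS (
    -- Step 1: one row per student and period, one column per subject.
    SELECT student, city, country, year, month,
           MAX(CASE WHEN subject = 'Maths' THEN marks END) AS maths_curr,
           MAX(CASE WHEN subject = 'Arts'  THEN marks END) AS arts_curr
    FROM student_marks
    GROUP BY student, city, country, year, month
)
-- Step 2: LAG over the pivoted rows; subject is no longer part of the partition.
SELECT student, city, country, year, month,
       maths_curr,
       LAG(maths_curr) OVER (PARTITION BY student, city, country
                             ORDER BY year, month) AS maths_prev,
       arts_curr,
       LAG(arts_curr)  OVER (PARTITION BY student, city, country
                             ORDER BY year, month) AS arts_prev
FROM pivoted
ORDER BY student, city, country, year DESC, month DESC
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Like the query above, this takes the previous row in (year, month) order, which is the previous year only because the sample has a single month per year. If the table holds several months per year and the same month of the previous year is wanted, adding month to PARTITION BY and ordering by year alone would be the likely adjustment.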