I have a table of student test scores, along with the year each test was taken and which test it was, like this:
StudentID  Score  Year  TestName  GradeLevel
100001     347    2010  Algebra   8
100001     402    2011  Geometry  9
100001     NA     NA    NA        10
100001     NA     NA    NA        11
100001     525    2014  Calculus  12
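This sample table contains only one student ID, but my actual data obviously has many.
I'm trying to write a query that tells me, for each student in each school year, the score of the most recent test they took, which test it was, and the grade level they were in when they took it. What I want should look like this: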
StudentID  Year  MostRecentScore  MostRecentTest  MostRecentTestGrade
100001     2010  347              Algebra         8
100001     2011  402              Geometry        9
100001     NA    402              Geometry        9
100001     NA    402              Geometry        9
100001     2014  525              Calculus        12
This is what I have so far:
SELECT
    STUDENTID,
    YEARID,
    MAX(Score) OVER (PARTITION BY StudentID ORDER BY Year) as "MostRecentScore",
    MAX(TestName) OVER (PARTITION BY StudentID ORDER BY Year) as "MostRecentTest",
    MAX(GradeLevel) OVER (PARTITION BY StudentID ORDER BY Year) as "MostRecentTestGrade"
FROM TEST_SCORES
But this only returns the latest test and its associated values on every row:
StudentID  Year  MostRecentScore  MostRecentTest  MostRecentTestGrade
100001     2010  525              Calculus        12
100001     2011  525              Calculus        12
100001     NA    525              Calculus        12
100001     NA    525              Calculus        12
100001     2014  525              Calculus        12
Any help would be greatly appreciated.
Answer 0 (score: 0)
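Based on your example, it looks like we can use gradelevel to determine the order. If so, you can use lag() ignore nulls. To make that work, though, we first have to adjust the gradelevel column, which is a little awkward: it is populated even on the rows with no test, so nvl would return 10 or 11 there instead of falling back to the previous test's grade. I therefore added gl, which is null whenever yearid is null. The rest is simple: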
-- gl is gradelevel on rows that have a test and null on rows where yearid is null
-- nvl keeps a row's own value; otherwise lag ... ignore nulls takes the previous non-null value
select studentid, yearid,
       nvl(score, lag(score) ignore nulls
                    over (partition by studentid order by gradelevel)) score,
       nvl(testname, lag(testname) ignore nulls
                    over (partition by studentid order by gradelevel)) test,
       nvl(gl, lag(gl) ignore nulls
                    over (partition by studentid order by gradelevel)) grade
  from (select ts.*, case when yearid is null then null else gradelevel end gl
          from test_scores ts)
 order by studentid, gradelevel
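If you want to check the carry-forward logic without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no lag() ignore nulls, so this swaps in a different technique: a self-join that picks, for each row, the latest row at or below its grade level that actually has a score. It assumes one row per student per grade level and that the NA cells are real NULLs; the function name most_recent_scores is only illustrative.

```python
import sqlite3

# Sample rows from the question; None stands for the NA (NULL) cells.
ROWS = [
    (100001, 347, 2010, "Algebra", 8),
    (100001, 402, 2011, "Geometry", 9),
    (100001, None, None, None, 10),
    (100001, None, None, None, 11),
    (100001, 525, 2014, "Calculus", 12),
]

# Portable stand-in for Oracle's nvl(x, lag(x) ignore nulls over (...)):
# join each row t to the row p holding the most recent non-null score
# at or below t's grade level for the same student.
QUERY = """
SELECT t.studentid, t.yearid, p.score, p.testname, p.gradelevel
FROM test_scores t
LEFT JOIN test_scores p
  ON p.studentid = t.studentid
 AND p.gradelevel = (SELECT MAX(q.gradelevel)
                     FROM test_scores q
                     WHERE q.studentid = t.studentid
                       AND q.gradelevel <= t.gradelevel
                       AND q.score IS NOT NULL)
ORDER BY t.studentid, t.gradelevel
"""


def most_recent_scores(rows):
    """Load rows into an in-memory table and return the carried-forward result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_scores ("
        "studentid INTEGER, score INTEGER, yearid INTEGER, "
        "testname TEXT, gradelevel INTEGER)"
    )
    conn.executemany("INSERT INTO test_scores VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in most_recent_scores(ROWS):
        print(row)
```

On large tables the window-function version is likely to perform better than this self-join; the sketch is only meant to confirm the expected rows.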
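I'm assuming the NA values are nulls in your data. If they are not, you will first have to apply nullif, and for the numbers in the score column, to_number. Storing columns like score and YEAR as varchars is a bad idea.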
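As a rough illustration of that clean-up step, here is a small sketch, again in Python's sqlite3 so it can be run anywhere. It assumes the missing cells hold the literal string 'NA' and that the raw data lives in a hypothetical table called raw_scores. SQLite has no to_number, so CAST(... AS INTEGER) stands in for it here; in Oracle the equivalent expression would be to_number(nullif(score, 'NA')).

```python
import sqlite3


def normalize_na(rows):
    """Turn literal 'NA' strings into NULLs and text scores/years into integers."""
    conn = sqlite3.connect(":memory:")
    # raw_scores is a hypothetical staging table with text-typed columns.
    conn.execute(
        "CREATE TABLE raw_scores ("
        "studentid INTEGER, score TEXT, yearid TEXT, "
        "testname TEXT, gradelevel INTEGER)"
    )
    conn.executemany("INSERT INTO raw_scores VALUES (?, ?, ?, ?, ?)", rows)
    cleaned = conn.execute(
        """
        SELECT studentid,
               CAST(NULLIF(score, 'NA') AS INTEGER)  AS score,
               CAST(NULLIF(yearid, 'NA') AS INTEGER) AS yearid,
               NULLIF(testname, 'NA')                AS testname,
               gradelevel
        FROM raw_scores
        ORDER BY studentid, gradelevel
        """
    ).fetchall()
    conn.close()
    return cleaned


if __name__ == "__main__":
    sample = [
        (100001, "347", "2010", "Algebra", 8),
        (100001, "NA", "NA", "NA", 10),
    ]
    for row in normalize_na(sample):
        print(row)
```

Once the values are normalized like this, the carry-forward query can treat the NA rows as ordinary nulls.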