I'm learning SQL and have run into a challenge.
I have the following table:
tbl <- data.frame(
id_name = c("a", "a", "b", "c", "d", "f", "b", "c", "d", "f"),
value = c(1, -1, 1, 1, 1, 1, -1, -1, -1, -1),
score = c(1, 0, 1, 2, 3, 4, 3, 2, 1, 0),
date = as.Date(c("2001-1-1", "2002-1-1", "2003-1-1", "2005-1-1",
"2005-1-1", "2007-1-1", "2008-1-1", "2010-1-1",
"2011-1-1", "2012-1-1"), "%Y-%m-%d")
)
+---------+-------+-------+-----------+
| id_name | value | score | date |
+---------+-------+-------+-----------+
| a | 1 | 1 | 2001-1-1 |
| a | -1 | 0 | 2002-1-1 |
| b | 1 | 1 | 2003-1-1 |
| c | 1 | 2 | 2005-1-1 |
| d | 1 | 3 | 2005-1-1 |
| f | 1 | 4 | 2007-1-1 |
| b | -1 | 3 | 2008-1-1 |
| c | -1 | 2 | 2010-1-1 |
| d | -1 | 1 | 2011-1-1 |
| f | -1 | 0 | 2012-1-1 |
+---------+-------+-------+-----------+
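My goal is:
For each id_name, I want to get the date of the highest score in tbl among all rows whose dates fall between the dates of that id_name's own rows (inclusive), applying a tie-breaker where scores tie.
For example, id_name 'a' should return '2001-1-1' because the score there is 1, and id_name 'b' should return '2007-1-1' because the score there is 4: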
+---------+----------+
| id_name | date |
+---------+----------+
| a | 2001-1-1 |
| b | 2007-1-1 |
+---------+----------+
This is what I have done so far:
sqldf("
SELECT
id_name,
date,
score
FROM
tbl As d
WHERE
score = (
SELECT MAX(score)
FROM tbl As b
WHERE
date >= (
SELECT MIN(date)
FROM tbl
WHERE id_name = b.id_name
) AND
date <= (
SELECT MAX(date)
FROM tbl
WHERE id_name = b.id_name
)
)
")
The problem is that it returns the rows with the global maximum, regardless of the values in the current row.
Thanks!
Answer 0 (score: 0)
I think a correlated subquery in the WHERE clause fits this requirement:
SELECT id_name, date
FROM tbl as t1
WHERE score = (SELECT max(score) FROM tbl WHERE id_name = t1.id_name)
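Note that the query above takes the maximum over each id_name's own rows only, so for 'b' it picks the row dated 2008-1-1 (score 3). If the maximum is meant to cover every row of tbl that falls inside the id's date span, as the expected output in the question suggests, something like the following may be closer. It is only a sketch, with these assumptions: sqldf is used with its default SQLite backend, ties are broken by taking the earliest date (the question does not spell out the rule), and the aliases s, t, t2, first_date and last_date are names made up for illustration.

```r
library(sqldf)  # assumption: sqldf with its default SQLite backend is installed

# Sample data from the question
tbl <- data.frame(
  id_name = c("a", "a", "b", "c", "d", "f", "b", "c", "d", "f"),
  value   = c(1, -1, 1, 1, 1, 1, -1, -1, -1, -1),
  score   = c(1, 0, 1, 2, 3, 4, 3, 2, 1, 0),
  date    = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01", "2005-01-01",
                      "2005-01-01", "2007-01-01", "2008-01-01", "2010-01-01",
                      "2011-01-01", "2012-01-01"))
)

res <- sqldf("
  SELECT s.id_name,
         MIN(t.date) AS date            -- assumed tie-breaker: earliest date
  FROM (
    -- first and last date on which each id_name appears
    SELECT id_name,
           MIN(date) AS first_date,
           MAX(date) AS last_date
    FROM tbl
    GROUP BY id_name
  ) AS s
  JOIN tbl AS t
    ON t.date BETWEEN s.first_date AND s.last_date
  WHERE t.score = (
    -- highest score among ALL rows inside this id's date span
    SELECT MAX(t2.score)
    FROM tbl AS t2
    WHERE t2.date BETWEEN s.first_date AND s.last_date
  )
  GROUP BY s.id_name
  ORDER BY s.id_name
")

print(res)
```

On the sample data this should give 2001-01-01 for 'a' and 2007-01-01 for 'b', in line with the expected output in the question. The derived table s is what ties the inner MAX to the current id_name; the attempt in the question correlates its date bounds with the inner table alias instead of the outer row, which would explain why it falls back to the global maximum.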