My dataset is as follows:
dput(data2)
structure(list(School = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L,
3L, 3L), .Label = c("School1", "School2", "School3"), class = "factor"),
Year = c(2015L, 2014L, 2013L, 2015L, 2014L, 2013L, 2015L,
2014L, 2013L), Rate = c(70L, 50L, 30L, 80L, 90L, 11L, 60L,
50L, 40L)), .Names = c("School", "Year", "Rate"), class = "data.frame", row.names = c(NA,
-9L))
School Year Rate
1 School1 2015 70
2 School1 2014 50
3 School1 2013 30
4 School2 2015 80
5 School2 2014 90
6 School2 2013 11
7 School3 2015 60
8 School3 2014 50
9 School3 2013 40
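What I am trying to do is produce an output in which the rows are grouped by the School column, with the schools ordered by their 2015 Rate in descending order.
So the output should look like this: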
School Year Rate
1 School2 2015 80
2 School2 2014 90
3 School2 2013 11
4 School1 2015 70
5 School1 2014 50
6 School1 2013 30
7 School3 2015 60
8 School3 2014 50
9 School3 2013 40
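Using the data in my example, the school order would be as follows, based on the descending value of the 2015 Rate: School2 -> School1 -> School3 (80 -> 70 -> 60).
I tried using the dplyr package to get the desired output, but have not managed to get the result yet.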
Answer 0: (score: 2)
We can look up each school's 2015 rate, store it in a temporary column, and then arrange by that column.
library(dplyr)
dat2 <- data2 %>%
group_by(School) %>%
mutate(Year2015 = Rate[Year == 2015]) %>%
arrange(desc(Year2015), desc(Year)) %>%
ungroup(School) %>%
select(-Year2015)
dat2
# # A tibble: 9 x 3
# School Year Rate
# <fct> <int> <int>
# 1 School2 2015 80
# 2 School2 2014 90
# 3 School2 2013 11
# 4 School1 2015 70
# 5 School1 2014 50
# 6 School1 2013 30
# 7 School3 2015 60
# 8 School3 2014 50
# 9 School3 2013 40
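The same idea can be sketched without dplyr. This is only an illustrative base R version, not part of the original answer: it rebuilds the question's data, looks up each school's 2015 rate, and orders the rows by that rate and then by Year. The names rate2015, key and res are made up for this sketch, and it assumes every school has exactly one 2015 row.

```r
# Rebuild the example data from the question
data2 <- data.frame(
  School = factor(rep(c("School1", "School2", "School3"), each = 3)),
  Year   = rep(c(2015L, 2014L, 2013L), times = 3),
  Rate   = c(70L, 50L, 30L, 80L, 90L, 11L, 60L, 50L, 40L)
)

# Named lookup vector: school name -> its 2015 rate
# (assumes exactly one 2015 row per school)
y2015    <- data2[data2$Year == 2015, ]
rate2015 <- setNames(y2015$Rate, as.character(y2015$School))

# Attach the 2015 rate to every row, then sort by it (descending)
# and by Year (descending) within each school
key <- rate2015[as.character(data2$School)]
res <- data2[order(-key, -data2$Year), ]
rownames(res) <- NULL
res
```

If a school had no 2015 row, its key would be NA and order() would place its rows last by default.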
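Answer 1: (score: 1)
One option is to first compute the maximum rate for each school (MaxRate), then arrange the data in descending order of MaxRate and Year. Note that this ranks schools by their highest rate in any year rather than by their 2015 rate; for the example data both give the same school order.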
library(dplyr)
data2 %>% group_by(School) %>%
mutate(MaxRate = max(Rate)) %>%
arrange(desc(MaxRate), desc(Year)) %>%
ungroup() %>%
select(-MaxRate) %>% as.data.frame()
# School Year Rate
# 1 School2 2015 80
# 2 School2 2014 90
# 3 School2 2013 11
# 4 School1 2015 70
# 5 School1 2014 50
# 6 School1 2013 30
# 7 School3 2015 60
# 8 School3 2014 50
# 9 School3 2013 40
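Ranking by the maximum rate and ranking by the 2015 rate only coincide when the school order happens to match, as it does for the question's data. The sketch below uses a small hypothetical data frame (toy, with SchoolA and SchoolB, none of which comes from the question) in which one school peaks in 2014, to show how the two keys can disagree.

```r
# Hypothetical data (not from the question): SchoolB peaks in 2014, not 2015
toy <- data.frame(
  School = rep(c("SchoolA", "SchoolB"), each = 2),
  Year   = rep(c(2015L, 2014L), times = 2),
  Rate   = c(70L, 50L, 60L, 95L)
)

# Key 1: each school's maximum rate over all years
max_rate <- tapply(toy$Rate, toy$School, max)

# Key 2: each school's 2015 rate only
y2015     <- toy[toy$Year == 2015, ]
rate_2015 <- setNames(y2015$Rate, as.character(y2015$School))

# School order under each key, highest first
by_max  <- names(max_rate)[order(-max_rate)]
by_2015 <- names(rate_2015)[order(-rate_2015)]

by_max   # "SchoolB" "SchoolA"
by_2015  # "SchoolA" "SchoolB"
```

If the requirement is strictly "order by the 2015 rate", a key built from the 2015 rows is the safer choice.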