Assign the maximum value of a group to all rows in that group

Asked: 2019-05-26 15:54:24

Tags: r group-by dplyr max

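I want to assign the maximum value of a group to every row in that group. How can I do this?

I have a data frame containing the group name and the maximum number of credits for that group.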

course_credits <- aggregate(bsc_academic$Credits, by = list(bsc_academic$Course_code), max)

which gives

   Course  Credits
1 ABC1000      6.5
2 ABC1003      6.5
3 ABC1004      6.5
4 ABC1007      5.0
5 ABC1010      6.5
6 ABC1021      6.5
7 ABC1023      6.5

The main data frame looks like this:

Appraisal.Type   Resits   Credits Course_code   Student_ID          
Final result       0       6.5    ABC1000           10                
Final result       0       6.5    ABC1003           10               
Grade supervisor   0       0      ABC1000           10               
Grade supervisor   0       0      ABC1003           10 
Final result       0       12     ABC1294           23   
Grade supervisor   0       0      ABC1294           23     

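As you can see, student 10 took course ABC1000, which is worth 6.5 credits. However, for each course (per student) there are two rows: "Final result" and "Grade supervisor". In the end, the "Final result" rows should be removed, but the credits should be kept. So I want to assign the maximum value, 6.5, to the "Grade supervisor" row. Likewise, student 23 took course ABC1294, which is worth 12 credits.

In the end, the result should be: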

Appraisal.Type   Resits   Credits Course_code   Student_ID                      
Grade supervisor   0       6.5      ABC1000           10               
Grade supervisor   0       6.5      ABC1003           10    
Grade supervisor   0       12       ABC1294           23               

How can I do this?

3 Answers:

Answer 0 (score: 2)

One option is to group by 'Student_ID', mutate 'Credits' to its max within each group, and then filter the rows where 'Appraisal.Type' is "Grade supervisor"

library(dplyr)
df1 %>%
   group_by(Student_ID) %>%
   dplyr::mutate(Credits = max(Credits)) %>%
   ungroup %>%
   filter(Appraisal.Type == "Grade supervisor")
# A tibble: 2 x 5
#  Appraisal.Type   Resits Credits Course_code Student_ID
#  <chr>             <int>   <dbl> <chr>            <int>
#1 Grade supervisor      0     6.5 ABC1000             10
#2 Grade supervisor      0     6.5 ABC1003             10

If we also need to include 'Course_code' in the grouping

df2 %>%
  group_by(Student_ID, Course_code) %>% 
  dplyr::mutate(Credits = max(Credits)) %>%  
  filter(Appraisal.Type == "Grade supervisor")
# A tibble: 3 x 5
# Groups:   Student_ID, Course_code [3]
#  Appraisal.Type   Resits Credits Course_code Student_ID
#  <chr>             <int>   <dbl> <chr>            <int>
#1 Grade supervisor      0     6.5 ABC1000             10
#2 Grade supervisor      0     6.5 ABC1003             10
#3 Grade supervisor      0    12   ABC1294             23

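Note: if the plyr package is also loaded, functions such as summarise/mutate may be masked, because plyr exports functions with the same names. To prevent this, either run the code in a fresh session without loading plyr, or call dplyr::mutate explicitly

Data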
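To make the masking note concrete, here is a minimal sketch that checks which package the bare name mutate resolves to and then uses the namespaced call. The small data frame df is a hypothetical stand-in, not the asker's data.

```r
library(dplyr)

# Hypothetical stand-in data: two students, two rows each
df <- data.frame(
  Student_ID = c(10L, 10L, 23L, 23L),
  Credits    = c(6.5, 0, 12, 0)
)

# Name of the package whose mutate() is found first on the search path.
# If plyr was attached after dplyr, this would report plyr instead.
mutate_source <- environmentName(environment(mutate))

# The namespaced call does not depend on attach order.
res <- df %>%
  group_by(Student_ID) %>%
  dplyr::mutate(Credits = max(Credits)) %>%
  ungroup()
```

If mutate_source is not "dplyr", a bare mutate() call inside group_by() would likely ignore the grouping, which is the symptom the note describes.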

df1 <- structure(list(Appraisal.Type = c("Final result", "Final result", 
"Grade supervisor", "Grade supervisor"), Resits = c(0L, 0L, 0L, 
0L), Credits = c(6.5, 6.5, 0, 0), Course_code = c("ABC1000", 
"ABC1003", "ABC1000", "ABC1003"), Student_ID = c(10L, 10L, 10L, 
10L)), class = "data.frame", row.names = c(NA, -4L)) 



df2 <- structure(list(Appraisal.Type = c("Final result", "Final result", 
"Grade supervisor", "Grade supervisor", "Final result", "Grade supervisor"
), Resits = c(0L, 0L, 0L, 0L, 0L, 0L), Credits = c(6.5, 6.5, 
0, 0, 12, 0), Course_code = c("ABC1000", "ABC1003", "ABC1000", 
"ABC1003", "ABC1294", "ABC1294"), Student_ID = c(10L, 10L, 10L, 
10L, 23L, 23L)), class = "data.frame", row.names = c(NA, -6L))
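The question already computes a table of maxima with aggregate(). As an alternative to the grouped mutate, that table can be joined back onto the "Grade supervisor" rows. The sketch below uses only base R; the names main, max_credits, supervisor and joined are illustrative, and the data is a shortened stand-in for df2.

```r
# Shortened stand-in for df2 (one course per student)
main <- data.frame(
  Appraisal.Type = c("Final result", "Grade supervisor",
                     "Final result", "Grade supervisor"),
  Resits      = c(0L, 0L, 0L, 0L),
  Credits     = c(6.5, 0, 12, 0),
  Course_code = c("ABC1000", "ABC1000", "ABC1294", "ABC1294"),
  Student_ID  = c(10L, 10L, 23L, 23L),
  stringsAsFactors = FALSE
)

# Maximum credits per student and course
max_credits <- aggregate(Credits ~ Student_ID + Course_code,
                         data = main, FUN = max)
names(max_credits)[names(max_credits) == "Credits"] <- "Max_credits"

# Keep only the supervisor rows, then attach the group maximum
supervisor <- main[main$Appraisal.Type == "Grade supervisor", ]
joined <- merge(supervisor, max_credits,
                by = c("Student_ID", "Course_code"))

# Overwrite Credits with the maximum and drop the helper column
joined$Credits <- joined$Max_credits
joined$Max_credits <- NULL
```

Note that merge() moves the key columns to the front and sorts by them, so the column order differs from the dplyr output.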

Answer 1 (score: 0)

Generate a sample dataset.

data <- as.data.frame(list(Appraisal.Type = c(rep("Final result", 2), rep("Grade supervisor", 2)), 
                      Resits = rep(0, 4), 
                      Credits = c(rep(6.5, 2), rep(0, 2)), 
                      Course_code = rep(c("ABC1000", "ABC1003"), 2), 
                      Student_ID = rep(10, 4)))

Assign the maximum value of a group to all rows in that group, then remove the rows containing "Final result".

## Reassign "Credits" to the per-course maximum
for (code in unique(data$Course_code)) {
  in_course <- data$Course_code == code
  data$Credits[in_course] <- max(data$Credits[in_course])
}
## New dataset without the "Final result" rows
data <- data[data$Appraisal.Type != "Final result", ]

Here is the result.

data
    Appraisal.Type Resits Credits Course_code Student_ID
3 Grade supervisor      0     6.5     ABC1000         10
4 Grade supervisor      0     6.5     ABC1003         10
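The loop above can also be written without explicit iteration. This is a minimal sketch using base R's ave(), which applies a function within each group and returns a vector aligned with the original rows. Grouping by Student_ID as well as Course_code is an assumption added here so that different students taking the same course are kept apart; the variable result is illustrative.

```r
# Same sample data as in this answer
data <- data.frame(
  Appraisal.Type = rep(c("Final result", "Grade supervisor"), each = 2),
  Resits      = 0,
  Credits     = c(6.5, 6.5, 0, 0),
  Course_code = rep(c("ABC1000", "ABC1003"), 2),
  Student_ID  = 10,
  stringsAsFactors = FALSE
)

# Group maximum per course and student, one value per row.
# FUN must be passed by name, otherwise it is treated as a grouping variable.
data$Credits <- ave(data$Credits, data$Course_code, data$Student_ID,
                    FUN = max)

# Drop the "Final result" rows
result <- data[data$Appraisal.Type != "Final result", ]
```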

Answer 2 (score: 0)

Here is a data.table solution,
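The code for this answer did not survive in the text, so the following is only a sketch of what a data.table approach might look like, not the answerer's original code. It assumes the same columns as df2 from the first answer; dt and res are illustrative names.

```r
library(data.table)

# Same rows as df2 in the first answer
dt <- data.table(
  Appraisal.Type = c("Final result", "Final result", "Grade supervisor",
                     "Grade supervisor", "Final result", "Grade supervisor"),
  Resits      = 0L,
  Credits     = c(6.5, 6.5, 0, 0, 12, 0),
  Course_code = c("ABC1000", "ABC1003", "ABC1000",
                  "ABC1003", "ABC1294", "ABC1294"),
  Student_ID  = c(10L, 10L, 10L, 10L, 23L, 23L)
)

# Overwrite Credits by reference with the maximum of each
# (Student_ID, Course_code) group
dt[, Credits := max(Credits), by = .(Student_ID, Course_code)]

# Keep only the "Grade supervisor" rows
res <- dt[Appraisal.Type == "Grade supervisor"]
```

Because := modifies dt in place, copy the table first with copy(dt) if the original credits are still needed.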