Drop the lowest value

Date: 2018-06-15 18:31:34

Tags: r dplyr

I have been stuck on this dplyr data manipulation problem for a while.

Here is a small sample of my data, from dput(test):

structure(list(anon_screen_name = c("40492fd6e817cc25cea942be9eae7c1c5795ffa1", 
"862329793fdbcd666d660d9a9d2e3beceb07a0db", "862329793fdbcd666d660d9a9d2e3beceb07a0db", 
"862329793fdbcd666d660d9a9d2e3beceb07a0db", "862329793fdbcd666d660d9a9d2e3beceb07a0db", 
"862329793fdbcd666d660d9a9d2e3beceb07a0db", "862329793fdbcd666d660d9a9d2e3beceb07a0db", 
"862329793fdbcd666d660d9a9d2e3beceb07a0db", "a9c8719499b9ef73c78e85bada231591d807a821", 
"a9c8719499b9ef73c78e85bada231591d807a821"), resource_display_name = c("Quiz", 
"Quiz", "Quiz", "Quiz", "Quiz", "homework", "homework", "final_exam", 
"Quiz", "Quiz"), grade = c(0L, 0L, 0L, 3L, 1L, 0L, 1L, 1L, 1L, 
2L), max_grade = c(2L, 1L, 0L, 3L, 1L, 10L, 11L, 1L, 1L, 2L), 
    percent_grade = c("0", "0", "\\N", "100", "100", "0", "9.09", 
    "100", "100", "100")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -10L)) 

Basically, for each anon_screen_name, I want to drop the lowest percent_grade among the homework rows (resource_display_name).

I started with this code:

test %>% 
     mutate(percent_grade = as.numeric(percent_grade)) %>% 
     group_by(resource_display_name) %>% 
     summarise(min_percent_grade = min(percent_grade, na.rm = T))

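But this only shows the lowest grade per resource; it does not remove the row that holds the lowest homework grade.

Update:

Basically, borrowing from the comment below, I want to remove the row associated with the lowest value of percent_grade where resource_display_name == 'homework'.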

3 Answers:

Answer 0 (score: 2)

Try the following code:

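library(dplyr)

# Homework rows: keep only those strictly above the lowest homework grade.
# The minimum is taken over all homework rows (not per anon_screen_name),
# and rows whose percent_grade is NA are dropped by the comparison.
t1 <- test %>%
  mutate(percent_grade = as.numeric(percent_grade)) %>%  # "\\N" becomes NA (with a warning)
  filter(resource_display_name == 'homework') %>%
  filter(percent_grade > min(percent_grade, na.rm = TRUE))

# All other rows are kept unchanged.
t2 <- test %>%
  mutate(percent_grade = as.numeric(percent_grade)) %>%
  filter(resource_display_name != 'homework')

rbind(t1, t2)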
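
The same idea can be sketched as a single pipeline that computes the minimum per anon_screen_name rather than over all homework rows, and that keeps rows whose grade is NA. This is only a sketch under stated assumptions: `toy` is made-up data that mirrors the question's columns, and `safe_min()` is a hypothetical helper (not part of dplyr) that avoids the warning `min()` gives on an all-NA group.

```r
library(dplyr)

# Toy data (illustrative only): same columns as the question, shortened user IDs.
toy <- data.frame(
  anon_screen_name      = c("u1", "u1", "u1", "u1", "u2", "u2", "u2"),
  resource_display_name = c("Quiz", "homework", "homework", "final_exam",
                            "homework", "homework", "Quiz"),
  percent_grade         = c("\\N", "0", "9.09", "100", "50", "80", "100"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: minimum ignoring NA, or NA if nothing is left.
safe_min <- function(x) if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)

result <- toy %>%
  # "\\N" cannot be parsed as a number, so it becomes NA.
  mutate(percent_grade = suppressWarnings(as.numeric(percent_grade))) %>%
  group_by(anon_screen_name, resource_display_name) %>%
  filter(
    resource_display_name != "homework" |          # non-homework rows are always kept
      is.na(percent_grade) |                       # rows with a missing grade are kept
      percent_grade > safe_min(percent_grade)      # homework rows above the user's minimum
  ) %>%
  ungroup()

print(result)
```

Note that every homework row tied for the minimum is removed, so a user with a single homework row loses that row.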

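Answer 1 (score: 0)

The following removes every row whose percent_grade equals the minimum within its resource_display_name group. Note that it is a base R solution and needs no external packages such as dplyr.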

inx <- with(test, ave(as.numeric(percent_grade), resource_display_name, FUN = function(x) x != min(x, na.rm = TRUE)))
inx <- which(as.logical(inx))

test[inx, ]
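
A base R variant closer to the updated question might look like the sketch below. It restricts the comparison to homework rows, computes the minimum per anon_screen_name, and keeps rows whose grade is NA. The data frame `toy` and the variable names `pg`, `is_hw`, `hw_min` and `drop` are illustrative assumptions, not taken from the question.

```r
# Toy data (illustrative only): same columns as the question, shortened user IDs.
toy <- data.frame(
  anon_screen_name      = c("u1", "u1", "u1", "u2", "u2", "u3"),
  resource_display_name = c("homework", "homework", "Quiz",
                            "homework", "homework", "Quiz"),
  percent_grade         = c("0", "9.09", "\\N", "50", "80", "100"),
  stringsAsFactors = FALSE
)

# "\\N" cannot be parsed as a number, so it becomes NA.
pg    <- suppressWarnings(as.numeric(toy$percent_grade))
is_hw <- toy$resource_display_name == "homework"

# For every row, the lowest homework grade of that row's user
# (NA when the user has no graded homework).
hw_min <- ave(
  ifelse(is_hw, pg, NA_real_),
  toy$anon_screen_name,
  FUN = function(x) if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
)

# A row is dropped only if it is a graded homework row at the user's minimum.
drop <- is_hw & !is.na(pg) & pg == hw_min
kept <- toy[!drop, ]

print(kept)
```

Because `drop` is built with `&` and an explicit `!is.na(pg)`, it never contains NA, so the indexing cannot silently discard rows with missing grades.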

Answer 2 (score: 0)

If you only want to drop a single record, rather than every record with the lowest grade, you can do the following:

test %>%
    mutate(percent_grade = as.numeric(percent_grade)) %>%
    group_by(anon_screen_name) %>%
    mutate(lowest_grade = 1 * ((percent_grade == min(percent_grade, na.rm=TRUE)) & (resource_display_name == 'homework'))) %>%
    arrange(lowest_grade) %>%
    filter(row_number() != n()) %>%
    ungroup()
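
One thing to watch in the code above: `filter(row_number() != n())` removes the last row of every group, so a user with no homework rows also loses a row, and the minimum is taken over all of the user's resources rather than only homework. A possible alternative, sketched here on made-up data, ranks only the homework grades within each user and drops the single row ranked first. The column names `hw_grade` and `hw_rank` are illustrative assumptions.

```r
library(dplyr)

# Toy data (illustrative only): u1 has two tied lowest homework grades,
# u3 has no homework at all.
toy <- data.frame(
  anon_screen_name      = c("u1", "u1", "u1", "u2", "u2", "u3"),
  resource_display_name = c("homework", "homework", "Quiz",
                            "homework", "homework", "Quiz"),
  percent_grade         = c(0, 0, 100, 50, 80, 100),
  stringsAsFactors = FALSE
)

result <- toy %>%
  group_by(anon_screen_name) %>%
  mutate(
    # Grade for homework rows, NA for everything else.
    hw_grade = ifelse(resource_display_name == "homework", percent_grade, NA_real_),
    # ties.method = "first" gives tied rows distinct ranks, so exactly one row gets rank 1;
    # na.last = "keep" leaves non-homework rows with rank NA.
    hw_rank = rank(hw_grade, ties.method = "first", na.last = "keep")
  ) %>%
  # Drop only the homework row ranked lowest; unranked rows stay.
  filter(is.na(hw_rank) | hw_rank != 1) %>%
  ungroup() %>%
  select(-hw_grade, -hw_rank)

print(result)
```

With this approach a user who has no homework rows keeps all of their rows, and when several homework rows tie for the lowest grade only the first of them is removed.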