help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))
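Using the help dataset, I want to keep only the rows whose id and grade appear more than once and whose pass is "yes".
The resulting dataset should look like this: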
id grade pass
5 a yes
7 b yes
42 d yes
46 e yes
46 e yes
I tried using:
help %>% group_by(id, grade, pass) %>% filter(pass == "yes" & pass == "no")
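But that does not work: no row can have pass equal to both "yes" and "no" at the same time, so it removes everything and returns an empty data frame.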
Answer 0 (score: 1)
Using base R, a solution could be:
help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))
# Keep the rows whose id and grade occur more than once. The trick is to flag
# duplicates scanning from the start and then again scanning from the end.
help2 <- help[duplicated(help[, 1:2]) | duplicated(help[, 1:2], fromLast = TRUE), ]
# Filter for the pass
help2[help2$pass == "yes",]
# id grade pass
#1 5 a yes
#3 7 b yes
#7 42 d yes
#9 46 e yes
#10 46 e yes
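To see why both scan directions are needed, here is a minimal sketch on a toy vector (x and the intermediate names are made up for illustration and are not part of the original data). A single call to duplicated() only flags the later repeats, so the first member of each repeated group would be lost without the fromLast = TRUE pass.

```r
# Hypothetical toy vector, used only to illustrate the idea
x <- c("a", "b", "a", "c", "b", "d")

# Scanning from the start flags every repeat except the first occurrence
from_start <- duplicated(x)

# Scanning from the end flags every repeat except the last occurrence
from_end <- duplicated(x, fromLast = TRUE)

# Combining both flags every member of a value that occurs more than once
in_repeated_group <- from_start | from_end

x[in_repeated_group]
```

The same logic applies row-wise when duplicated() is given the id and grade columns of a data frame.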
Answer 1 (score: 1)
We can group_by id and grade, then filter for groups that have more than one row and where pass is "yes".
library(dplyr)
help %>%
group_by(id, grade) %>%
filter(n() > 1, pass %in% "yes") %>%
ungroup()
# # A tibble: 5 x 3
# id grade pass
# <dbl> <fct> <fct>
# 1 5.00 a yes
# 2 7.00 b yes
# 3 42.0 d yes
# 4 46.0 e yes
# 5 46.0 e yes
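To make the n() > 1 condition easier to inspect, the group size can be written to a column before filtering. This is only a sketch: n_rows, sized and result are illustrative names that do not appear in the original answer, and it assumes dplyr is installed.

```r
library(dplyr)

help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
                   grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
                   pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))

# Attach the size of each (id, grade) group to every row
sized <- help %>%
  group_by(id, grade) %>%
  mutate(n_rows = n()) %>%   # n_rows is a hypothetical column name
  ungroup()

# Keep rows from groups with more than one row and with pass == "yes"
result <- sized %>%
  filter(n_rows > 1, pass == "yes")

result
```

Rows for id 50 and 51 get n_rows equal to 1 and are dropped by the first condition, while the id 18 rows are dropped by the second.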
Answer 2 (score: 1)
subset(help,!duplicated(help)&pass=="yes")
id grade pass
1 5 a yes
3 7 b yes
7 42 d yes
9 46 e yes
11 50 w yes
Answer 3 (score: 0)
First I load the data:
og_help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))
Then I return the set of unique rows:
help <- unique(og_help)
Then I subset to only those rows where the pass variable is "yes".
help <- help[ which(help$pass == "yes"), ]
This outputs the following:
id grade pass
1 5 a yes
3 7 b yes
7 42 d yes
9 46 e yes
11 50 w yes