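I have a data frame in R, and I want to delete certain rows from it when they match some conditions. How can I do that?
I tried using dplyr and ifelse, but my code does not give the correct answer: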
check8 <- distinct(df5,prod,.keep_all = TRUE)
This does not work! It returns the whole dataset.
Input:
check1 <- data.frame(ID = c(1,1,2,2,2,3,4),
prod = c("R","T","R","T",NA,"T","R"),
bad = c(0,0,0,1,0,1,0))
# ID prod bad
# 1 1 R 0
# 2 1 T 0
# 3 2 R 0
# 4 2 T 1
# 5 2 <NA> 0
# 6 3 T 1
# 7 4 R 0
Expected output:
data.frame(ID = c(1,2,3,4),
prod = c("R","R","T","R"),
bad = c(0,0,1,0))
# ID prod bad
# 1 1 R 0
# 2 2 R 0
# 3 3 T 1
# 4 4 R 0
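I want the output to work like this: for an ID that has several prod values (including NA), keep only the row with prod R; but if an ID has only one prod, keep that row whatever its prod value is.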
Answer 0: (score: 2)
We can use dplyr's filter to select the rows where prod == "R", or, if there is only one row in the group, select that row.
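A minimal sketch of the filter-based approach described in this answer. It assumes the rows are grouped by ID and that dplyr's `n()` is used to detect single-row groups; the name `result` is only illustrative, and this is one possible reading of the description rather than the author's exact code.

```r
library(dplyr)

# Same input as in the question
check1 <- data.frame(ID = c(1, 1, 2, 2, 2, 3, 4),
                     prod = c("R", "T", "R", "T", NA, "T", "R"),
                     bad = c(0, 0, 0, 1, 0, 1, 0))

result <- check1 %>%
  group_by(ID) %>%
  # Keep rows whose prod is "R", or the only row of a single-row group.
  # filter() drops rows where the condition evaluates to NA, so the
  # NA row of ID 2 is removed as well.
  filter(prod == "R" | n() == 1) %>%
  ungroup()

print(result)
```

Note that under this condition an ID with several rows and no "R" among them would lose all of its rows, so it may be worth checking whether such IDs can occur in the real data.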
Answer 1: (score: 2)
Here is an approach using anti_join:
library(dplyr)
check1 <- data.frame(ID = c(1,1,2,2,2,3,4), prod = c("R","T","R","T",NA,"T","R"), bad = c(0,0,0,1,0,1,0))
# First part: select all the rows that have 'R' as prod
p1 <- check1 %>%
  group_by(ID) %>%
  filter(prod == 'R')
# Second part: use anti_join to get all the rows from check1 whose ID
# has no match in p1
p2 <- anti_join(check1, p1, by = 'ID')
# Combine both parts and sort by ID
solution <- bind_rows(
  p1,
  p2
) %>%
  arrange(ID)