I have a data frame that looks like this:
summary(imputedWork)
              everwrk          age_p
 1 Yes            :27918   Min.   :18.00
 2 No             : 5034   1st Qu.:33.00
 7 Refused        :   45   Median :47.00
 8 Not ascertained:    0   Mean   :48.11
 9 Don't know     :   17   3rd Qu.:62.00
                           Max.   :85.00
                           r_maritl
 1 Married - spouse in household:13943
 7 Never married                : 7763
 5 Divorced                     : 4511
 4 Widowed                      : 3069
 8 Living with partner          : 2002
 6 Separated                    : 1121
 (Other)                        :  605
I want to remove the "Refused", "Don't know" and "Not ascertained" values from everwrk, and the "(Other)" values from r_maritl.
Answer 0 (score: 1)
This drops every row that matches a value you don't want:
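# everwrk values to drop; the strings must match the factor levels exactly (note "Don't know")
A=c("Refused","Don't know", "Not ascertained")
# r_maritl values to keep
B=c("Married - spouse in household",
"Never married","Divorced","Widowed","Living with partner","Separated")
imputedWork[!imputedWork$everwrk %in% A & imputedWork$r_maritl %in% B,]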
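A small self-contained sketch of the same idea on made-up data (the toy data frame `df` and its values are assumptions, not the asker's data). It prints the actual factor levels first, because the strings passed to `%in%` have to match them exactly, including case and any numeric code prefix such as "7 Refused" if the levels happen to be stored that way.

```r
# Toy data standing in for imputedWork (values are made up for illustration)
df <- data.frame(
  everwrk  = factor(c("Yes", "No", "Refused", "Don't know", "Yes")),
  r_maritl = factor(c("Divorced", "Widowed", "Divorced", "Separated", "Unknown"))
)

# Inspect the exact level strings before building the match vectors
levels(df$everwrk)

drop_everwrk  <- c("Refused", "Don't know", "Not ascertained")
keep_r_maritl <- c("Divorced", "Widowed", "Separated")

# Keep rows whose everwrk is not in the drop list and whose r_maritl is in the keep list
cleaned <- df[!df$everwrk %in% drop_everwrk & df$r_maritl %in% keep_r_maritl, ]
nrow(cleaned)
```

Assign the result back (for example `imputedWork <- imputedWork[...]`) if the filtered rows should replace the original data frame.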
Answer 1 (score: 0)
A dplyr solution:
library(dplyr)

imputedWork <- imputedWork %>%
  filter(
    (everwrk == "Yes" | everwrk == "No") & r_maritl != "(Other)"
  )
If everwrk and r_maritl are factors, you will also want to drop the now-unused levels:
imputedWork <- imputedWork %>%
  filter(
    (everwrk == "Yes" | everwrk == "No") & r_maritl != "(Other)"
  ) %>%
  # overwrite the existing columns so the unused levels are actually dropped
  mutate(everwrk = droplevels(everwrk),
         r_maritl = droplevels(r_maritl))
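One caveat worth checking: in `summary()` output for a factor, "(Other)" is the label `summary()` uses to lump together the levels that did not fit in the display. It is usually not a level in the data, in which case `r_maritl != "(Other)"` removes nothing. A hedged base R sketch on made-up data that instead keeps only the levels `summary()` listed individually (the toy factor `f` and the `maxsum = 7` cutoff are assumptions chosen to mirror the six levels plus "(Other)" shown in the question):

```r
# Toy factor with 8 levels (made up); with maxsum = 7, summary() shows
# the 6 most frequent levels plus a lumped "(Other)" entry
f <- factor(rep(letters[1:8], times = 8:1))
df <- data.frame(f = f)

shown <- summary(df$f, maxsum = 7)
names(shown)                  # individually listed levels, then "(Other)"
"(Other)" %in% levels(df$f)   # check whether "(Other)" is a real level

# Keep only the rows whose level was listed individually, then drop unused levels
keep <- setdiff(names(shown), "(Other)")
cleaned <- droplevels(df[df$f %in% keep, , drop = FALSE])
levels(cleaned$f)
```

Running `levels(imputedWork$r_maritl)` on the real data shows which levels are hidden behind "(Other)", so they can be named explicitly in the filter if preferred.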