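I want to remove the rows of a data frame where var1 or var2 contains a digit.
I have tried several regular expressions, but none of them seems to work.
Here is a simplified example that only looks for digits in var1.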
var1<-c("000","1","1","1","1","000256","1","shall","1","1","1","1","the","1",
"001","1","1","1","1","one")
var2<-c("people","0","00","000","1","2","3","begin","4","5","6","7","8","a",
"and","billion","hour","in","is","million")
val<-c(1639,1703,655,3542,3273,9658,2562,1027,3340,2236,971,783,1057,673,1658,
1367,843,1921,459,2589)
df<-data.frame(var1,var2,val)
df<-subset(df,df$var1!="[0-9]*")
What am I doing wrong?
Answer 0 (score: 3)
Try this:
subset(df, !grepl("[0-9]", var1) & !grepl("[0-9]", var2))
# var1 var2 val
# 8 shall begin 1027
# 20 one million 2589
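
As a possible explanation of why the attempt in the question removes nothing: `!=` compares each value with the literal string "[0-9]*" rather than treating it as a pattern, and even when passed to grepl() that pattern would match every string, since `*` also allows zero digits. The sketch below rebuilds only the var1 column from the question; the names `literal_cmp` and `has_digit` are made up for illustration.

```r
# var1 values copied from the question.
var1 <- c("000","1","1","1","1","000256","1","shall","1","1","1","1","the","1",
          "001","1","1","1","1","one")

# `!=` is a plain string comparison: no element equals the literal text
# "[0-9]*", so every comparison is TRUE and subset() keeps every row.
literal_cmp <- var1 != "[0-9]*"
all(literal_cmp)

# Used as a regex, "[0-9]*" means "zero or more digits", which matches
# any string at all, so it cannot tell the rows apart either.
all(grepl("[0-9]*", var1))

# "[0-9]" without the star requires at least one digit somewhere.
has_digit <- grepl("[0-9]", var1)
var1[!has_digit]
```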
Answer 1 (score: 2)
Same as lukeA's answer, but using filter() from dplyr:
library(dplyr)
df %>% filter(!grepl("[[:digit:]]", var1) & !grepl("[[:digit:]]", var2))
Or with slice():
df %>% slice(which(!grepl("[[:digit:]]", var1) & !grepl("[[:digit:]]", var2)))
Or, as suggested by @akrun:
df %>% slice(intersect(grep("[^0-9]", var1), grep("[^0-9]", var2)))
which gives:
# var1 var2 val
#1 shall begin 1027
#2 one million 2589
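
One caveat on the `[^0-9]` variant: it keeps any value that contains at least one non-digit character, which is not quite the same as "contains no digit". On the data in the question the two agree, because every value is either all digits or all letters, but they would differ on a mixed value. A small sketch, using a made-up vector `x` that is not part of the question's data:

```r
# Hypothetical values for illustration only.
x <- c("abc", "123", "a1")

# "[0-9]" asks: does the string contain a digit anywhere?
# Negating it keeps only the strings with no digit at all.
keep_no_digit <- !grepl("[0-9]", x)

# "[^0-9]" asks: does the string contain any non-digit character?
# The mixed value "a1" passes this check even though it has a digit.
keep_has_nondigit <- grepl("[^0-9]", x)

x[keep_no_digit]
x[keep_has_nondigit]
```

If mixed values can occur and should be dropped, the negated `grepl("[0-9]", ...)` form is probably the safer choice.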