How to use str_which to select rows containing strings from a vector

Time: 2019-01-18 11:41:14

Tags: r stringr

I have a table like this:

library(stringr)

name    <- c("Goku","Vegeta","Jiren","Gohan","Piccolo","Kurinin","Trunks","Buu","Frieza","Cell","Muten","Gotens")
surname <- c("San","San","San","San","San","San","San","Majin","Evil","San","Roshi","San")
email   <- c("goku@gmail.com","vegeta@gmail.com","jiren@patrol.ch","gohan@gmail.com","piccolo@gmail.com","kurinin@gmail.com","Trunks@gmail.com","buu@babidi.com","frieza@rampage.usa","cell@rampage.usa","muten@gmail.com","gotens@gmail.com")

table <- data.frame(name, surname, email, stringsAsFactors = FALSE)

I have a vector of different email address endings. I want to find all rows whose email address uses one of these endings.

searchvector = c("@patrol.ch", "@babidi.com", "@rampage.usa")
searchvector = as.character(searchvector)

I tried two ways of finding the rows that match searchvector:

A. Using str_detect:

table[str_detect(table$email, "@patrol.ch|@babidi.com|@rampage.usa"), ]

This gives me the correct result:

     name surname              email
3   Jiren     San    jiren@patrol.ch
8     Buu   Majin     buu@babidi.com
9  Frieza    Evil frieza@rampage.usa
10   Cell     San   cell@rampage.usa

B. But when using str_which, I always get only two rows:

table[str_which(table$email, searchvector), ]
table[str_which(table$email, c("@patrol.ch", "@babidi.com", "@rampage.usa")), ]

In both cases I get the following result:

    name surname              email
8    Buu   Majin     buu@babidi.com
9 Frieza    Evil frieza@rampage.usa

Why is that? And how can I use str_which to accomplish what I want?

1 answer:

Answer 0: (score: 1)

According to ?str_which, it is a wrapper function:

str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x).
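The equivalence quoted above can be checked directly for a single pattern string. This is a minimal sketch, assuming the stringr package is installed; the shortened `email` vector and the variable names are illustrative only and not taken from the question.

```r
library(stringr)

# Illustrative three-element sample (not the full table from the question).
email   <- c("jiren@patrol.ch", "gohan@gmail.com", "buu@babidi.com")
pattern <- "@patrol.ch|@babidi.com"

# Three ways of asking "which positions match the pattern?"
via_str_which  <- str_which(email, pattern)
via_str_detect <- which(str_detect(email, pattern))
via_grep       <- grep(pattern, email)

print(via_str_which)
print(via_str_detect)
print(via_grep)
```

All three calls should report the same positions, because the pattern here is one string rather than a vector.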

To get the same output, pattern needs to be a single string. It can be created with paste, specifying | as the collapse argument:

table[str_which(table$email, paste(searchvector, collapse="|")), ]
#      name surname              email
# 3   Jiren     San    jiren@patrol.ch
# 8     Buu   Majin     buu@babidi.com
# 9  Frieza    Evil frieza@rampage.usa
# 10   Cell     San   cell@rampage.usa

This is the same pattern that was written out by hand for str_detect in the OP's post.
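One caveat with collapsing into a regular expression is that the dots in the endings act as regex wildcards. That is harmless for this data, but a literal comparison may be preferable in general. The following is a rough sketch of one alternative, assuming stringr is installed; `hits` and `rows` are names introduced here for illustration.

```r
library(stringr)

email <- c("goku@gmail.com", "vegeta@gmail.com", "jiren@patrol.ch",
           "gohan@gmail.com", "piccolo@gmail.com", "kurinin@gmail.com",
           "Trunks@gmail.com", "buu@babidi.com", "frieza@rampage.usa",
           "cell@rampage.usa", "muten@gmail.com", "gotens@gmail.com")
searchvector <- c("@patrol.ch", "@babidi.com", "@rampage.usa")

# Test every email against each ending as a literal string
# (fixed() switches off regex interpretation).
# The result is a logical matrix: one row per email, one column per ending.
hits <- sapply(searchvector, function(p) str_detect(email, fixed(p)))

# A row qualifies if at least one ending was found in it.
rows <- which(rowSums(hits) > 0)
print(rows)
```

With the data frame from the question, `table[rows, ]` would then select the matching rows.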

If we use the vector as the pattern in str_detect:

table[str_detect(table$email, searchvector),]
#     name surname              email
# 8    Buu   Majin     buu@babidi.com
# 9 Frieza    Evil frieza@rampage.usa

it returns the same output as str_which does with the OP's code.

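Regarding the vectorization of str_which: the lengths of 'email' (12) and 'searchvector' (3) differ here, so the shorter pattern vector is recycled. Each email is compared only with the pattern at its own position in the recycled sequence, not with all three endings, which is why rows 3 and 10 are missed.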
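The recycling can be made visible by pairing each email with the one pattern it is actually compared against. This is a minimal sketch, assuming stringr is installed; `recycled` and `paired` are illustrative names. It recycles explicitly with rep_len because, as far as I know, stringr 1.5.0 and later apply stricter recycling rules and raise an error when the lengths are 12 and 3 instead of recycling silently.

```r
library(stringr)

email <- c("goku@gmail.com", "vegeta@gmail.com", "jiren@patrol.ch",
           "gohan@gmail.com", "piccolo@gmail.com", "kurinin@gmail.com",
           "Trunks@gmail.com", "buu@babidi.com", "frieza@rampage.usa",
           "cell@rampage.usa", "muten@gmail.com", "gotens@gmail.com")
searchvector <- c("@patrol.ch", "@babidi.com", "@rampage.usa")

# Repeat the 3 patterns until they line up with the 12 emails,
# mimicking how the shorter vector was recycled in the question.
recycled <- rep_len(searchvector, length(email))

# One row per email, showing the single pattern it was tested against.
paired <- data.frame(email,
                     pattern = recycled,
                     match = str_detect(email, recycled),
                     stringsAsFactors = FALSE)

# Rows 3 and 10 are paired with the "wrong" ending, so they do not match.
print(paired[c(3, 8, 9, 10), ])

matched <- which(paired$match)
print(matched)
```

Only the emails that happen to sit opposite their own ending are reported, which reproduces the two-row result from the question.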