Extract the 2 terms before a specific character

Time: 2016-10-16 02:38:45

Tags: r stringr stringi

I want to extract the two words that precede a Twitter @handle:
x <- c("this is a @handle", "My name is @handle", "this string has @more than one @handle")

The following only extracts all of the text before the last @handle; I need the two preceding words for every @handle:

(ext <- stringr::str_extract_all(x, "^.*@"))
[[1]]
[1] "this is a @"

[[2]]
[1] "My name is @"

[[3]]
[1] "this string has @more than one @"

1 answer:

Answer 0 (score: 3)

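You can use the quantifier {2} to specify how many words to extract before the character @. Here a word is matched as one or more word characters (\\w+) followed by the separator that ends it, which in your case is whitespace (\\s+). The lookahead (?=@) requires an @ immediately after the two words without including it in the match. Because each match ends with the trailing whitespace, we can use the trimws function to strip the unnecessary leading and trailing spaces: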

library(stringr)
lapply(str_extract_all(x, "(\\w+\\s+){2}(?=@)"), trimws)

#[[1]]
#[1] "is a"

#[[2]]
#[1] "name is"

#[[3]]
#[1] "string has" "than one"
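
If the number of words needs to vary, the same pattern can be built from a parameter. The following is only a minimal sketch: words_before, n and marker are hypothetical names introduced here for illustration, not part of stringr, and it assumes marker contains no regex metacharacters (true for "@").

```r
library(stringr)

# Hypothetical helper (not from the original answer): return the `n` words
# that immediately precede each occurrence of `marker` in every string.
# Assumption: `marker` has no special meaning in a regular expression.
words_before <- function(x, n = 2, marker = "@") {
  # Builds e.g. "(\\w+\\s+){2}(?=@)"; the lookahead keeps the marker
  # itself out of the match.
  pattern <- sprintf("(\\w+\\s+){%d}(?=%s)", n, marker)
  # Each match ends with whitespace, so trim it off.
  lapply(str_extract_all(x, pattern), trimws)
}

x <- c("this is a @handle", "My name is @handle", "this string has @more than one @handle")

words_before(x)         # same result as the answer above
words_before(x, n = 1)  # only the single word before each @handle
```

With n = 1 this should give "a", "is", and c("has", "one") for the three strings. Strings without any match come back as character(0).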