I am trying to extract all words that contain two adjacent vowels from the following string.
x <- "The team sat next to each other all year and still failed."
The expected result is "team", "each", "year", "failed".
So far I have tried using [aeiou][aeiou] with regmatches, but it only returns part of each word.
Thanks.
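Answer 0 (score: 5)
You can place \w* before and after the character class to match "zero or more" word characters, so that each match extends to the whole word.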
x <- "The team sat next to each other all year and still failed."
regmatches(x, gregexpr('\\w*[aeiou]{2}\\w*', x))[[1]]
# [1] "team" "each" "year" "failed"
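To illustrate why the bare character class returns only fragments, here is a minimal sketch comparing the two patterns on the same string. The variable names `bare` and `padded` are illustrative and do not appear in the original post.

```r
x <- "The team sat next to each other all year and still failed."

# The bare pattern matches only the two adjacent vowels themselves.
bare <- regmatches(x, gregexpr("[aeiou]{2}", x))[[1]]

# Adding \\w* on both sides extends each match to the surrounding word characters.
padded <- regmatches(x, gregexpr("\\w*[aeiou]{2}\\w*", x))[[1]]

print(bare)    # [1] "ea" "ea" "ea" "ai"
print(padded)  # [1] "team" "each" "year" "failed"
```

Because \w does not match punctuation, the trailing period after "failed" is left out of the match.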
Answer 1 (score: 4)
words <- unlist(strsplit(x, " "))
words[grepl("[aeiou]{2}", words)]
# [1] "team" "each" "year" "failed."
If you want to get rid of the punctuation, perhaps:
words <- unlist(strsplit(x, "[[:punct:] ]"))
words[grepl("[aeiou]{2}", words)]
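As an alternative sketch, assuming it is acceptable to strip punctuation from the whole string before splitting, the same result can be reached with gsub. The names `no_punct`, `words_clean`, and `result` are hypothetical and used only for illustration.

```r
x <- "The team sat next to each other all year and still failed."

# Remove punctuation first, then split on single spaces.
no_punct <- gsub("[[:punct:]]", "", x)
words_clean <- unlist(strsplit(no_punct, " "))

# Keep only the words that contain two adjacent vowels.
result <- words_clean[grepl("[aeiou]{2}", words_clean)]
print(result)  # [1] "team" "each" "year" "failed"
```

Note that this also removes punctuation inside words (for example an apostrophe), which may or may not be what you want.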
Answer 2 (score: 1)
Answer 3 (score: 1)
With stringr:
library(stringr)
xx <- str_split(x, " ")[[1]]
xx[str_detect(xx, "[aeiou]{2}")]
## [1] "team" "each" "year" "failed."
As @akrun pointed out, this can be simplified to:
str_extract_all(x, "\\w*[aeiou]{2}\\w*")[[1]]
## [1] "team" "each" "year" "failed"