Find strings of a given length that contain specific characters

Time: 2018-11-14 23:22:11

Tags: r string grepl

Sample data

a<-c("hour","four","ruoh", "six", "high", "our")

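I want to find all strings that contain o, u and h and are 4 characters long; the order of the characters does not matter.

I want to return "hour","four","ruoh". Here is my attempt:

grepl("o+u+r", a)
nchar(a)==4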

3 Answers:

Answer 0: (score: 2)

To match strings of length 4 that contain the characters h, o and u, use:

grepl("(?=^.{4}$)(?=.*h)(?=.*o)(?=.*u)", c("hour","four","ruoh", "six", "high", "our"), perl = TRUE)
[1]  TRUE FALSE  TRUE FALSE FALSE FALSE

  • (?=^.{4}$): the string is exactly 4 characters long.
  • (?=.*x): the character x appears somewhere in the string.
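The same lookahead pattern can be assembled for any set of characters. The following is a minimal sketch, not part of the original answer: the helper name build_pattern is hypothetical, and it assumes the characters are plain letters or digits that need no regex escaping.

```r
# Hypothetical helper: build a PCRE pattern that matches strings of exactly
# `len` characters containing every character in `chars`, in any order.
# Assumption: `chars` are plain letters/digits that need no regex escaping.
build_pattern <- function(chars, len) {
  length_check <- sprintf("(?=^.{%d}$)", as.integer(len))
  char_checks <- paste0("(?=.*", chars, ")", collapse = "")
  paste0(length_check, char_checks)
}

a <- c("hour", "four", "ruoh", "six", "high", "our")
pattern <- build_pattern(c("h", "o", "u"), 4)

# Lookaheads require perl = TRUE.
matches <- a[grepl(pattern, a, perl = TRUE)]
print(pattern)
print(matches)
```

Because each lookahead is checked independently from the start of the string, the order of the characters inside the word does not matter.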

Answer 1: (score: 1)

You can use strsplit and setdiff; I added an extra edge case to the sample data:

a<-c("hour","four","ruoh", "six", "high", "our","oouh")
a[nchar(a) == 4 &
  lengths(lapply(strsplit(a,""),function(x) setdiff(x, c("o","u","h")))) == 1]
# [1] "hour" "ruoh"

Or with grepl:

a[nchar(a) == 4 & !rowSums(sapply(c("o","u","h"), Negate(grepl), a))]
# [1] "hour" "ruoh" "oouh"

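sapply(c("o","u","h"), Negate(grepl), a) gives you a matrix showing which words do not contain each letter. rowSums then counts the missing letters per word, and negating that count with ! coerces it to logical, so it is TRUE only for words that are missing none of the letters.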
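To make the intermediate steps visible, here is a small sketch that breaks the one-liner apart. The variable names lacks, n_missing and keep are introduced only for illustration.

```r
a <- c("hour", "four", "ruoh", "six", "high", "our", "oouh")

# One column per letter; TRUE means the word does NOT contain that letter.
lacks <- sapply(c("o", "u", "h"), Negate(grepl), a)
rownames(lacks) <- a

# Number of required letters each word is missing.
n_missing <- rowSums(lacks)

# !n_missing coerces the counts to logical: 0 becomes TRUE, anything else FALSE.
keep <- nchar(a) == 4 & !n_missing
result <- a[keep]

print(lacks)
print(n_missing)
print(result)
```

Printing lacks shows, for example, that "four" and "our" are TRUE only in the h column, which is why they are dropped.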

Answer 2: (score: 1)

Using grepl with your edited approach (r instead of h):

a<-c("hour","four","ruoh", "six", "high", "our")

a[grepl(pattern="o", x=a) & grepl(pattern="u", x=a) & grepl(pattern="r", x=a) & nchar(a)==4]

Returns:

[1] "hour" "four" "ruoh"
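If the list of required letters grows, the chained grepl calls can be folded into one expression. This is a sketch of one possible generalization, not something from the original answer. The name letters_needed is hypothetical, and fixed = TRUE is used here so each letter is matched literally rather than as a regex.

```r
a <- c("hour", "four", "ruoh", "six", "high", "our")
letters_needed <- c("o", "u", "r")  # hypothetical name for the required letters

# One logical vector per letter: does each word contain it?
has_each <- lapply(letters_needed, grepl, x = a, fixed = TRUE)

# Combine the vectors with element-wise AND, then apply the length check.
has_all <- Reduce(`&`, has_each)
result <- a[has_all & nchar(a) == 4]
print(result)
```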