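How do I match a pattern that repeats consecutively more than n times?

Asked: 2016-05-22 23:30:20

Tags: regex r

I am trying to match the pattern "City, State" (e.g. "Austin, TX") in this sample vector: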

> s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX", 
"Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

> grepl('[[:alnum:]], [[:alnum:]$]', s)
[1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

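However, there are two cases where I want FALSE to be returned:

- when there is more than one comma (e.g. "Austin, TX, USA")

- when another punctuation mark appears before the comma (e.g. "Ft. Worth, TX")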

2 answers:

Answer 0: (score: 3)

You can use the following regex pattern:

^[a-z ]+, [a-z]+$


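Regex explanation:

^ and $ anchor the match to the start and end of the string, so the whole value has to fit the pattern
[a-z ]+ matches one or more letters or spaces (the city); a period or comma in this part makes the match fail
", " matches the single comma-and-space separator
[a-z]+ matches one or more letters (the state), so a second comma cannot follow

In R, where subject is the character vector to test (s in the question):

grepl("^[a-z ]+, [a-z]+$", subject, perl = TRUE, ignore.case = TRUE)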
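
As a minimal sketch, the grepl() call above can be checked against the sample vector from the question. The variable names city_state and matches are introduced here for illustration only and are not part of the original answer.

```r
# Sample vector copied from the question
s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
       "Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

# Anchored pattern: letters/spaces, exactly one ", " separator, then letters only
city_state <- "^[a-z ]+, [a-z]+$"

# ignore.case = TRUE lets [a-z] match upper-case letters as well
matches <- grepl(city_state, s, perl = TRUE, ignore.case = TRUE)
print(matches)
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE
```

Both unwanted cases now return FALSE: the period in "Ft. Worth" is outside [a-z ], and the trailing ", USA" cannot be consumed by [a-z]+ before the end anchor.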

Answer 1: (score: 1)

Here is the regex: ^([^\.,])+,\s([^\.,])+$

^ assert position at start of the string
1st Capturing group ([^\.,])+
    Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
    Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
    [^\.,] match a single character not present in the list below
        \. matches the character . literally
        , the literal character ,
, matches the character , literally
\s match any white space character [\r\n\t\f ]
2nd Capturing group ([^\.,])+
    Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
    Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
    [^\.,] match a single character not present in the list below
        \. matches the character . literally
        , the literal character ,
$ assert position at end of the string
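
For use in R, the backslashes have to be doubled inside the string literal. The sketch below assumes perl = TRUE (so the escapes are interpreted by PCRE) and also tries a shorter variant without capturing groups, which is enough when only a TRUE/FALSE result is needed. The names original and simplified are illustrative, not from the answer.

```r
# Sample vector copied from the question
s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
       "Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

# The regex from this answer, with backslashes doubled for an R string
original <- "^([^\\.,])+,\\s([^\\.,])+$"

# Same idea without capturing groups; "." needs no escape inside [...]
simplified <- "^[^.,]+,\\s[^.,]+$"

res_original   <- grepl(original, s, perl = TRUE)
res_simplified <- grepl(simplified, s, perl = TRUE)

print(res_original)
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE
print(identical(res_original, res_simplified))
# [1] TRUE
```

Unlike the letters-only pattern in the other answer, [^.,] accepts any character other than a period or comma, so digits and other symbols in the city or state would also match.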