Question

I am trying to write a regular expression that will help me filter strings like

blah_blah_suffix

where suffix is any string that is 2 to 5 characters long. So I want to accept the strings

blah_blah_aa
blah_blah_abcd

but discard

blah_blah_a
blah_aaa
blah_blah_aaaaaaa

I am using grepl as follows:

samples[grepl("blah_blah_.{2,5}", samples)]

But it ignores the upper bound of the repetition (5). It discards the strings blah_blah_a and blah_aaa, but accepts the string blah_blah_aaaaaaa.

I know there is a way to filter the strings without a regular expression, but I want to understand how to use grepl correctly.

Answer 1

You need to anchor the expression to the start and end of the line:

^blah_blah_.{2,5}$

^ matches the start of the line and $ matches the end of the line. See a working example here: Regex101
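
To make the difference concrete, here is a minimal sketch in R using the sample strings from the question. The variable names (samples, unanchored, anchored, matched_part) are illustrative only. Without anchors, grepl only needs the pattern to match somewhere inside the string, so the first five characters of the long suffix are enough for a match.

```r
samples <- c("blah_blah_aa", "blah_blah_abcd",
             "blah_blah_a", "blah_aaa", "blah_blah_aaaaaaa")

# Unanchored: a match anywhere inside the string counts, so the long
# string is accepted because a prefix of it satisfies the pattern.
unanchored <- samples[grepl("blah_blah_.{2,5}", samples)]

# Show which part of the long string was actually matched.
long <- "blah_blah_aaaaaaa"
matched_part <- regmatches(long, regexpr("blah_blah_.{2,5}", long))

# Anchored: the whole string must be the prefix plus 2 to 5 characters.
anchored <- samples[grepl("^blah_blah_.{2,5}$", samples)]

print(unanchored)
print(matched_part)
print(anchored)
```

Here matched_part comes out as "blah_blah_aaaaa": the quantifier did respect the upper bound of 5, and the remaining characters were simply left unmatched.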

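If you want to anchor the expression to the start and end of the whole string (rather than of each line in a multi-line string), use \A and \Z instead of ^ and $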
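
The line versus string distinction only shows up when a single string contains line breaks and multi-line mode is on. The sketch below assumes PCRE syntax, so it passes perl = TRUE to grepl and enables multi-line mode with the inline (?m) flag. The test string multi is made up for illustration.

```r
# One string containing three lines; only the middle line has the target form.
multi <- "junk\nblah_blah_abc\njunk"

# (?m) turns on multi-line mode, so ^ and $ may match at the inner line breaks.
line_anchored <- grepl("(?m)^blah_blah_.{2,5}$", multi, perl = TRUE)

# \A and \Z refer to the whole string, which starts with "junk", so this fails.
string_anchored <- grepl("\\Ablah_blah_.{2,5}\\Z", multi, perl = TRUE)

print(line_anchored)    # TRUE
print(string_anchored)  # FALSE
```

For a character vector like the one in the question, grepl tests each element separately and multi-line mode is off unless requested, so ^ and $ should already behave as whole-string anchors there.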

Anchors Tutorial

Answer 2

/^[\w]+_[\w]+_[\w]{2,5}$/

DEMO

Options: dot matches newline; case insensitive; ^ and $ match at line breaks

Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match a single character that is a “word character” (letters, digits, and underscores) «[\w]+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “_” literally «_»
Match a single character that is a “word character” (letters, digits, and underscores) «[\w]+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “_” literally «_»
Match a single character that is a “word character” (letters, digits, and underscores) «[\w]{2,5}»
   Between 2 and 5 times, as many times as possible, giving back as needed (greedy) «{2,5}»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
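
As a rough check of this pattern in R, the sketch below drops the surrounding / delimiters (R takes the pattern as a plain string), doubles the backslashes, and uses perl = TRUE so that \w inside brackets is interpreted by PCRE. The extra test string foo_bar_xyz is an assumption added to show that this pattern is more general than a fixed blah_blah_ prefix.

```r
samples <- c("blah_blah_aa", "blah_blah_abcd",
             "blah_blah_a", "blah_aaa", "blah_blah_aaaaaaa")

# Three groups of word characters separated by "_"; the last group
# must be 2 to 5 characters long and must end the string.
pattern <- "^[\\w]+_[\\w]+_[\\w]{2,5}$"

kept <- samples[grepl(pattern, samples, perl = TRUE)]
print(kept)

# Unlike the fixed-prefix pattern, this also accepts other prefixes.
other_prefix <- grepl(pattern, "foo_bar_xyz", perl = TRUE)
print(other_prefix)
```

Because \w also matches the underscore, the first two groups can absorb extra underscores. If only the literal prefix blah_blah_ should be accepted, the anchored fixed-prefix pattern is the stricter choice.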

R regular expression repetition ignores the upper bound
