Question

For example, here is the regular expression:

([a]{2,3})

and here are the strings:

aaaa // 1 match "(aaa)a" but I want "(aa)(aa)"
aaaaa // 2 match "(aaa)(aa)"
aaaaaa // 2 match "(aaa)(aaa)"

However, if I change the regular expression to

([a]{2,3}?)

then the result is

aaaa // 2 match "(aa)(aa)"
aaaaa // 2 match "(aa)(aa)a" but I want "(aaa)(aa)"
aaaaaa // 3 match "(aa)(aa)(aa)" but I want "(aaa)(aaa)"
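The two result lists above can be reproduced with a minimal sketch. It assumes Python's `re` module, since the question does not name a regex engine, and `groups` is a hypothetical helper name used only for illustration.

```python
import re

# The three test strings from the question.
SAMPLES = ["aaaa", "aaaaa", "aaaaaa"]

GREEDY = re.compile(r"([a]{2,3})")   # grabs three 'a's whenever it can
LAZY = re.compile(r"([a]{2,3}?)")    # stops as soon as it has two 'a's


def groups(pattern, text):
    """Return the successive non-overlapping matches of pattern in text."""
    return pattern.findall(text)


for sample in SAMPLES:
    print(sample, "greedy:", groups(GREEDY, sample), "lazy:", groups(LAZY, sample))
```

Neither quantifier looks ahead at what is left of the run: the greedy form strands a single `a` in `aaaa`, and the lazy form never produces a group of three.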

My question is: is it possible to match as long a string as possible using as few groups as possible?
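One way to pin down the goal is to write the wanted group sizes without any regex. The sketch below is an interpretation inferred from the three examples, not something stated in the question: it assumes the whole run must be covered by groups of 2 or 3, with threes placed first. `fewest_groups` is a hypothetical name.

```python
def fewest_groups(n):
    """Sizes of the fewest groups of 2 or 3 that add up to exactly n."""
    if n < 0 or n == 1:
        raise ValueError("a run of this length cannot be split into 2s and 3s")
    threes, rest = divmod(n, 3)
    if rest == 0:
        return [3] * threes
    if rest == 2:
        return [3] * threes + [2]
    # rest == 1: swap one group of three for two groups of two (3 + 1 = 2 + 2).
    return [3] * (threes - 1) + [2, 2]


for n in (4, 5, 6):
    print(n, fewest_groups(n))
```

Under that reading, a run of length n >= 2 needs ceil(n / 3) groups, which gives 2 groups for each of the three sample strings.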

Answer 1

How about this?

(a{3}(?!a(?:[^a]|$))|a{2})

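This looks for either the character a three times (as long as it is not followed by a single a and then a different character or the end of the string) or the character a two times.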
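To check the pattern against the strings from the question, here is a minimal sketch, again assuming Python's `re` module (the answer does not name an engine). `split_run` is a hypothetical helper name.

```python
import re

# Pattern from this answer: take three 'a's unless that would leave exactly
# one 'a' stranded behind them, otherwise take two 'a's.
PATTERN = re.compile(r"(a{3}(?!a(?:[^a]|$))|a{2})")


def split_run(text):
    """Return the captured groups, in order, for every match in text."""
    return PATTERN.findall(text)


for sample in ["aaaa", "aaaaa", "aaaaaa"]:
    print(sample, split_run(sample))
# aaaa ['aa', 'aa']
# aaaaa ['aaa', 'aa']
# aaaaaa ['aaa', 'aaa']
```

The negative lookahead is what rejects `aaa` at the start of `aaaa`: taking three there would leave a lone `a` at the end, so the alternation falls back to `a{2}`.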

Breakdown:

(                   # Start of the capturing group.
    a{3}            # Matches the character 'a' exactly three times.
    (?!             # Start of a negative lookahead.
        a           # Matches the character 'a' literally.
        (?:         # Start of the non-capturing group.
            [^a]    # Matches any character except for 'a'.
            |       # Alternation (OR).
            $       # Asserts position at the end of the line/string.
        )           # End of the non-capturing group.
    )               # End of the negative lookahead.
    |               # Alternation (OR).
    a{2}            # Matches the character 'a' exactly two times.
)                   # End of the capturing group.

Note that if you don't need the capturing group, you can use the whole match instead by turning the capturing group into a non-capturing group:

(?:a{3}(?!a(?:[^a]|$))|a{2})

Answer 2

Try this regex:

^(?:(a{3})*|(a{2,3})*)$


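Explanation:

^ - asserts the start of the line
(?:(a{3})*|(a{2,3})*) - a non-capturing group containing 2 sub-patterns separated by the OR operator
- (a{3})* - the first sub-pattern tries to match 3 occurrences of a. The * at the end lets this sub-pattern repeat, so it matches 0, 3, 6, 9, ... occurrences of a before the end of the line
- | - OR
- (a{2,3})* - matches 2 to 3 occurrences of a, as many as possible. The * at the end repeats this 0 or more times before the end of the line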
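The explanation above can be tried out with a short sketch, assuming Python's `re` module (the answer does not name an engine). `is_splittable` is a hypothetical helper name.

```python
import re

# Pattern from this answer, anchored to the whole line.
WHOLE = re.compile(r"^(?:(a{3})*|(a{2,3})*)$")


def is_splittable(text):
    """True if the whole text is made of runs of 'a' of length 2 or 3."""
    return WHOLE.match(text) is not None


for sample in ["a", "aaaa", "aaaaa", "aaaaaa"]:
    print(sample, is_splittable(sample))

# In Python, a repeated capturing group keeps only its final iteration,
# so the individual pieces are not all available from the match object.
print(WHOLE.match("aaaa").group(2))  # 'aa'
```

In this engine the pattern therefore works as a check that the whole string can be split, rather than as a way to list every group. Because both alternatives end in `*`, the empty string also matches.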