Question

I'm trying to write a regular expression that matches at most 7 groups in total.

((X:){1,6})((:Y){1,6})

X:X:X:X:X::Y:Y             This should match
X:X:X:X:X:X::Y:Y           This should not match.

https://regex101.com/r/zxfAB7/16

Is there a way to do this? I need capture groups $1 and $3.
I'm using C++17 regex.

Answer 1

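If lookaheads are supported, you can use a negative lookahead to assert that there are not 8 repetitions of either X: or :Y.

To prevent an empty match, you can use a positive lookahead to check that there is at least 1 match.

Then use 2 capturing groups: the first repeats X: 0 or more times, and the second repeats :Y 0 or more times.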

^(?=(?:X:|:Y))(?!(?:(?:X:|:Y)){8})((?:X:)*)((?::Y)*)$

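^ Start of string
(?= Positive lookahead, assert that what is directly to the right is
- (?:X:|:Y) Match either X: or :Y
) Close positive lookahead
(?! Negative lookahead, assert that what is directly to the right is not 8 repetitions of X: or :Y
- (?:(?:X:|:Y)){8}
) Close negative lookahead
((?:X:)*) Capture group 1, match X: 0 or more times
((?::Y)*) Capture group 2, match :Y 0 or more times
$ End of string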

Regex demo
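
As a rough illustration of how this pattern could be used from C++17, here is a minimal sketch. The helper name match_runs is hypothetical and not part of the answer; it relies on std::regex defaulting to the ECMAScript grammar, which supports both kinds of lookahead.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not from the answer): returns the two captured runs
// when the input contains between 1 and 7 "X:" / ":Y" tokens in total.
std::optional<std::pair<std::string, std::string>>
match_runs(const std::string& input) {
    // Same pattern as in the answer. std::regex uses the ECMAScript grammar
    // by default, which supports (?=...) and (?!...) lookaheads.
    static const std::regex re{
        R"(^(?=(?:X:|:Y))(?!(?:(?:X:|:Y)){8})((?:X:)*)((?::Y)*)$)"};

    std::smatch m;
    if (!std::regex_match(input, m, re)) {
        return std::nullopt;  // empty input, or 8 or more tokens
    }
    // Group 1 is the run of "X:" tokens, group 2 the run of ":Y" tokens.
    return std::make_pair(m[1].str(), m[2].str());
}
```

Note that, as written, the pattern also accepts input that consists only of X: tokens or only of :Y tokens, because both groups use *. If each side must occur at least once, as in the {1,6} quantifiers of the question, replacing * with + in both groups should enforce that.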

Answer 2

As Ulrich said, regex alone may not be the right solution. I suggest the following:

Replace all X (occurring 1 to 6 times) with an empty string
Replace all Y (occurring 1 to 6 times) with an empty string
Use a regex to determine whether any X is still present
Use a regex to determine whether any Y is still present

If every X and Y occurs only 1 to 6 times, no X or Y will be left (return match); otherwise, return no match.
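
One possible reading of these steps in C++17 is sketched below. The helper name runs_within_six is hypothetical, and the use of format_first_only is an assumption on my part: without it, std::regex_replace would also remove whatever repetitions follow the first six, so nothing would ever be left over.

```cpp
#include <regex>
#include <string>

// Hypothetical helper illustrating the replace-then-check idea.
// Returns true when neither run has more than 6 repetitions.
bool runs_within_six(const std::string& input) {
    static const std::regex x_run{R"((?:X:){1,6})"};
    static const std::regex y_run{R"((?::Y){1,6})"};
    static const std::regex leftover{"[XY]"};

    // Assumption: replace only the first run found, so that a 7th
    // repetition survives the replacement and can be detected.
    constexpr auto first_only = std::regex_constants::format_first_only;

    // Steps 1 and 2: strip up to six "X:" and up to six ":Y".
    std::string rest = std::regex_replace(input, x_run, "", first_only);
    rest = std::regex_replace(rest, y_run, "", first_only);

    // Steps 3 and 4: any X or Y still present means a run was too long.
    return !std::regex_search(rest, leftover);
}
```

Under this reading, the check bounds each run separately. It does not limit the combined number of groups, so X:X:X:X:X:X::Y:Y (8 groups in total) is still reported as a match.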

Answer 3

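Although there is already an accepted answer, I'd like to show a very simple solution, tested with C++17, together with the complete runnable source code.

Since we are talking about at most 7 groups, we can simply list all the combinations and OR them together. That may be a lot of text and a complex DFA, but it should work.

Once a match is found, we define a vector, put all the data/groups into it, and show the desired result. It's really simple:

Please see: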

#include <iostream>
#include <string>
#include <iterator>
#include <vector>
#include <regex>

std::vector<std::string> test{
    "X::Y",
    "X:X::Y",
    "X:X::Y:Y",
    "X:X:X::Y:Y",
    "X::Y:Y:Y:Y:Y",
    "X:X:X:X:X::Y:Y",
    "X:X:X:X:X:X::Y:Y"
};

const std::regex re1{ R"((((X:){1,1}(:Y){1,6})|((X:){1,2}(:Y){1,5})|((X:){1,3}(:Y){1,4})|((X:){1,4}(:Y){1,3})|((X:){1,5}(:Y){1,2})|((X:){1,6}(:Y){1,1})))" };
const std::regex re2{ R"(((X:)|(:Y)))" };

int main() {
    std::smatch sm;
    // Go through all test strings
    for (const std::string& s : test) {
        // Look for a match
        if (std::regex_match(s, sm, re1)) {
            // Show success message
            std::cout << "Match found for  -->  " << s << "\n";
            // Get all data (groups) into a vector
            std::vector<std::string> data{ std::sregex_token_iterator(s.begin(), s.end(),re2,1),  std::sregex_token_iterator() };
            // Show desired groups
            if (data.size() >= 6) {
                std::cout << "Group 1: '" << data[0] << "'   Group 6: '" << data[5] << "'\n";
            }
        }
        else {
            std::cout << "**** NO match found for  -->  " << s << "\n";
        }
    }
    return 0;
}
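
If the limit is not fixed at 7, the alternation does not have to be typed by hand. The following is a minimal sketch of generating it; build_pattern and matches_up_to are hypothetical names, and non-capturing groups are used here, so this version only answers whether the input matches.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: builds the alternation for a combined limit of
// max_total groups. For max_total = 7 this yields six alternatives,
// (X:){1,1}(:Y){1,6} through (X:){1,6}(:Y){1,1}.
std::string build_pattern(int max_total) {
    std::string pattern;
    for (int x = 1; x < max_total; ++x) {
        if (!pattern.empty()) {
            pattern += '|';
        }
        pattern += "(?:X:){1," + std::to_string(x) + "}(?::Y){1," +
                   std::to_string(max_total - x) + "}";
    }
    return pattern;
}

// Hypothetical helper: true when the whole input matches one alternative.
bool matches_up_to(const std::string& input, int max_total) {
    if (max_total < 2) {
        return false;  // at least one "X:" and one ":Y" are required
    }
    // The regex is rebuilt on every call to keep the sketch short;
    // real code would probably cache it.
    const std::regex re{build_pattern(max_total)};
    return std::regex_match(input, re);
}
```

With a limit of 7, X:X:X:X:X::Y:Y should be accepted and X:X:X:X:X:X::Y:Y rejected, the same as with the handwritten expression.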

Matching at most n characters
