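How to get an unknown number of regex matches?

Time: 2014-12-11 09:17:20

Tags: c++ regex c++14

I am trying to find the positions of a few digits in a string. I can only get the last one, or a previously specified number of digits: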

#include <algorithm> // std::for_each
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("aaabbbccd123456eeffgg");
    std::smatch match;
    std::regex braced_regex("(\\w+)(\\d{2,})(\\w+)");
    std::regex plus_regex("(\\w+)(\\d+)(\\w+)");

    auto printer = [](auto& match) {
            std::ssub_match sub(match);
            std::string match_substring(sub.str());
            std::cout <<  match_substring << '\n';
    };

    std::regex_match(s, match, braced_regex);
    std::cout << "Number of braced matches: " << match.size() << '\n';  
    std::for_each(match.begin(), match.end(), printer);

    std::regex_match(s, match, plus_regex);
    std::cout << "Number of plus matches: " << match.size() << '\n';  
    std::for_each(match.begin(), match.end(), printer);
    return 0;
}

Result:

Number of braced matches: 4
aaabbbccd123456eeffgg
aaabbbccd1234
56
eeffgg
Number of plus matches: 4
aaabbbccd123456eeffgg
aaabbbccd12345
6
eeffgg

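How can I get the whole sequence of digits, i.e. 123456, from the given string?

3 answers:

Answer 0 (score: 2)

I think the problem is that digits count as word characters, so they are matched by \w. I would be tempted to use \D, which means "not a digit":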

#include <algorithm> // std::for_each
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("aaabbbccd123456eeffgg");
    std::smatch match;
    std::regex plus_regex("(\\D+)(\\d+)(\\D+)");

    auto printer = [](auto& match) {
            std::ssub_match sub(match);
            std::string match_substring(sub.str());
            std::cout <<  match_substring << '\n';
    };

    std::regex_match(s, match, plus_regex);
    std::cout << "Number of plus matches: " << match.size() << '\n';
    std::for_each(match.begin(), match.end(), printer);
    return 0;
}

Output:

Number of plus matches: 4
aaabbbccd123456eeffgg
aaabbbccd
123456
eeffgg

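Another possibility (depending on what you want) is to use std::regex_search(), which does not try to match the whole string but lets you match something in the middle of it: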

#include <algorithm> // std::for_each
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("aaabbbccd123456eeffgg");
    std::smatch match;
    std::regex braced_regex("\\d{2,}"); // just the numbers

    auto printer = [](auto& match) {
            std::ssub_match sub(match);
            std::string match_substring(sub.str());
            std::cout <<  match_substring << '\n';
    };

    std::regex_search(s, match, braced_regex); // NOTE: regex_search()!
    std::cout << "Number of braced matches: " << match.size() << '\n';
    std::for_each(match.begin(), match.end(), printer);
}

Output:

Number of braced matches: 1
123456

Answer 1 (score: 2)

([a-zA-Z]+)(\\d{2,})([a-zA-Z]+)

You can try this. \w is equivalent to [a-zA-Z0-9_], so the greedy \w+ matches as much as it can and leaves only two digits for \d{2,}.

(\\w+?)(\\d{2,})(\\w+)

Or make the first \w+ non-greedy. See live demo.

Answer 2 (score: 1)

In:

(\\w+)(\\d{2,})(\\w+)

\\w+ matches any word character [a-zA-Z0-9_], so it also matches 1234.

Change the first \\w to [a-zA-Z_] so that it cannot match digits, which gives you:

std::regex braced_regex("([a-zA-Z_]+)(\\d{2,})(\\w+)");