RegEx to find all matches of URLs in a text

Time: 2016-03-24 12:23:34

Tags: c++ regex url

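I want to extract all URLs from a string. I found the perfect RegEx in this thread.

Now I need help iterating over all matches. I also looked at this example (at the bottom), but I can't get it to work the way I want.

Basically, I want to iterate over all matches as in the second example, and I also want to access the submatches (5 & 8) as in the first example.

Currently I only get the first match. How can I get the rest?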

#include <iostream>
#include <regex>
#include <string>

int main() {
    unsigned counter = 0;
    std::string urls = "www.google.de/test.php&id=2#anker stackoverflow www.test.com please work example.com/test";
    // URI-splitting pattern from the linked thread, compiled with POSIX extended syntax
    std::regex word_regex(
            R"(^(([^:\/?#]+):)?(//([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)",
            std::regex::extended
    );
    auto words_begin = std::sregex_iterator(urls.begin(), urls.end(), word_regex);
    auto words_end = std::sregex_iterator();

    // Print every submatch of every match, followed by the full match
    for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
        std::smatch match = *i;
        std::string match_str = match.str();
        for (const auto& res : match) {
            std::cout << counter++ << ": " << res << std::endl;
        }
        std::cout << "  " << match_str << '\n';
    }
}

Output:

0: www.google.de/test.php&id=2#anker stackoverflow www.test.com please work example.com/test
1: 
2: 
3: 
4: 
5: www.google.de/test.php&id=2
6: 
7: 
8: #anker stackoverflow www.test.com please work example.com/test
9: anker stackoverflow www.test.com please work example.com/test
www.google.de/test.php&id=2#anker stackoverflow www.test.com please work example.com/test

0 answers:

No answers