Question

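How can I extract Test and Again from the string s in the code below? I am currently using regex_iterator, which does not seem to return the capture group from the regex; I get {{Test}} and {{Again}} in the output.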

#include <regex>
#include <iostream>

int main()
{
    const std::string s = "<abc>{{Test}}</abc><def>{{Again}}</def>";
    std::regex rgx("\\{\\{(\\w+)\\}\\}");
    std::smatch match;
    std::sregex_iterator next(s.begin(), s.end(), rgx);
    std::sregex_iterator end;
    while (next != end) {
      std::smatch match = *next;
      std::cout << match.str() << "\n";
      next++;
    } 
    return 0;
}

I also tried regex_search, but it does not handle multiple matches and only outputs Test:

#include <regex>
#include <iostream>

int main()
{
    const std::string s = "<abc>{{Test}}</abc><def>{{Again}}</def>";
    std::regex rgx("\\{\\{(\\w+)\\}\\}");
    std::smatch match;

    if (std::regex_search(s, match, rgx,std::regex_constants::match_any))
    {
        std::cout<<"Match size is "<<match.size()<<std::endl;
        for(auto elem:match)
        std::cout << "match: " << elem << '\n';
    }
}

Also, as a side note, why are two backslashes needed to escape { or }?

Answer 1

To access the contents of a capture group, you need to use .str(1):

std::cout << match.str(1) << std::endl;

See the C++ demo:

#include <iostream>
#include <regex>
#include <string>

int main()
{
    const std::string s = "<abc>{{Test}}</abc><def>{{Again}}</def>";
    // std::regex rgx("\\{\\{(\\w+)\\}\\}");
    // Better, use a raw string literal:
    std::regex rgx(R"(\{\{(\w+)\}\})");
    std::sregex_iterator next(s.begin(), s.end(), rgx);
    std::sregex_iterator end;
    while (next != end) {
        std::smatch match = *next;
        // str(1) is the first capture group; str() or str(0) is the whole match
        std::cout << match.str(1) << std::endl;
        ++next;
    }
    return 0;
}

Output:

Test
Again

Note that you do not need double backslashes to define regex escape sequences inside a raw string literal (here, R"(pattern_here)").
