`re.sub(pattern, functor, string)` for C++

Time: 2014-09-20 05:56:39

Tags: c++ regex string std

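Python's regular expressions have a useful feature: a function can determine the replacement. That is, `re.sub(pattern, functor, string)` passes each match result to the functor to obtain the replacement string. This is much more flexible than a format-string syntax that refers to sub-matches with `\1`, `\2`.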

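Now I want to achieve the same thing in C++, and I don't know how to do it. My first thought was to use `std::regex_replace`, but it has no overload that accepts a functor. Another thought was to use an iterator to split the text into tokens of type MATCH or NOT_MATCH, but it seems the standard regex iterators return only one type: they either skip all the non-matches or skip all the matches.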

Is there any way to do this? I would prefer to stay within the standard library.
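For reference, here is a minimal sketch of what `std::regex_replace` does offer out of the box: a fixed format string in which `$1` refers to the first capture group (the default ECMAScript format syntax). The helper name `bracket_numbers` is made up for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: wraps every run of digits in angle brackets.
// The replacement is a plain format string; "$1" expands to capture group 1.
std::string bracket_numbers(const std::string& text) {
    static const std::regex digits("(\\d+)");
    return std::regex_replace(text, digits, "<$1>");
}
```

Calling `bracket_numbers("a1b22c")` gives `a<1>b<22>c`. The format string can rearrange or decorate sub-matches, but it cannot compute a replacement from them, which is what the functor form of `re.sub` allows.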

2 answers:

Answer 0 (score: 1)

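You can use `.prefix()` on the match results to get the non-matching part of the string that precedes the match, and `.suffix()` to get the non-matching remainder of the string after it.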

Demo (adapted from here).
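Since the linked demo is not reproduced here, the following is a minimal sketch of how `.prefix()` and `.suffix()` could be combined with `std::sregex_iterator` to get a callback-driven replace. The function name `regex_sub` and its signature are assumptions made for this illustration, not part of the standard library.

```cpp
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper: replaces every match of `re` in `text` with the
// string returned by `fn`, which receives the full match results.
std::string regex_sub(const std::string& text, const std::regex& re,
                      const std::function<std::string(const std::smatch&)>& fn) {
    std::string out;
    auto last = text.cbegin();  // start of the text not yet copied to `out`
    for (std::sregex_iterator it(text.cbegin(), text.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        // prefix() is the non-matching text between the previous match and this one.
        out.append(m.prefix().first, m.prefix().second);
        out += fn(m);
        // suffix() starts right after this match; remember it for the tail.
        last = m.suffix().first;
    }
    out.append(last, text.cend());  // non-matching text after the final match
    return out;
}
```

For example, `regex_sub("a1b22c", std::regex("\\d+"), [](const std::smatch& m) { return std::to_string(std::stoi(m.str()) * 2); })` returns `a2b44c`. Passing the whole `std::smatch` to the callback keeps sub-matches available through `m[1]`, `m[2]`, and so on, much like the match object Python hands to the functor.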

Answer 1 (score: 0)

I wrote a blog post on this topic here: http://blog.brainstembreakfast.com/update/c++/2014/09/20/regex-replace-ext/

The function you are looking for is

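    #include <algorithm>    // std::for_each
    #include <functional>   // std::function
    #include <iterator>     // std::next
    #include <numeric>      // std::iota
    #include <regex>
    #include <sstream>
    #include <string>
    #include <type_traits>  // std::common_type
    #include <vector>

    // Replaces each non-empty sub-match with fmt(sub-match index, sub-match text).
    template <class Traits, class CharT, class STraits, class SAlloc>
    inline std::basic_string<CharT, STraits, SAlloc>
    regex_replace_ext(const std::basic_string<CharT, STraits, SAlloc>& s,
                      const std::basic_regex<CharT, Traits>& re,
                      // common_type puts fmt in a non-deduced context, so a lambda can be passed
                      const typename std::common_type<std::function<
                          std::basic_string<CharT, STraits, SAlloc>(
                              const unsigned,
                              const std::basic_string<CharT, STraits, SAlloc>&)>>::type& fmt,
                      std::regex_constants::match_flag_type flags =
                          std::regex_constants::match_default)
    {
        // -1 selects the text between matches; the other entries select sub-matches.
        std::vector<int> smatches{-1};
        if (re.mark_count() == 0) {
            smatches.push_back(0);  // no capture groups: use the whole match
        } else {
            smatches.resize(1 + re.mark_count());
            std::iota(std::next(smatches.begin()), smatches.end(), 1);  // -1, 1, 2, ...
        }

        const unsigned smatch_count = static_cast<unsigned>(smatches.size());
        unsigned count = 0;  // position of the current token within smatches

        std::regex_token_iterator<
            typename std::basic_string<CharT, STraits, SAlloc>::const_iterator>
            tbegin(s.begin(), s.end(), re, smatches, flags), tend;

        std::basic_stringstream<CharT, STraits, SAlloc> ret_val;
        std::for_each(tbegin, tend,
                      [&count, &smatch_count, &ret_val, &fmt](
                          const std::basic_string<CharT, STraits, SAlloc>& token) {
                          if (token.size() != 0) {
                              if (!count)
                                  ret_val << token;  // unmatched text is copied through
                              else
                                  ret_val << fmt(count, token);
                          }
                          count = (count + 1) % smatch_count;
                      });
        return ret_val.str();
    }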
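The function above relies on the order in which `std::regex_token_iterator` produces tokens when it is given a list of sub-match indices that starts with -1. A small sketch that makes this order visible, assuming a regex with exactly two capture groups (the helper name `tokens_of` is invented for this illustration):

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects the tokens produced for sub-match indices {-1, 1, 2}.
// For every match the iterator yields, in order: the unmatched text before the
// match (-1), capture group 1, then capture group 2. A non-empty remainder after
// the last match is yielded as one final token.
std::vector<std::string> tokens_of(const std::string& text, const std::regex& re) {
    const std::vector<int> submatches{-1, 1, 2};
    std::sregex_token_iterator first(text.begin(), text.end(), re, submatches), last;
    return std::vector<std::string>(first, last);
}
```

With the regex `(\{.*?\})|(\[.*?\])` and the input `ab{x}[y]cd`, this yields `"ab"`, `"{x}"`, `""`, `""`, `""`, `"[y]"`, `"cd"`. The empty strings are groups that did not participate in a match, or an empty gap between adjacent matches, which is why the function skips empty tokens while still advancing its counter.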

Usage:

    // Also requires #include <map>.
    const std::string bss("{Id_1} [Fill_0] {Id_2} [Fill_1] {Id_3} {Id_4} {Id_5}.");
    const std::regex re("(\\{.*?\\})|(\\[.*?\\])");
    using dictionary = std::map<std::string, std::string>;
    // One dictionary per capture group. The element type must be non-const:
    // std::vector<const T> is not valid with the default allocator.
    const std::vector<dictionary> dict
      {
        {
          {"{Id_1}", "This"},
          {"{Id_2}", "test"},
          {"{Id_3}", "my"},
          {"{Id_4}", "favorite"},
          {"{Id_5}", "hotdog"}
        },
        {
          {"[Fill_0]", "is a"},
          {"[Fill_1]", "of"}
        }
      };

    auto fmt1 = [&dict](const unsigned smatch, const std::string& s) -> std::string
      {
        const auto dict_smatch = smatch - 1;  // sub-match 1 maps to dict[0]
        if (dict_smatch >= dict.size())
          return s;  // more sub-matches than expected

        const auto it = dict[dict_smatch].find(s);
        return it != dict[dict_smatch].cend() ? it->second : s;
      };

    std::string modified_string = regex_replace_ext(bss, re, fmt1);
    // modified_string == "This is a test of my favorite hotdog."