Regex to capture multiple words

Time: 2017-09-28 01:17:57

Tags: regex boost

I would like to use captures or groups to extract the names of all the interfaces in the declaration below. Is this possible?

class AClass implements Interface1, Interface2, Interface3,Interface4

I have tried

implements(?:.*?,|[[:space:]]+)[[:word:]]+

and variations of it, but it does not seem to work. Is this possible?

Thanks in advance,

1 Answer:

Answer 0: (score: 0)

I would not use a regular expression for this.
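For context, here is one reason a single pattern struggles. In the ECMAScript grammar used by std::regex, a capture group inside a repetition keeps only its last match, so one pattern cannot hand back every interface name through its groups. If a regex-only solution is required, a two-pass approach is a possible workaround. The sketch below assumes std::regex rather than Boost.Regex, and `interfaces_via_regex` is a hypothetical helper name, not something from the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: two-pass extraction with std::regex (not Boost.Regex).
std::vector<std::string> interfaces_via_regex(std::string const& text) {
    std::vector<std::string> names;

    // Pass 1: isolate everything that follows the "implements" keyword.
    static std::regex const tail_re(R"(\bimplements\s+(.*))");
    std::smatch tail;
    if (!std::regex_search(text, tail, tail_re))
        return names; // no implements clause at all

    // Pass 2: pull each identifier-like token out of that tail.
    std::string const list = tail[1].str();
    static std::regex const name_re(R"([A-Za-z_]\w*)");
    for (std::sregex_iterator it(list.begin(), list.end(), name_re), end; it != end; ++it)
        names.push_back(it->str());

    return names;
}
```

Note that this sketch only collects tokens. It does not check that the names are actually separated by commas, which is part of why a small grammar is a better fit.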

Suppose you want to parse into this data structure:

struct ClassDecl {
    std::string name;
    std::vector<std::string> interfaces;
};

Using Boost Spirit, let's define a parsing rule for identifiers:

qi::rule<It, std::string()> ident = qi::alpha >> *(qi::alnum | qi::char_('_'));
  

This is more accurate than "hoping [[:word:]]+ will do".

The whole class declaration can now be parsed with:
"class" >> ident >> -("implements" >> (ident % ','))

Full demo

Live On Coliru

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

struct ClassDecl {
    std::string name;
    std::vector<std::string> interfaces;
};

ClassDecl parse_class_declaration(std::string const& text) {
    namespace qi = boost::spirit::qi;
    using It = std::string::const_iterator;

    ClassDecl result;

    qi::rule<It, std::string()> ident = qi::alpha >> *(qi::alnum | qi::char_('_'));

    if (!qi::phrase_parse(
            text.begin(), text.end(), 
            "class" >> ident >> -("implements" >> (ident % ',')),
            qi::space, result.name, result.interfaces))
    {
        throw std::runtime_error("parse_class_declaration");
    }

    return result;
}

int main() {
    for (auto test_case : {
            "class AClass",
            "class BClass implements Interface1",
            "class CClass implements Interface1, Interface2, Interface3,Interface4",
            })
    {
        auto decl = parse_class_declaration(test_case);
        std::cout << "Parsed '" << decl.name << "' implementing " << decl.interfaces.size() << " interfaces\n";

        for (auto& itf: decl.interfaces) {
            std::cout << " - " << itf << "\n";
        }
    }
}

Prints

Parsed 'AClass' implementing 0 interfaces
Parsed 'BClass' implementing 1 interfaces
 - Interface1
Parsed 'CClass' implementing 4 interfaces
 - Interface1
 - Interface2
 - Interface3
 - Interface4