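Understanding regular expressions in C++11

Time: 2014-04-03 14:29:23

Tags: c++ regex c++11 ecma

I'm trying to learn regular expressions in C++11. I must be doing something wrong, because no parentheses or escape sequences seem to work.

Here is my code: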

#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main()
{
    try
    {
        cout << R"(\d*(\.\d*)?;)" << endl << endl;

        regex rx{ R"(\d*(\.\d*)?;)", regex_constants::ECMAScript };
        smatch m;

        if( regex_match( string( "10;20;30;40;" ), m, rx ) )
        {
            cout << m[0];
        }
    }
    catch( const regex_error &e )
    {
        cerr << e.what() << ". Code: " << e.code() << endl;

        switch( e.code() )
        {
        case regex_constants::error_collate:
            cerr << "The expression contained an invalid collating element name.";
            break;
        case regex_constants::error_ctype:
            cerr << "The expression contained an invalid character class name.";
            break;
        case regex_constants::error_escape:
            cerr << "The expression contained an invalid escaped character, or a trailing escape.";
            break;
        case regex_constants::error_backref:
            cerr << "The expression contained an invalid back reference.";
            break;
        case regex_constants::error_brack:
            cerr << "The expression contained mismatched brackets ([ and ]).";
            break;
        case regex_constants::error_paren:
            cerr << "The expression contained mismatched parentheses (( and )).";
            break;
        case regex_constants::error_brace:
            cerr << "The expression contained mismatched braces ({ and }).";
            break;
        case regex_constants::error_badbrace:
            cerr << "The expression contained an invalid range between braces ({ and }).";
            break;
        case regex_constants::error_range:
            cerr << "The expression contained an invalid character range.";
            break;
        case regex_constants::error_space:
            cerr << "There was insufficient memory to convert the expression into a finite state machine.";
            break;
        case regex_constants::error_badrepeat:
            cerr << "The expression contained a repeat specifier (one of *?+{) that was not preceded by a valid regular expression.";
            break;
        case regex_constants::error_complexity:
            cerr << "The complexity of an attempted match against a regular expression exceeded a pre-set level.";
            break;
        case regex_constants::error_stack:
            cerr << "There was insufficient memory to determine whether the regular expression could match the specified character sequence.";
            break;
        default:
            cerr << "Undefined.";
            break;
        }

        cerr << endl;
    }

    return 0;
}

Output:

    \d*(\.\d*)?;

    regex_error. Code: 2
    The expression contained an invalid escaped character, or a trailing escape.

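What am I doing wrong?

Update

gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC)

clang version 3.3 (tags/RELEASE_33/final)

libstdc++ version 4.8.2

Solution

OK. I am reading "The C++ Programming Language" and wanted to try out the std::regex stuff. So I guess the solution is to wait for gcc-4.9.

I gave the points to EagleV_Attnam for pointing out the other errors in my code.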

2 answers:

Answer 0: (score: 1)

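Two things:

  1. Your string "10;20;30;40;" exists only for the duration of the regex_match call. smatch, as opposed to cmatch, expects the string (such as the temporary created by string()) to still be alive when you access the match results.
  2. Your current regex does not match (at least not on my system), because regex_match tries to match the entire string. Appending .* to the end (and to the beginning, although that is unnecessary in your case) should fix it, as would making the whole thing repeat (with R"((stuff)*)").

Working code (though I could not try it on gcc):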

    regex rx{ R"(\d*(\.\d*)?;.*)", regex_constants::ECMAScript };
    smatch m;
    string s("10;20;30;40;");
    if (regex_match(s, m, rx))
    {
        cout << m[0];
    }
    

I don't know whether this will fix your particular error (I fear KitsuneYMG is right about that), but trying shouldn't hurt.

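Answer 1: (score: -2)

One problem with your regex is that you did not escape the \, and \d is not a valid escape sequence in the context of a string. I'm not sure whether you can use the R prefix on strings, but I don't have it defined.

Last time I checked, GCC's regex support was also incomplete, so you may be forced to use Boost.Regex.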

    regex rx( "\\d*;" ); //regexp, must escape '\'
    string input = "10;20;30;40;";
    smatch m;

    if( regex_search( input, m, rx ) )
    {
        cout << m[0] << endl;
    }