Question

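Let

exp = ^[0-9!@#$%^&*()_+-=[]{};':"\|,.<>\/?\s]*$

be a regular expression that lets me find all number sequences, with or without special characters.

Using exp I managed to extract all number sequences greater than 5, but I cannot extract the number 98200. I have not used anything that limits how long a number sequence may be. Source code: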

#include <boost/regex.hpp>
#include <iostream>
#include <string>

using namespace std;

int main()
{
   string s = "16000";
   // Pattern from the question, written as a valid C++ string literal
   string exp = "^[0-9!@#$%^&*()_+-=[]{};':\"\\|,.<>\\/?\\s]*$";
   const boost::regex e(exp);
   bool isSequence = boost::regex_match(s,e);
   //isSequence is boolean and should be equal to 1 
   cout << isSequence << endl;

  return 0;

}

Answer 1

Escaping everything indiscriminately worked for me.. :)

string exp = "^[0-9\\!@#\\$\\%\\^&*\\(\\)_\\+\\-=\\[\\]\\{\\};\\\':\\\"\\\\|,\\.<>\\/?\\s]*$";

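Note the double backslashes. I'm sure you can work out which characters in your list have a special meaning and escape only those. I did not have time to look up which ones are special in this context, so I escaped everything, and that worked fine for the few cases I tested:

16000 => returns 1
16A000 => returns 0
16@000 => returns 1

I guess that is what you want...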

Answer 2

I moved the brackets to the front of the character class, and with the following code I get the output 1 for 98200:

#include <string>
#include <boost/regex.hpp>
#include <iostream>

using namespace std;

int main()
{
    std::cout << "main()\n";
    string s = "98200";
    string exp = "^[][0-9!@#$%^&*()_+-={};':\"\\|,.<>\\/?\\s]*$";
    const boost::regex e(exp);
    bool isSequence = boost::regex_match(s,e);
    //isSequence is boolean and should be equal to 1 
    cout << isSequence << endl;

  return 0;
}

/**
     Local Variables:
     compile-command: "g++ -g test.cc -o test.exe -lboost_regex-mt; ./test.exe"
     End:
 */

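EDIT: Note that I relied on my experience with emacs regular expressions. The emacs info page explains: "To include a ']' in a character set, you must make it the first character." I tried this and it works with boost::regex. Later, when I had more time, I read in the boost manual http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.character_sets that this is not specified for the Perl regular expression syntax, which is the default setting of boost::regex. Going by the specification, the comment by https://stackoverflow.com/users/2872922/ron-rosenfeld is the best answer. In the program below I removed the character range that had accidentally been coded into the regular expression. The test shows that a closing bracket at the beginning of a character set is included in the set. So my statement turns out to be correct, even though it is not specified in the official boost::regex manual.

Nevertheless, I suggest that https://stackoverflow.com/users/2872922/ron-rosenfeld post his comment as an answer so that it can be marked as the solution. That would help others who read this thread.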

#include <string>
#include <boost/regex.hpp>
#include <iostream>

using namespace std;

int main()
{
    std::cout << "main()\n";
    string s = "98-[2]00";
    string exp = "^[][0-9!@#$%^&*()_+={};':\"|,.<>/?\\s-]*$";
    const boost::regex e(exp);
    bool isSequence = boost::regex_match(s,e);
    //isSequence is boolean and should be equal to 1 
    cout << isSequence << endl;

  return 0;
}

/**
     Local Variables:
     compile-command: "g++ -g test.cc -o test.exe -lboost_regex-mt; ./test.exe"
     End:
*/

I asked about this on the boost-users list (http://lists.boost.org/boost-users/2013/12/80707.php). The answer from John Maddock, the author of the boost::regex library, was:

>I discovered that if one uses an closing bracket as the first character of
>a
>character class the character class includes this bracket.
>This works with the standard setting of boost::regex (i.e., perl-regular
>expressions) but it is not documented in the
>manual page
>
>http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax/
>perl_syntax.html#boost_regex.syntax.perl_syntax.character_sets
>
>Is this an undocumented feature, a bug or did I misinterpret something in
>the manual?

It's a feature, both Perl and POSIX extended regular expression behave the
same way.

John.

Answer 3

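In C#, you need to escape the ]. You do not need to escape [{}() when they are inside a character class. Also, if you want a dash in a character class to be a literal character, it should be at the beginning or end of the list. The sequence +-= that you have is read as a range and translates to [+,-./0123456789:;<=], which makes other parts of your class, such as 0-9, redundant. Finally, because of the terminal * quantifier, you allow a zero-length string to match. That may be what you want, but if not, consider the '+' quantifier.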

Put simply, use

[^A-Za-z]+

with or without the ^ and $ anchors at the beginning and end.

Regular expression fails to find number sequence
