Matching pairs with a regular expression

Time: 2013-11-06 16:52:32

Tags: regex perl matching flex-lexer

I want to write a regular expression that matches the following:

C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 
C1 to C2 , C2 to C3 , C3 to C4
C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C4 to C5 , C5 to C6 , C6 to C7

Regardless of ...

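I would like to do this in an elegant way, rather than matching the text literally, as in c1[ ](to|through)[ ]c2[ ][,][ ]c2[ ](to|through)[ ]c3 and so on.

This is for a lexer whose patterns are written as lex/yacc regular expressions. The scanner is Flex++. I want to match pairs whose numbers increase in steps of 1, with no fewer than 4 and no more than 7 pairs.

For the record, I have searched other posts extensively and even asked several people. No ideas so far.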

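3 answers:

Answer 0: (score: 1)

If you are really using lex/yacc (or flex/bison), you have to use the two together: the lexer recognizes the tokens, and the parser checks how the numbers relate to each other. Please excuse my rusty syntax.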

Flex:

"C"[0-9]+  { yylval->num = atoi(yytext+1); return TOKEN_CNUM; }
"to"       { return TOKEN_TO;      }
"through"  { return TOKEN_TO;      }
","        { return TOKEN_COMMA;   }
[\n\r]     { return TOKEN_NEWLINE; }
[ \t]+     { /* skip blanks between tokens */ }

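Bison:

line: pair TOKEN_COMMA pair TOKEN_COMMA pair TOKEN_COMMA pair
        { assert($1+1 == $3); assert($3+1 == $5); assert($5+1 == $7); }
    | pair TOKEN_COMMA pair TOKEN_COMMA pair TOKEN_COMMA pair TOKEN_COMMA pair
        { /* same checks, plus assert($7+1 == $9) */ }
    /* the alternatives for 6 and 7 pairs follow the same pattern */
    ;

pair: TOKEN_CNUM TOKEN_TO TOKEN_CNUM
        { assert($1+1 == $3); $$ = $3; /* pass the end of the pair up to "line" */ }
    ;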
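
To see the same division of labour outside flex/bison, here is a rough Python sketch: a small tokenizer plays the role of the lexer, and a checking function plays the role of the grammar and its asserts. The names `TOKEN_RE`, `tokenize` and `is_valid_line` are made up for this illustration, and the 4-to-7 limit is taken from the question; treat it as a sketch under those assumptions, not as a translation of the generated parser.

```python
import re

# Hypothetical token pattern mirroring the Flex rules: C<number>, to/through, comma.
TOKEN_RE = re.compile(r"\s*(?:C(?P<cnum>\d+)|(?P<to>to|through)|(?P<comma>,))")


def tokenize(line):
    """Split a line into (kind, value) tokens; raise ValueError on unexpected input."""
    tokens, pos = [], 0
    line = line.rstrip()
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if not m:
            raise ValueError(f"unexpected input at column {pos}")
        kind = m.lastgroup  # name of the alternative that matched
        value = int(m.group("cnum")) if kind == "cnum" else m.group(kind)
        tokens.append((kind, value))
        pos = m.end()
    return tokens


def is_valid_line(line, min_pairs=4, max_pairs=7):
    """Accept 'pair (, pair)*' where the numbers increase by one along the whole line."""
    try:
        tokens = tokenize(line)
    except ValueError:
        return False
    pairs, i = [], 0
    while True:
        # Each pair must be exactly: cnum, to, cnum.
        kinds = [kind for kind, _ in tokens[i:i + 3]]
        if kinds != ["cnum", "to", "cnum"]:
            return False
        pairs.append((tokens[i][1], tokens[i + 2][1]))
        i += 3
        if i == len(tokens):
            break
        if tokens[i][0] != "comma":
            return False
        i += 1
    if not min_pairs <= len(pairs) <= max_pairs:
        return False
    # Equivalent to the asserts: each pair is (n, n+1) ...
    if any(second != first + 1 for first, second in pairs):
        return False
    # ... and each pair starts where the previous one ended.
    return all(prev[1] == cur[0] for prev, cur in zip(pairs, pairs[1:]))
```

Unlike the grammar, which needs one alternative per allowed length, the sketch counts the pairs after parsing, so the limits are just two parameters.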

Answer 1: (score: 0)

As for "semantic" correctness, you have to check the numeric values yourself.

^c\d+[ ](to|through)[ ]c\d+([ ][,][ ]c\d+[ ](to|through)[ ]c\d+)*$

This needs additional processing.
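
The additional processing could look roughly like the following Python sketch: a regular expression checks only the shape of the line, and ordinary code then checks the numbers. The names `STRUCTURE` and `numbers_are_consecutive` are invented for this example, and the case-insensitive flag is an assumption made because the pattern uses a lowercase c while the sample data uses C. Like the pattern it illustrates, it does not limit the number of pairs.

```python
import re

# Shape only: one pair, then any number of " , pair" repetitions.
STRUCTURE = re.compile(
    r"^c\d+ (?:to|through) c\d+(?: , c\d+ (?:to|through) c\d+)*\s*$",
    re.IGNORECASE,  # assumption: the data may use "C" or "c"
)


def numbers_are_consecutive(line):
    """True if the line has the pair structure, every pair is (n, n+1),
    and every pair starts at the number where the previous one ended."""
    if not STRUCTURE.match(line):
        return False
    # After the structure check, the only digits left are the C-numbers.
    nums = [int(n) for n in re.findall(r"\d+", line)]
    firsts, seconds = nums[0::2], nums[1::2]
    pairs_ok = all(b == a + 1 for a, b in zip(firsts, seconds))
    chain_ok = all(end == start for end, start in zip(seconds, firsts[1:]))
    return pairs_ok and chain_ok
```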

In principle you could use

^c\d+[ ](to|through)[ ]((c\d+),\3 ...)*c\d+$
        1          1   23    3  ^    2

This says: the third group (here (c\d+)) must be repeated after the comma, as \3.
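
A small Python sketch of the backreference idea, for experimenting outside the lexer. It uses Python's re module rather than flex, whose patterns have no backreferences, and the names `CHAINED` and `is_chained` are made up here. The {3,6} count (three to six links, so four to seven pairs) comes from the question, not from this answer. Note what it can and cannot do: the backreference forces each pair to start with the same text that ended the previous pair, but it cannot check that a number is one greater than another.

```python
import re

# Group 1 captures the end of one pair; \1 requires the next pair to start with
# exactly the same text. Three to six such links give four to seven pairs.
CHAINED = re.compile(
    r"^C\d+ (?:to|through) (?:(C\d+) , \1 (?:to|through) ){3,6}C\d+\s*$"
)


def is_chained(line):
    """True if each pair starts with the identifier that ended the previous pair."""
    return CHAINED.match(line) is not None
```

With this pattern, "C1 to C2 , C9 to C3 , C3 to C4 , C4 to C5" is rejected because C9 does not repeat C2, while "C1 to C9 , C9 to C3 , C3 to C4 , C4 to C7" is accepted even though the numbers are not consecutive, so the numeric check still has to happen in code.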

Answer 2: (score: 0)

use strict;
use warnings;

my $pairRE = qr/          # Start regular expression
                \s*       # zero or more spaces
                C         # 'C'
                \d+       # one or more digits
                \s+       # one or more spaces
                (?:       # Start non-capturing group
                  to        # 'to'
                  |         # or
                  through   # 'through'
                )         # End group
                \s+       # one or more spaces
                C         # 'C'
                \d+       # one or more digits
                \s*       # zero or more spaces
               /x;        # End regular expression, eXtended syntax

while (<DATA>) {
    print
      if /               # Start regular expression
          ^              # Start of line
          $pairRE        # a pair
          (?:            # Start non-capturing group
            ,              # ','
            $pairRE        # a pair
          ){3,6}         # End group - 3 to 6 more pairs, so 4 to 7 in total
          \z             # End of input, so an eighth pair is not silently accepted
         /x;             # End regular expression, eXtended syntax
}

__DATA__
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 
C1 to C2 , C2 to C3 , C3 to C4
C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C4 to C5 , C5 to C6 , C6 to C7

This prints:

C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6
C1 to C2 , C2 to C3 , C3 to C4 , C4 to C5
C2 to C3 , C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7
C3 to C4 , C4 to C5 , C5 to C6 , C6 to C7