Question

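I am trying to write a regular expression that matches multi-line C comments. I managed to make it work for /* comment */, but it stops working when the comment continues onto the next line. How do I write a regular expression that spans multiple lines?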

Taking this as my input:

/* this comment
 must be recognised */

The problem is that "must", "be" and "recognised" are matched as IDs, and * and / are reported as illegal characters.

#!/usr/bin/python
import ply.lex as lex
tokens = ['ID', 'COMMENT']

t_ID   = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_COMMENT(t):
    r'(?s)/\*(.*?).?(\*/)'
    #r'(?s)/\*(.*?).?(\*/)' does not work either.
    return t

# Error handling rule
def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

lex.lex()   #Build the lexer

lex.input('/* this comment\r\n must be recognised */\r\n')
while True:
    tok = lex.token()
    if not tok:
        break
    if tok.type == 'COMMENT':
        print(tok.type)

I have tried a lot of things, including: Create array of regex match(multiline), How to handle multiple rules for one token with PLY, http://www.dabeaz.com/ply/ply.html, and some of the other suggestions offered there.

Answer 1

def t_COMMENT(t):
    r'(?s)/\*.*?\*/'
    return t

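As explained here:

(?s) is a modifier that makes . match newline characters as well.
.*? is the non-greedy version of .*. It matches the shortest possible sequence of characters (up to the next \*/).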
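
To see both points in isolation, here is a minimal sketch that uses the standard re module directly rather than PLY. The sample string and the variable names are made up for illustration; only the patterns come from the answer above.

```python
import re

# Hypothetical sample: a two-line comment, some code, then a one-line comment.
source = "/* this comment\r\n must be recognised */\r\nint x; /* second */"

# Without (?s), '.' stops at '\n', so the two-line comment is missed.
single_line = re.findall(r'/\*.*?\*/', source)

# With (?s), '.' also matches newlines; the non-greedy '.*?' stops at the first '*/'.
non_greedy = re.findall(r'(?s)/\*.*?\*/', source)

# The greedy '.*' runs to the last '*/' and swallows the code between the comments.
greedy = re.findall(r'(?s)/\*.*\*/', source)

print(single_line)
print(non_greedy)
print(greedy)
```

The non-greedy form is the one that keeps each comment as its own match, which is presumably what a COMMENT token should contain.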

Answer 2

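By default, in the regular expressions used by a PLY lexer, the dot . does not match the newline \n. So if you really want to match any character, use (.|\n) instead of .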
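
A small sketch of that substitution, again with the plain re module instead of PLY. The [\s\S] variant is not part of this answer; it is included as an assumption-free equivalent that is commonly used for the same purpose.

```python
import re

# Hypothetical sample input, mirroring the comment from the question.
text = "/* this comment\n must be recognised */"

# Plain '.' cannot cross the newline, so there is no match at all.
dot_only = re.match(r'/\*.*?\*/', text)

# (.|\n) matches either any non-newline character or a newline.
alternation = re.match(r'/\*(.|\n)*?\*/', text)

# [\s\S] (whitespace or non-whitespace) also matches every character.
char_class = re.match(r'/\*[\s\S]*?\*/', text)

print(dot_only)
print(alternation.group(0) == text)
print(char_class.group(0) == text)
```

Neither of these forms needs an inline flag, which may matter when a rule's pattern ends up embedded inside a larger expression: recent Python versions (3.11 and later) reject a global flag such as (?s) anywhere but the very start of a pattern.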

(I ran into the same problem, and your comment on your own question helped me, so I wrote this up as an answer for newcomers.)

Python regular expression for multi-line C comments spanning multiple lines
