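I have the following regex for removing multi-line comments, but I'm having a hard time figuring out how to also remove comments that start with //.
When I add (//.*) to the regex, it never seems to work.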
pattern = r"""
## --------- COMMENT ---------
/\* ## Start of /* ... */ comment
[^*]*\*+ ## Non-* followed by 1-or-more *'s
( ##
[^/*][^*]*\*+ ##
)* ## 0-or-more things which don't start with /
## but do end with '*'
/ ## End of /* ... */ comment
##
| ## --------- COMMENT ---------
(//.*) ## Start of // comment
##
| ## -OR- various things which aren't comments:
( ##
## ------ " ... " STRING ------
" ## Start of " ... " string
( ##
\\. ## Escaped char
| ## -OR-
[^"\\] ## Non "\ characters
)* ##
" ## End of " ... " string
| ## -OR-
##
## ------ ' ... ' STRING ------
' ## Start of ' ... ' string
( ##
\\. ## Escaped char
| ## -OR-
[^'\\] ## Non '\ characters
)* ##
' ## End of ' ... ' string
| ## -OR-
##
## ------ ANYTHING ELSE -------
. ## Anything other char
[^/"'\\]* ## Chars which don't start a comment, string
) ## or escape
"""
Can someone tell me where I'm going wrong? I even tried the following regex:
//[^\r\n]*$
but that doesn't work either.
Answer 0 (score: 1)
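Try one of these...
Both capture comments and non-comments.
This one does not preserve formatting and uses no modifiers.
In a find/while loop, store Group 1 (comments) in a new file and
replace each match with Group 2 (non-comments) in the original file.
Adjust the regex line breaks as needed, i.e. change \n to \r\n, etc.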
# (/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|[\S\s][^/"'\\]*)
( # (1 start), Comments
/\* # Start /* .. */ comment
[^*]* \*+
(?: [^/*] [^*]* \*+ )*
/ # End /* .. */ comment
|
// # Start // comment
(?: [^\\] | \\ \n? )*? # Possible line-continuation
\n # End // comment
) # (1 end)
|
( # (2 start), Non - comments
"
(?: \\ [\S\s] | [^"\\] )* # Double quoted text
"
| '
(?: \\ [\S\s] | [^'\\] )* # Single quoted text
'
| [\S\s] # Any other char
[^/"'\\]* # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
) # (2 end)
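The find loop described above can be sketched in Python, the language of the question's snippet. This is only a minimal illustration, not the answerer's own code: the names `COMMENT_OR_CODE`, `split_comments` and `sample` are made up for the example, and the input is held in a string rather than read from a file.

```python
import re

# Compact form of the regex above: group 1 = comments, group 2 = non-comments.
# No flags are needed for this version.
COMMENT_OR_CODE = re.compile(
    r'''(/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)'''
    r'''|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|[\S\s][^/"'\\]*)'''
)

def split_comments(source):
    """Hypothetical helper: return (code, comments) from the two groups."""
    code_parts, comments = [], []
    for m in COMMENT_OR_CODE.finditer(source):
        if m.group(1) is not None:
            comments.append(m.group(1))    # a /* ... */ or // comment
        else:
            code_parts.append(m.group(2))  # a string literal or other text
    return "".join(code_parts), comments

sample = 'int a = 1; // set a\n/* block\n   comment */\nchar *s = "http://x"; // url\n'
code, comments = split_comments(sample)
print(repr(code))
print(comments)
```

Note that the `//` inside the string literal is left alone because string literals are matched by group 2 before a comment can start inside them. The newline that ends a `//` comment is consumed together with the comment, which is why this version does not preserve formatting; a `//` comment on the last line without a trailing newline is not recognized as a comment by this pattern.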
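Last rework:
Preserves formatting better.
The formatting issue with newlines is now handled at the tail end of each comment.
While this fixes the string concatenation problem, it does leave an occasional blank
line where the comment was. For 98% of comments this won't be a problem,
but it's time to leave this dead dog alone.
This one preserves formatting. It uses the multiline regex modifier (be sure to set it).
Usage is the same as above.
This assumes your engine supports \h (horizontal whitespace). If not, let me know.
Adjust the regex line breaks as needed, i.e. change \n to \r\n, etc.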
# ((?:(?:^\h*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:\h*\n(?=\h*(?:\n|/\*|//)))?|//(?:[^\\]|\\\n?)*?(?:\n(?=\h*(?:\n|/\*|//))|(?=\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|[\S\s][^/"'\\\s]*)
( # (1 start), Comments
(?:
(?: ^ \h* )? # <- To preserve formatting
(?:
/\* # Start /* .. */ comment
[^*]* \*+
(?: [^/*] [^*]* \*+ )*
/ # End /* .. */ comment
(?:
\h* \n
(?= # <- To preserve formatting
\h* # <- To preserve formatting
(?: \n | /\* | // ) # <- To preserve formatting
)
)? # <- To preserve formatting
|
// # Start // comment
(?: [^\\] | \\ \n? )*? # Possible line-continuation
(?: # End // comment
\n
(?= # <- To preserve formatting
\h* # <- To preserve formatting
(?: \n | /\* | // ) # <- To preserve formatting
)
| (?= \n )
)
)
)+ # Grab multiple comment blocks if need be
) # (1 end)
| ## OR
( # (2 start), Non - comments
"
(?: \\ [\S\s] | [^"\\] )* # Double quoted text
"
| '
(?: \\ [\S\s] | [^'\\] )* # Single quoted text
'
| [\S\s] # Any other char
[^/"'\\\s]* # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
) # (2 end)
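Python's built-in `re` module does not support `\h`, so the sketch below substitutes `[ \t]` for it. That substitution is an assumption of this example, not part of the answer, and it covers only spaces and tabs. The names `RAW`, `FORMAT_PRESERVING`, `strip_comments` and `sample` are likewise hypothetical. The multiline modifier is set through `re.MULTILINE`.

```python
import re

# The formatting-preserving regex above, copied in its compact form.
RAW = (
    r'''((?:(?:^\h*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:\h*\n(?=\h*(?:\n|/\*|//)))?|//(?:[^\\]|\\\n?)*?(?:\n(?=\h*(?:\n|/\*|//))|(?=\n))))+)'''
    r'''|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|[\S\s][^/"'\\\s]*)'''
)

# Assumption: approximate \h (horizontal whitespace) with space and tab,
# since Python's re rejects \h. re.MULTILINE makes ^ match at each line start.
FORMAT_PRESERVING = re.compile(RAW.replace(r'\h', r'[ \t]'), re.MULTILINE)

def strip_comments(source):
    """Hypothetical helper: keep group 2 (non-comments), drop group 1."""
    return FORMAT_PRESERVING.sub(lambda m: m.group(2) or '', source)

sample = 'int a = 1; // set a\n// whole-line comment\nchar *s = "http://x";\n'
stripped = strip_comments(sample)
print(repr(stripped))
```

In this sample the trailing comment and the whole-line comment that follows it are captured as one group 1 match, and the newline before `char` is left in place, so the remaining code keeps its own line.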