Python - How to identify uncommented if-statement lines in a cpp file

Time: 2014-12-28 15:02:08

Tags: regex python-2.7

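I am writing Python code that identifies uncommented if statement lines read from a cpp file. I only need to be able to tell whether a line contains an if statement.

My attempt so far is poor: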

import re
r = re.compile(".*if.*(.*).*")

line1 = "if  ( true )"          # should match
line2 = "// if(true)"           # should NOT match
line3 = "/* if(true) */"        # should NOT match

print r.search(line1) # match
print r.search(line2) # match
print r.search(line3) # match

My problem is that line2 and line3 also match with my flawed regex. Any ideas?

Note

Identifying if statements inside multi-line comments does not matter.
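
For reference, a minimal sketch of why the attempt above accepts every line: in ".*if.*(.*).*" the parentheses are unescaped, so (.*) is a capturing group that may match the empty string, and the pattern reduces to "the line contains the letters if". The extra sample "endif" is an assumption added for illustration and is not part of the question.

```python
import re

# The question's pattern: "(.*)" is a capturing group here,
# not a literal pair of parentheses.
loose = re.compile(".*if.*(.*).*")

lines = ["if  ( true )", "// if(true)", "/* if(true) */", "endif"]

# True for every line that merely contains the letters "if".
loose_hits = [bool(loose.search(line)) for line in lines]
print(loose_hits)
```

All four entries come out True, including the two comment lines and the line with no parentheses at all.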

2 Answers:

Answer 0: (score: 2)

You can use a negative lookahead assertion.

re.search(r'^(?!/[/*]).*?\bif\s*\(.*?\).*', string)


OR

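If needed, add \s* at the start, inside the negative lookahead, so that comments beginning after leading whitespace are skipped as well: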

^(?!\s*/[/*]).*?\bif\s*\(.*?\).*
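
As a rough sketch of the difference between the two patterns above, the snippet below runs both against the question's three lines plus two indented lines. The names strict and lenient and the two indented samples are assumptions made for illustration.

```python
import re

# First pattern: rejects only comments that start in column 0.
strict = re.compile(r'^(?!/[/*]).*?\bif\s*\(.*?\).*')
# Second pattern: \s* lets the lookahead skip leading whitespace first.
lenient = re.compile(r'^(?!\s*/[/*]).*?\bif\s*\(.*?\).*')

samples = [
    "if  ( true )",
    "// if(true)",
    "/* if(true) */",
    "    // if(true)",
    "    if (x) {",
]

strict_hits = [line for line in samples if strict.search(line)]
lenient_hits = [line for line in samples if lenient.search(line)]
print(strict_hits)
print(lenient_hits)
```

The first pattern still accepts the indented comment "    // if(true)", while the \s* variant rejects it and keeps the indented real if line.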


Update:

To reject lines where the string else appears before the if (for example else if):
^(?!\s*/[/*])(?:(?!\belse\b).)*\bif\s*\(.*?\).*
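
A small sketch of how the else-excluding pattern behaves. The part (?:(?!\belse\b).)* consumes characters one at a time and stops as soon as the whole word else begins, so an if that follows else can no longer be reached. The sample lines are assumptions chosen for illustration.

```python
import re

no_else = re.compile(r'^(?!\s*/[/*])(?:(?!\belse\b).)*\bif\s*\(.*?\).*')

samples = [
    "if (a) {",                  # plain if
    "} else if (b) {",           # else before if
    "// if (c)",                 # commented out
    "int elsewhere; if (d) {}",  # "else" only as part of a longer word
]

results = [bool(no_else.search(line)) for line in samples]
print(results)
```

Because of the \b word boundaries, an identifier such as elsewhere does not count as else, so the last line is still accepted.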

Example:

>>> import re
>>> line1 = "if  ( true )"
>>> line2= "// if(true)"
>>> line3= "/* if(true) */"
>>> r = re.compile(r'^(?!/[/*]).*?if.*?\(.*?\).*')
>>> r.search(line1)
<_sre.SRE_Match object; span=(0, 12), match='if  ( true )'>
>>> r.search(line2)
>>> r.search(line3)
>>> 
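
Since the question reads its lines from a cpp file, here is a minimal sketch of applying the whitespace-tolerant pattern line by line. The helper name uncommented_if_lines and the sample source are hypothetical, and a list of strings stands in for the open file so the snippet is self-contained.

```python
import re

IF_LINE = re.compile(r'^(?!\s*/[/*]).*?\bif\s*\(.*?\).*')

def uncommented_if_lines(lines):
    """Return (line_number, text) pairs for lines the pattern accepts."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if IF_LINE.search(text):
            hits.append((number, text))
    return hits

# Stand-in for the lines of a cpp file (an open file object iterates the same way).
cpp_source = [
    "int main() {\n",
    "    // if (debug) log();\n",
    "    if (argc > 1) {\n",
    "        /* if (x) y(); */\n",
    "        return 1;\n",
    "    }\n",
    "    return 0;\n",
    "}\n",
]

found = uncommented_if_lines(cpp_source)
print(found)
```

Only line 3 is reported here. Note that the lookahead checks the start of the line only, so an if inside a trailing comment after code (for example x++; // if (y)) would still be accepted.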

Answer 1: (score: 2)

^(?!\/\/|\/\*.*\*\/$).*if.*

Try this. See the demo.

https://regex101.com/r/gX5qF3/6

import re
p = re.compile(r'^(?!\/\/|\/\*.*\*\/$).*if.*', re.MULTILINE)
test_str = "\nif ( true ) # should match\n// if(true) # should NOT match\n/* if(true) */"

re.findall(p, test_str)
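
To make the result of this pattern visible, the sketch below runs it over a small multi-line string. The string is an assumption for illustration: it is the question's three lines plus one extra line, int diff = 0;, which contains the letters if inside an identifier.

```python
import re

p = re.compile(r'^(?!\/\/|\/\*.*\*\/$).*if.*', re.MULTILINE)

sample = "if ( true )\n// if(true)\n/* if(true) */\nint diff = 0;"

# With no capturing groups, findall returns each whole matching line.
matches = p.findall(sample)
print(matches)
```

Both comment lines are skipped, but because this pattern looks for the bare letters if without a word boundary or parentheses, the line int diff = 0; is returned as well.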


NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string (or of each
                           line, since re.MULTILINE is set)
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    \*                       '*'
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \*                       '*'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string (or the end of each line,
                             since re.MULTILINE is set)
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  if                       'if'
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))