Question

I am trying to use a regular expression to extract the comments in the header of a file.

For example, the source code might look like this:

//This is an example file.
//Please help me.

#include "test.h"
int main() //main function
{
  ...
}

What I want to extract from the code is the first two lines, i.e.

//This is an example file.
//Please help me.

Any ideas?

Answer 1

Why use a regular expression at all?

>>> with open('/tmp/source') as f:
...     # stop at the first line that is not a // comment
...     for line in f:
...         if not line.startswith('//'):
...             break
...         print(line, end='')
...

Answer 2

>>> code="""//This is an example file.
... //Please help me.
...
... #include "test.h"
... int main() //main function
... {
...   ...
... }
... """
>>>
>>> import re
>>> re.findall(r"^\s*//.*", code, re.MULTILINE)
['//This is an example file.', '//Please help me.']
>>>

If you only need to match the consecutive comment lines at the top, you can use the following.

>>> re.search(r"^((?:\s*//.*\n)+)", code).group().strip().split("\n")
['//This is an example file.', '//Please help me.']
>>>
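
To make the difference between the two patterns concrete, here is a minimal sketch. The sample string is made up for illustration: it adds a comment line after the #include that is not in the original question. With re.MULTILINE, findall picks up every line that starts with //, while the search anchored at the start of the string stops at the end of the header block. The None check is an addition for files that do not begin with a comment.

```python
import re

# Hypothetical sample: same header as in the question, plus one later
# line-start comment that should NOT count as part of the header.
code = (
    "//This is an example file.\n"
    "//Please help me.\n"
    "\n"
    '#include "test.h"\n'
    "// a later comment, not part of the header\n"
    "int main() //main function\n"
    "{\n"
    "}\n"
)

# MULTILINE lets ^ match at the start of every line, so this finds
# every line that begins with //, anywhere in the file.
all_line_comments = re.findall(r"^\s*//.*", code, re.MULTILINE)

# Without MULTILINE, ^ only matches at the start of the string, so this
# captures just the run of comment lines at the top.
header_match = re.search(r"^((?:\s*//.*\n)+)", code)
header_lines = header_match.group().strip().split("\n") if header_match else []

print(all_line_comments)
print(header_lines)
```

Note that \s* also matches newlines, so the anchored pattern continues across blank lines if another // line follows them directly.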

Answer 3

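This gets not only the first two comment lines but also the later multi-line (/* ... */) and // comments. It is not exactly what you asked for.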

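# "file" is the placeholder path used in the original answer
with open("file") as f:
    data = f.read()
for c in data.split("*/"):
    # multi-line comments: keep what follows the opening /*
    if "/*" in c:
        print(''.join(c.split("/*")[1:]))
    # single-line comments: test each line, not the whole chunk
    if "//" in c:
        for item in c.split("\n"):
            if "//" in item:
                print(''.join(item.split("//")[1:]))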
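
If a regex is acceptable for this broader task, the same idea (collect both block comments and // comments) can be sketched with one pattern. This is only a rough illustration, not a C parser: the names COMMENT_RE and all_comments are made up here, and the pattern will misfire on // or /* inside string literals.

```python
import re

# Either a /* ... */ block (non-greedy, may span lines because of DOTALL)
# or a // comment running to the end of its line.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def all_comments(source):
    """Return every comment in source, in order of appearance."""
    return COMMENT_RE.findall(source)


# Hypothetical sample input with both comment styles.
sample = (
    "//This is an example file.\n"
    "/* a block\n"
    "   comment */\n"
    "int main() //main function\n"
    "{\n"
    "}\n"
)

for comment in all_comments(sample):
    print(repr(comment))
```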

Answer 4

This extends the context to cover the following considerations:

// ...
blank lines between each // ... line

import re

code = """//This is an example file.    
 a
   //  Please help me.

//  ha

#include "test.h"
int main() //main function
{
  ...
}"""

for s in re.finditer(r"^(\s*)(//.*)",code,re.MULTILINE):
    print(s.group(2))

>>>
//This is an example file.    
//  Please help me.
//  ha
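
Note that the pattern above reports every line that starts with //, even after other content has appeared. If only the header block is wanted while still tolerating indentation and blank lines, a plain loop is one option. The following is a sketch under that assumption; the function name header_comments and the sample input are illustrative, not from the answer.

```python
def header_comments(text):
    """Collect // lines from the top of text.

    Blank lines are skipped, leading and trailing whitespace is removed,
    and scanning stops at the first line that is neither blank nor a
    // comment.
    """
    found = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines between comments are allowed
        if not stripped.startswith("//"):
            break  # first real code line ends the header
        found.append(stripped)
    return found


# Hypothetical sample: indented comment, blank lines, then code.
sample = (
    "//This is an example file.    \n"
    "   //  Please help me.\n"
    "\n"
    "//  ha\n"
    "\n"
    '#include "test.h"\n'
    "int main() //main function\n"
)

print(header_comments(sample))
```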
