Python regular expressions: finding one match of a pattern per line?

Time: 2013-06-10 19:45:28

Tags: python regex expression findall

Say I have the following lines in the file testFile:

Test Line in File
Test Line in File
Test Line in File
Test Line in File Line
Test Line in File Line

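Is it possible to run re.findall() so that it finds only one instance of a pattern per line? For example, if I run len(re.findall(" Line", testfile, 0)), the program returns 7. I want it to return 5. I was thinking of something like " Line.*\n", but that still returns 7. Just to clarify, I want to avoid using: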

count = 0
with open(testFile, "r") as file:
    for line in file:
        re.match(pattern, line, 0)
        #etc

Any help is appreciated.

3 Answers:

Answer 0: (score: 1)

For such a simple match, it is more efficient to use...

count = 0
with open(testFile, "r") as file:
    for line in file:
        if 'Line' in line:
            count += 1

...which uses a highly optimized searching algorithm and is much faster than a regex (about 8 times faster when I checked).

Answer 1: (score: 1)

You can use the MULTILINE flag!
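
If you want to check the speed claim on your own machine, here is a minimal sketch using `timeit`. The sample lines mirror the question's file, and the helper names `count_with_in` and `count_with_regex` are made up for illustration. The actual ratio will vary with Python version and data, so treat the timings as indicative only.

```python
import re
import timeit

# Sample data mirroring the question's file (assumed contents).
LINES = [
    "Test Line in File\n",
    "Test Line in File\n",
    "Test Line in File\n",
    "Test Line in File Line\n",
    "Test Line in File Line\n",
]

# Compile once so the regex timing does not include pattern lookup.
PATTERN = re.compile(r"Line")


def count_with_in(lines):
    """Count lines containing the substring, using the `in` operator."""
    return sum(1 for line in lines if "Line" in line)


def count_with_regex(lines):
    """Count lines that contain at least one regex match."""
    return sum(1 for line in lines if PATTERN.search(line))


if __name__ == "__main__":
    # Both approaches count each line at most once.
    print(count_with_in(LINES), count_with_regex(LINES))
    print("in:   ", timeit.timeit(lambda: count_with_in(LINES), number=100000))
    print("regex:", timeit.timeit(lambda: count_with_regex(LINES), number=100000))
```

Both functions count a line once no matter how many times the word appears in it, which is the behaviour the question asks for.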

>>> import re
>>> s = """Test Line in File
... Test Line in File
... Test Line in File
... Test Line in File Line
... Test Line in File Line"""
>>> r = re.compile("^.*Line.*$", flags=re.MULTILINE)
>>> r.findall(s)
['Test Line in File',
 'Test Line in File',
 'Test Line in File',
 'Test Line in File Line',
 'Test Line in File Line']

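However, I would discourage using a regex in this case!

Answer 2: (score: 0)

There is no need to load the whole file into memory to run re.findall; doing so also loses the ability to short-circuit as soon as the first match on a line is found.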
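
If only the count is needed, something along these lines should work. This is a sketch, not the only way to do it: `count_matching_lines` is a hypothetical helper name, and the text is assumed to already be in memory as a single string. With `re.MULTILINE`, `^` anchors at the start of every line, and because `.` does not match a newline, each line can produce at most one match.

```python
import re

# Sample text mirroring the question's file (assumed contents).
TEXT = """Test Line in File
Test Line in File
Test Line in File
Test Line in File Line
Test Line in File Line"""


def count_matching_lines(text, word):
    """Count the lines of `text` that contain `word` at least once."""
    # ^ matches at each line start under MULTILINE; the lazy .*? stops at
    # the first occurrence of the word, so a line yields a single match.
    pattern = re.compile(r"^.*?" + re.escape(word), flags=re.MULTILINE)
    return len(pattern.findall(text))


print(count_matching_lines(TEXT, "Line"))  # prints 5
```

`re.escape` is used so that a search word containing regex metacharacters is treated literally.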

import re
with open('data.txt') as f:
    print(sum(1 if re.search(r"Line", line) else 0 for line in f))

5
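
To see the difference without needing `data.txt`, here is a small sketch that uses `io.StringIO` as a stand-in for the file (an assumption for illustration; `count_lines_with` is a hypothetical name). `re.search` stops at the first match in each line, so a line with two occurrences is counted once, whereas `re.findall` over the whole text counts every occurrence.

```python
import io
import re

# Sample text mirroring the question's file (assumed contents).
SAMPLE = (
    "Test Line in File\n"
    "Test Line in File\n"
    "Test Line in File\n"
    "Test Line in File Line\n"
    "Test Line in File Line\n"
)


def count_lines_with(pattern, lines):
    """Count lines with at least one match, reading one line at a time."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


per_line = count_lines_with(r"Line", io.StringIO(SAMPLE))
per_occurrence = len(re.findall(r"Line", SAMPLE))
print(per_line, per_occurrence)  # prints: 5 7
```

Any iterable of lines works here, so an open file object can be passed in place of the `StringIO`.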