Say I have the following lines in the file testFile:
Test Line in File
Test Line in File
Test Line in File
Test Line in File Line
Test Line in File Line
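Is it possible to run re.findall() so that it finds at most one instance of a pattern per line? For example, if I run len(re.findall(" Line", testfile, 0)), the program returns 7. I would like it to return 5. I thought of " Line.*\n", but that still returns 7. Just to clarify, I would like to avoid using: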
count = 0
with open(testFile, "r") as file:
    for line in file:
        re.match(pattern, line, 0)
        # etc.
Any help is appreciated.
Answer 0 (score: 1)
For a match this simple, it is more efficient to use...
count = 0
with open(testFile, "r") as file:
    for line in file:
        if 'Line' in line:
            count += 1
...which uses a highly optimized searching algorithm and is much faster than a regular expression (about 8 times faster when I checked).
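For anyone who wants to reproduce that comparison, here is a minimal sketch using timeit. The sample lines, the function names, and the repetition count are assumptions made for illustration; the actual ratio depends on the Python version and the data, so no timing figures are claimed here.

```python
import re
import timeit

# Hypothetical sample data standing in for the contents of testFile.
LINES = [
    "Test Line in File\n",
    "Test Line in File\n",
    "Test Line in File\n",
    "Test Line in File Line\n",
    "Test Line in File Line\n",
]

PATTERN = re.compile("Line")


def count_with_in(lines):
    """Count lines that contain the substring, using the `in` operator."""
    return sum(1 for line in lines if "Line" in line)


def count_with_regex(lines):
    """Count lines that contain at least one regex match."""
    return sum(1 for line in lines if PATTERN.search(line))


if __name__ == "__main__":
    # Both approaches should agree on the count before timing them.
    print(count_with_in(LINES), count_with_regex(LINES))
    print("in:   ", timeit.timeit(lambda: count_with_in(LINES), number=10_000))
    print("regex:", timeit.timeit(lambda: count_with_regex(LINES), number=10_000))
```

Both functions count each line at most once, which is the behavior the question asks for.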
Answer 1 (score: 1)
You can use the MULTILINE flag!
>>> import re
>>> s = """Test Line in File
... Test Line in File
... Test Line in File
... Test Line in File Line
... Test Line in File Line"""
>>> r = re.compile("^.*Line.*$", flags=re.MULTILINE)
>>> r.findall(s)
['Test Line in File',
'Test Line in File',
'Test Line in File',
'Test Line in File Line',
'Test Line in File Line']
However, in this case I would discourage using a regular expression!
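To get the number the question asks for, take the length of the result. A small sketch, assuming the file contents are already held in a string s as above; the variable names line_re, per_line, and total are only illustrative:

```python
import re

s = """Test Line in File
Test Line in File
Test Line in File
Test Line in File Line
Test Line in File Line"""

# With re.MULTILINE, ^ and $ anchor at every line boundary, and "." never
# crosses a newline, so each line can produce at most one match.
line_re = re.compile(r"^.*Line.*$", flags=re.MULTILINE)
per_line = len(line_re.findall(s))

# The plain pattern counts every occurrence, including two on the last lines.
total = len(re.findall("Line", s))

print(per_line, total)  # 5 7
```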
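Answer 2 (score: 0)
There is no need to load the whole file into memory to run re.findall. Doing so also loses the ability to short-circuit as soon as the first match on a line is found.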
import re
with open('data.txt') as f:
    # re.search stops at the first match in each line, so a line counts once
    print(sum(1 for line in f if re.search(r"Line", line)))
5
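The same idea can be wrapped in a function that accepts any iterable of lines, which makes it easy to try without a file on disk. This is only a sketch: the function name count_matching_lines and the io.StringIO sample are assumptions for illustration, not part of the answer above.

```python
import io
import re


def count_matching_lines(lines, pattern):
    """Count lines with at least one match.

    re.search returns at the first match in a line, so extra occurrences
    on the same line are never scanned for or counted.
    """
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


# Hypothetical in-memory stand-in for data.txt; a real file object
# opened with open() can be passed in exactly the same way.
sample = io.StringIO(
    "Test Line in File\n"
    "Test Line in File\n"
    "Test Line in File\n"
    "Test Line in File Line\n"
    "Test Line in File Line\n"
)

result = count_matching_lines(sample, r"Line")
print(result)  # 5
```

Because the lines are consumed one at a time, memory use stays proportional to the longest line rather than to the size of the file.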