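I have a text file with many lines. Between two marker lines (a ++ start line and a -- exiting line) I want to check for a specific line (calling xyz ...). If the calling xyz ... line exists, it should be returned; if it does not, NULL should be returned. I want to store the results in a list.
Example file: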
++ start line
22 15:36:53
dog, cat, monkey, rat
calling xxxxx
animal already added
-- exiting line
The block above should add calling xxxxx to the list.
++ start line
12 12:56:34
cat, camel, cow, dog
animal already added
-- exiting line
In the block above the calling xyz line is missing, so NULL should be added to the list.
Expected output:
calling xxxxx
NULL
Answer 0 (score: 0)
You can use this regular expression to check for the case you describe:
^\+\+(?=(?:(?!\-\-).)*\s+(calling[^\n]+)).*?\s+--
If a block matches, the calling line is captured as group 1.
Sample source:
import re

# The first alternative captures the "calling ..." line as group 1;
# the second matches blocks without one, so group 1 stays None.
regex = r"(?:^\+\+(?=(?:(?!\-\-).)*\s+(calling[^\n]+)).*?\s+--)|(?:^\+\+(?=(?:(?!\-\-).)*\s+(?!calling[^\n]+)).*?\s+--)"

test_str = ("++ start line \n"
            "22 15:36:53 \n"
            "dog, cat, monkey, rat\n"
            "calling xxxxx\n"
            "animal already added\n"
            "-- exiting line\n\n\n"
            "++ start line \n"
            "12 12:56:34 \n"
            "cat, camel, cow, dog \n"
            "animal already added\n"
            "-- exiting line\n\n"
            "++ start line \n"
            "12 12:56:34 \n"
            "cat, camel, cow, dog \n"
            "calling pqr \n"
            "animal already added\n"
            "-- exiting line\n\n")

# DOTALL lets . cross newlines; MULTILINE lets ^ match at each line start.
matches = re.finditer(regex, test_str, re.DOTALL | re.MULTILINE)
for match in matches:
    print(match.group(1))
Output:
calling xxxxx
None
calling pqr
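The question asks for the results to be stored in a list, with NULL for blocks that have no calling line, whereas the snippet above only prints them. A minimal sketch of that last step, reusing the same regex. The helper name collect_calling_lines and the use of the string "NULL" as the placeholder are assumptions made for illustration, not part of the original answer:

```python
import re

# Same pattern as in the answer: group 1 is set only when the block
# contains a "calling ..." line before its closing "--".
REGEX = re.compile(
    r"(?:^\+\+(?=(?:(?!\-\-).)*\s+(calling[^\n]+)).*?\s+--)"
    r"|(?:^\+\+(?=(?:(?!\-\-).)*\s+(?!calling[^\n]+)).*?\s+--)",
    re.DOTALL | re.MULTILINE,
)


def collect_calling_lines(text, placeholder="NULL"):
    """Return one entry per block: the calling line, or the placeholder."""
    results = []
    for match in REGEX.finditer(text):
        line = match.group(1)
        # Strip trailing spaces that the capture may include.
        results.append(line.strip() if line else placeholder)
    return results


sample = (
    "++ start line\n22 15:36:53\ndog, cat, monkey, rat\n"
    "calling xxxxx\nanimal already added\n-- exiting line\n"
    "++ start line\n12 12:56:34\ncat, camel, cow, dog\n"
    "animal already added\n-- exiting line\n"
)
print(collect_calling_lines(sample))  # ['calling xxxxx', 'NULL']
```

Note that the pattern relies on "--" not appearing anywhere inside a block before the exiting line; a stray "--" in the data would end the search for that block early.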
Answer 1 (score: 0)
You may want to use multiple patterns: one to separate the blocks and another to search for calling... within each block.
Expression for the block:
^\+\+
(?P<block>[\s\S]+?)
^--.+
Expression for calling...:
^calling.+
As a Python snippet:
import re

# A block runs from a line starting with ++ to a line starting with --.
rx_block = re.compile(r'''
    ^\+\+
    (?P<block>[\s\S]+?)
    ^--.+''', re.MULTILINE | re.VERBOSE)

rx_calling = re.compile(r'''
    ^calling.+
    ''', re.MULTILINE | re.VERBOSE)

# your_string_here is the text to search, e.g. the contents of the file.
numbers = [number.group(0) if number else None
           for block in rx_block.finditer(your_string_here)
           for number in [rx_calling.search(block.group('block'))]]
print(numbers)
which yields
['calling xxxxx', None]
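The nested comprehension above does two things at once: it walks the blocks and it searches each block for a calling line. An unrolled sketch of the same two-pattern idea, run on the two blocks from the question, may be easier to follow. The function name calling_per_block and the sample string are assumptions for illustration:

```python
import re

# Block pattern: from a line starting with ++ up to a line starting with --.
rx_block = re.compile(r"^\+\+(?P<block>[\s\S]+?)^--.+", re.MULTILINE)
# Calling pattern: a line inside the block that starts with "calling".
rx_calling = re.compile(r"^calling.+", re.MULTILINE)


def calling_per_block(text):
    """Return the calling line of each block, or None when there is none."""
    results = []
    for block in rx_block.finditer(text):
        hit = rx_calling.search(block.group("block"))
        results.append(hit.group(0) if hit else None)
    return results


sample = (
    "++ start line\n22 15:36:53\ndog, cat, monkey, rat\n"
    "calling xxxxx\nanimal already added\n-- exiting line\n"
    "++ start line\n12 12:56:34\ncat, camel, cow, dog\n"
    "animal already added\n-- exiting line\n"
)
print(calling_per_block(sample))  # ['calling xxxxx', None]
```

Keeping the two patterns separate means the calling pattern can be changed without touching the block delimiter logic.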
Answer 2 (score: 0)
You can use split to get the sub-parts and check each of them:
outlist = []
with open("calling.txt", "r") as ff:
    lines = ff.read()

# Split on the start marker and drop empty or whitespace-only pieces.
records = lines.split("++ start line")
records = list(filter(lambda x: len(x.strip()) > 0, records))
for rec in records:
    found = False
    rows = rec.split("\n")
    for row in rows:
        if not found and row.startswith("calling"):
            # Keep only the word after "calling".
            outlist.append(row.split(" ")[1])
            found = True
    if not found:
        outlist.append("NULL")
print(outlist)
Output:
['xxxxx', 'NULL', 'pqr']
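Note that this returns only the word after calling, while the expected output in the question keeps the whole line. A possible variant, not the method of this answer, is to walk the input line by line and track whether the current block has produced a calling line yet. This is a sketch under assumptions: the function name calling_lines is hypothetical, and any line starting with ++ or -- is treated as a block boundary:

```python
def calling_lines(lines, placeholder="NULL"):
    """Scan lines once; emit the calling line (or placeholder) per block."""
    results = []
    current = None    # calling line seen in the current block, if any
    in_block = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("++"):
            # A new block begins; forget anything from the previous one.
            in_block = True
            current = None
        elif line.startswith("--") and in_block:
            # The block ends; record what was found, or the placeholder.
            results.append(current if current is not None else placeholder)
            in_block = False
        elif in_block and current is None and line.startswith("calling"):
            current = line
    return results


sample = """++ start line
22 15:36:53
dog, cat, monkey, rat
calling xxxxx
animal already added
-- exiting line
++ start line
12 12:56:34
cat, camel, cow, dog
animal already added
-- exiting line
"""
print(calling_lines(sample.splitlines()))  # ['calling xxxxx', 'NULL']
```

Because the function accepts any iterable of lines, an open file object such as the ff handle for calling.txt above could be passed in directly, which avoids reading the whole file into memory.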