I have a file with the following structure:
******
Block 1
text
text
...
End
******
Block 2
text
text
...
End
******
Block 3
text
text
...
End
******
and so on. I want to open the file, read it line by line, and save the contents of the first block in a string. This is what I have so far:
Block = ''
with open(File) as file:
    for line in file:
        if re.match('\.Block.*', line):
            Block += line
        if 'str' in line:
            break
print(Block)
However, when I print Block, I get:
Block 1
Block 2
...
How can I use a regular expression to copy the lines from Block 1 through End? Thanks.
Answer 0 (score: 1)
You can use itertools.groupby:
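import itertools, re

# Read all lines, dropping the trailing newline, and close the file afterwards.
with open('filename.txt') as f:
    lines = [i.strip('\n') for i in f]
# Group consecutive lines by "is a row of asterisks" and keep only the non-separator groups.
first_result, *_ = [list(b) for a, b in itertools.groupby(lines, key=lambda x: bool(re.findall(r'^\*+$', x))) if not a]
print(first_result)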
Output:
['Block 1', 'text', 'text', '...', 'End ']
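To try the grouping idea without a file on disk, here is a minimal sketch that runs it on an in-memory sample. The `sample` string and the `is_separator` and `first_block` names are made up for illustration. Because `groupby` is lazy, returning at the first non-separator group avoids building a list of every block.

```python
import itertools
import re

# Hypothetical sample mirroring the file layout from the question.
sample = """******
Block 1
text
text
End
******
Block 2
text
End
******
"""


def is_separator(line):
    """True for a line made up only of asterisks."""
    return bool(re.fullmatch(r'\*+', line))


def first_block(lines):
    """Return the first run of consecutive non-separator lines."""
    for separator, group in itertools.groupby(lines, key=is_separator):
        if not separator:
            return list(group)
    return []


print(first_block(sample.splitlines()))
# ['Block 1', 'text', 'text', 'End']
```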
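Answer 1 (score: 0)
You are only keeping the lines that match your regular expression, which is why only the block headers show up. To capture everything inside a block, you have to do a bit more work and track whether you are currently inside one.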
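import re

Block = ''
Match = False  # True while we are inside the block we want
with open(File) as file:  # File is the path variable from the question
    for line in file:
        # Start collecting at the header of the first block.
        if re.match(r'Block 1\b', line):
            Match = True
        if Match:
            Block += line
            # The End line closes the block (it may carry trailing whitespace).
            if re.match(r'End\s*$', line):
                break
print(Block)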
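The same flag-based idea can be sketched as a small reusable function. The name `read_block` and its `start`/`end` parameters are hypothetical. It uses plain string comparison instead of a regex, on the assumption that the header and the closing marker each sit alone on their line.

```python
def read_block(lines, start='Block 1', end='End'):
    """Collect lines from the `start` header through the `end` marker."""
    collected = []
    inside = False
    for line in lines:
        stripped = line.strip()
        if not inside and stripped == start:
            inside = True
        if inside:
            collected.append(line)
            if stripped == end:
                break  # stop reading once the block is closed
    return ''.join(collected)


# Hypothetical in-memory lines standing in for the file.
demo = ["******\n", "Block 1\n", "text\n", "End \n",
        "******\n", "Block 2\n", "text\n", "End \n"]
print(read_block(demo))
```

Since the function only iterates over its argument, an open file object can be passed in directly, for example `with open(File) as fh: Block = read_block(fh)`.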
Answer 2 (score: 0)
import re

with open(File) as ff:
    txt = ff.read()  # read the whole file into one string
re.findall(r"(?ms)^\s*Block\s*\d+.*?^\s*End\s*$", txt)
Out:
['Block 1\ntext\ntext\n...\nEnd ',
'Block 2\ntext\ntext\n...\nEnd ',
'Block 3\ntext\ntext\n...\nEnd ']
Alternatively, change '\d+' to '1' to get only the first block.
(?ms): m: multiline mode, so that ^ and $ match at the start and end of each line,
s: '.' matches newlines, too.
?: makes '.*?' non-greedy, so the match stops at the first End.
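One caveat when hard-coding the number: a pattern beginning with "Block 1" would also accept "Block 10". A minimal sketch of one way around that, assuming the header sits at the start of its line, is to add a word boundary and use `re.search`, which stops at the first match. The `txt` sample and the `pattern` name are made up for illustration.

```python
import re

# Hypothetical sample; "Block 10" comes first to show it is skipped.
txt = "******\nBlock 10\nten\nEnd \n******\nBlock 1\ntext\nEnd \n******\n"

# \b after the number keeps "Block 1" from matching "Block 10".
# [ \t]* tolerates trailing spaces after End without consuming the newline.
pattern = re.compile(r"(?ms)^Block 1\b.*?^End[ \t]*$")

match = pattern.search(txt)
first = match.group(0) if match else None
print(repr(first))
# 'Block 1\ntext\nEnd '
```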