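Is there a way to scan an entire text document, find every occurrence of "lol", and replace it with the id value of the nearest preceding chapter tag? Maybe something like this:
Python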
x = open('source.txt')
lines = x.readlines()
for line in lines:
    if line.startswith('<text'):
        line.replace('lol', first previous chapter id value)
x.write(lines)
x.close()
Source text
<chapter id="1">
<text class="lol">
<text class="lol">
<chapter id="2">
<text class="lol">
<text class="lol">
<chapter id="3">
<text class="lol">
<text class="lol">
<chapter id="4">
<text class="lol">
<text class="lol">
Result text
<chapter id="1">
<text class="1">
<text class="1">
<chapter id="2">
<text class="2">
<text class="2">
<chapter id="3">
<text class="3">
<text class="3">
<chapter id="4">
<text class="4">
<text class="4">
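Answer 0 (score: 3)
Try this. Basically, the only extra thing you need to do is find the chapter id. I'm also assuming you know how to write to a file, which is why I only print each line.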
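import re

with open('source.txt') as x:
    chapter_id = None  # id of the most recent <chapter> tag seen so far
    for line in x:
        if line.startswith('<chapter'):
            # Grab the string between the first pair of double quotes
            chapter_id = re.findall('"([^"]*)"', line)[0]
        if line.startswith('<text') and chapter_id is not None:
            line = line.replace('lol', chapter_id)
        print(line.rstrip('\n'))  # strip the newline; print adds its own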
Output:
<chapter id="1">
<text class="1">
<text class="1">
<chapter id="2">
<text class="2">
<text class="2">
<chapter id="3">
<text class="3">
<text class="3">
<chapter id="4">
<text class="4">
<text class="4">
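If you do want the result written to a file rather than printed, something along these lines might work. This is only a sketch under a few assumptions: the helper names `fill_chapter_ids` and `rewrite_file` are made up for illustration, every `<chapter` and `<text` tag starts at the beginning of its own line (as in the sample), and the output goes to a separate file so the source is not overwritten while it is still being read.

```python
import re

# Matches a chapter tag at the start of a line and captures its id value.
CHAPTER_ID = re.compile(r'<chapter\s+id="([^"]*)"')


def fill_chapter_ids(lines, placeholder='lol'):
    """Yield each line, replacing `placeholder` in <text ...> lines
    with the id of the most recent <chapter> tag (hypothetical helper)."""
    current_id = None
    for line in lines:
        match = CHAPTER_ID.match(line)
        if match:
            current_id = match.group(1)
        elif line.startswith('<text') and current_id is not None:
            line = line.replace(placeholder, current_id)
        yield line


def rewrite_file(src_path, dst_path):
    """Read src_path line by line and write the substituted lines to dst_path."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        dst.writelines(fill_chapter_ids(src))
```

Calling `rewrite_file('source.txt', 'result.txt')` would then produce the file (`result.txt` is just a placeholder name). Note that, like the snippet above, this replaces every "lol" on a `<text` line, not only the one inside the `class` attribute, and a `<text` line that appears before any chapter tag is left unchanged.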