I am trying to modify the text of a document using Python.
The document text looks like this:
abcdefghijklmn
<entry colname="1" rowname="1">a</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="1">a</entry>
<entry colname="1" morerows="9" morerowname="2">b</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="2">b</entry>
<entry colname="1" morerows="9" morerowname="2">b</entry>
<morecols="4" morecolname="3" namest="3" nameend="7" morerows="2" morerowname="3">c</entry>
<entry colname="2" rowname="3">c</entry>
<entry colname="2" rowname="4">d</entry>
<entry morecols="1" morecolname="2" namest="2" nameend="3" morerows="2" morerowname="5">e</entry>
<entry colname="2" rowname="5">e</entry>
abcdefghijklmn
For each number n, I want to append TEST to the end of the last line that contains rowname="n" (morerowname="n" counts as well).
So this is the result I want:
abcdefghijklmn
<entry colname="1" rowname="1">a</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="1">a</entry>TEST
<entry colname="1" morerows="9" morerowname="2">b</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="2">b</entry>
<entry colname="1" morerows="9" morerowname="2">b</entry>TEST
<morecols="4" morecolname="3" namest="3" nameend="7" morerows="2" morerowname="3">c</entry>
<entry colname="2" rowname="3">c</entry>TEST
<entry colname="2" rowname="4">d</entry>TEST
<entry morecols="1" morecolname="2" namest="2" nameend="3" morerows="2" morerowname="5">e</entry>
<entry colname="2" rowname="5">e</entry>TEST
abcdefghijklmn
This is the code I have tried so far, but I don't know how to write the if condition:
import re

with open("C:\\TEST\\test_addrow.xml","r",encoding="utf-8") as f:
    data = f.read()
result = list()
All_text = data.split("\n")
a = 1
find_text = 'rowname="{}".*'.format(a)
for t in All_text:
    if re.search(find_text, data) :
        re.findall(find_text, data)[-1]
        result.append(t+"TEST")
        a = a + 1
    else:
        result.append(t)
with open("C:\\TEST\\test_addrow.xml","w",encoding="utf-8") as f:
    f.write("\n".join(result))
Can you give me any advice? Thanks.
Answer 0 (score: 1)
You can try the code below.
Using the split() method defined on strings is also a good option in this case. Regular expressions work well too, as you used in your code.
import re

# Read the XML file
with open("C:\\TEST\\test_addrow.xml", "r", encoding='utf-8') as f:
    lines = f.readlines()

last_num = ""    # Value of the most recent rowname / morerowname attribute
last_index = 0   # Index of the most recent line that has a rowname or morerowname attribute
opened = False   # True while a run of lines with the same number is still open

for i, line in enumerate(lines):
    # The pattern is unanchored, so it also matches inside morerowname="..."
    arr = re.findall(r"rowname=\"\d+", line)
    arr2 = []

    if arr:
        arr2 = arr[0].split('"')

    if arr2:
        if opened and last_num != arr2[1]:
            # The number changed, so the previous matching line ended its run
            lines[last_index] = lines[last_index].strip() + 'TEST' + '\n'

        opened = True  # This line starts or continues a run
        last_index = i
        last_num = arr2[1]
    else:
        if opened:
            # A line without rowname ends the current run
            lines[last_index] = lines[last_index].strip() + 'TEST' + '\n'
            opened = False  # Added TEST, so close the run

# Close a run that reaches the last line, e.g. if the XML file only has 1 line
if opened:
    lines[last_index] = lines[last_index].strip() + 'TEST' + '\n'

lines = "".join(lines)

# Write the modified lines back to the file
with open("C:\\TEST\\test_addrow.xml", "w", encoding='utf-8') as f:
    f.write(lines)
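
The same idea can be tried on an in-memory list of lines before touching the file. The following is only a minimal sketch, not part of the original answer: the names ROW_RE and mark_run_ends are hypothetical, and it assumes a run ends whenever the next line has a different number or no rowname at all.

```python
import re

# Hypothetical names for illustration. The pattern is unanchored, so it
# matches rowname="n" and also the tail of morerowname="n".
ROW_RE = re.compile(r'rowname="(\d+)"')


def mark_run_ends(lines, marker="TEST"):
    """Return a copy of lines with marker appended to the last line of each run."""
    # Extract the number from every line up front (None if there is none)
    keys = []
    for line in lines:
        m = ROW_RE.search(line)
        keys.append(m.group(1) if m else None)

    result = []
    for i, line in enumerate(lines):
        next_key = keys[i + 1] if i + 1 < len(lines) else None
        if keys[i] is not None and keys[i] != next_key:
            # Last line of its run: the next line differs or does not exist
            result.append(line.rstrip("\n") + marker + "\n")
        else:
            result.append(line)
    return result


sample = [
    "abcdefghijklmn\n",
    '<entry colname="1" rowname="1">a</entry>\n',
    '<entry colname="1" morerows="9" morerowname="2">b</entry>\n',
    '<entry colname="2" rowname="2">b</entry>\n',
    "abcdefghijklmn\n",
]
print("".join(mark_run_ends(sample)), end="")
```

Because the function returns a new list, the result can be inspected first and written out with f.writelines(...) once it looks right.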
Answer 1 (score: 1)
In this case you can play around a bit with what you split on... the logic stays the same:
Input:
$ cat test_addrow.xml
<entry colname="1" rowname="1">a</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="1">a</entry>
<entry colname="1" morerows="9" morerowname="2">b</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="2">b</entry>
<entry colname="1" morerows="9" morerowname="2">b</entry>
<morecols="4" morecolname="3" namest="3" nameend="7" morerows="2" morerowname="3">c</entry>
<entry colname="2" rowname="3">c</entry>
<entry colname="2" rowname="4">d</entry>
<entry morecols="1" morecolname="2" namest="2" nameend="3" morerows="2" morerowname="5">e</entry>
<entry colname="2" rowname="5">e</entry>
Code:
with open('test_addrow.xml') as file:
    lines = file.readlines()

with open('test_addrow.xml', 'w') as file1:
    for i, line in enumerate(lines[:-1]):
        current_n = int(line.split('rowname="')[-1].split('"')[0])
        next_n = int(lines[i+1].split('rowname="')[-1].split('"')[0])
        if next_n != current_n:
            file1.write(line.strip() + "TEST\n")
        else:
            file1.write(line)
    # Write the last line which always has TEST appended
    file1.write(lines[-1].strip() + "TEST\n")
Output:
$ cat test_addrow.xml
<entry colname="1" rowname="1">a</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="1">a</entry>TEST
<entry colname="1" morerows="9" morerowname="2">b</entry>
<entry morecols="5" morecolname="2" namest="2" nameend="7" rowname="2">b</entry>
<entry colname="1" morerows="9" morerowname="2">b</entry>TEST
<morecols="4" morecolname="3" namest="3" nameend="7" morerows="2" morerowname="3">c</entry>
<entry colname="2" rowname="3">c</entry>TEST
<entry colname="2" rowname="4">d</entry>TEST
<entry morecols="1" morecolname="2" namest="2" nameend="3" morerows="2" morerowname="5">e</entry>
<entry colname="2" rowname="5">e</entry>TEST
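
To see what the split expression in the code above extracts, it may help to run it on single lines. The sketch below is an illustration only: row_number is a hypothetical helper, and it adds a guard for lines without a rowname attribute (such as the abcdefghijklmn lines in the question), where the bare int(...) call would raise ValueError.

```python
def row_number(line):
    """Return the number in rowname="n" / morerowname="n", or None if absent."""
    if 'rowname="' not in line:
        return None
    # Take the text after the last rowname=" and cut it at the next quote
    value = line.split('rowname="')[-1].split('"')[0]
    return int(value) if value.isdigit() else None


print(row_number('<entry colname="1" rowname="1">a</entry>'))                   # 1
print(row_number('<entry colname="1" morerows="9" morerowname="2">b</entry>'))  # 2
print(row_number("abcdefghijklmn"))                                             # None
```

Splitting on 'rowname="' also splits inside morerowname="...", which is why both attribute names are handled by the same expression.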