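I have a text file that consists of several XML documents concatenated together. The goal is to split that text into multiple XML files.
This is the code I was given to start from: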
    def split_file(filename):
        """
        Split the input file into separate files, each containing a single patent.
        As a hint - each patent declaration starts with the same line that was
        causing the error found in the previous exercises.
        """
        f = open(filename, 'r').read().split('\n')
        last_header_line = 0
        counter_q_of_files = 0
        for line in enumerate(f):
            lines_ls = []
            ## add that line to an object that will be converted in xml file on the next condition
            ## code here
            if line[1] == '<?xml version="1.0" encoding="UTF-8"?>':
                ## Make an xml file out of the previously created object, from lines_ls[last_header_line:line[0]]
                ## code here
                last_header_line = line[0]
                counter_q_of_files = counter_q_of_files+1
Can I build a list of strings (one element per line of the future XML file) and turn that list into an XML file? If so, how?
Answer 0: (score: 0)
If you have plenty of memory, you can split the whole text on '<?xml version="1.0" encoding="UTF-8"?>' instead of on '\n'.
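A minimal sketch of that idea, assuming the whole file fits in memory and that every document starts with exactly this declaration. The names split_on_declaration and XML_DECLARATION and the '<filename>-<n>' output naming are made up for illustration; they are not part of the exercise. Because str.split drops the separator, the declaration has to be put back in front of each piece.

```python
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>'


def split_on_declaration(text, declaration=XML_DECLARATION):
    """Return one string per XML document found in text."""
    pieces = text.split(declaration)
    # pieces[0] is whatever precedes the first declaration (assumed to be
    # empty or irrelevant here), so it is skipped. str.split removed the
    # declaration itself, so it is re-attached to every remaining piece.
    return [declaration + piece for piece in pieces[1:] if piece.strip()]


def split_file(filename):
    """Write each document to '<filename>-<n>' (hypothetical naming scheme)."""
    with open(filename, 'r', encoding='utf-8') as source:
        documents = split_on_declaration(source.read())
    for index, document in enumerate(documents):
        out_name = '{}-{}'.format(filename, index)
        with open(out_name, 'w', encoding='utf-8') as target:
            target.write(document)
    return len(documents)
```

Here enumerate is only used to number the output files, and the pair it yields is unpacked directly into index and document.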
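If you want to accumulate lines instead, read up on lists, especially the append method. The join string method is also useful. The way your code uses enumerate is not quite right (the usual form unpacks the pair, as in for i, line in enumerate(f)), but I don't think you need it here anyway.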
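The line-accumulating approach could look roughly like this. It is a sketch, not the exercise's reference solution: group_lines and XML_DECLARATION are hypothetical names, and the comparison strips surrounding whitespace, which is slightly more lenient than the exact match in the question's code. No line index is needed, so enumerate does not appear.

```python
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>'


def group_lines(lines, declaration=XML_DECLARATION):
    """Group lines into documents, starting a new one at each declaration."""
    documents = []  # finished documents, each joined into a single string
    current = []    # lines of the document currently being collected
    for line in lines:
        # A declaration closes the document collected so far, if there is one.
        if line.strip() == declaration and current:
            documents.append('\n'.join(current))
            current = []
        current.append(line)
    # The last document is not followed by a declaration, so flush it here.
    if current:
        documents.append('\n'.join(current))
    return documents
```

Each returned string can then be written to its own file with open(name, 'w') and write. The important detail is that the list being filled is created outside the loop and reset only when a new declaration is seen.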