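I am looking for a function that takes an integer (a line number) as its argument and returns the XML line at that position.
I have a large XML file that I want to split into many smaller files. Each output file should have an opening tag and a closing tag.
For example:
Input file: Test.xml
Output files:
Test1.xml Test2.xml Test3.xml Test4.xml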
from lxml import etree  # assumption: xml.etree.ElementTree offers the same parse/getroot calls

tree = etree.parse(file_name)
root = tree.getroot()
# Here I count the number of XML lines in my file
xml_lines = 0
for child in root:
    xml_lines += 1
# Here I want to get the string of an XML line by giving its number
for i in range(counter, counter + number_of_each_file):
    d.write(FUNCTION)
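Answer 0: (score: 0)
I think you should change your approach to splitting the large XML file into smaller ones. XML does not care about lines; it cares about elements. Your function should take the root of the large XML, a dest_file_name_prefix, and a number giving how many elements each small XML file should contain.
Something like: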
def split_xml(root, dest_file_name_prefix, num_of_elements):
    """Loop over the elements under the root and save each collection of 'num_of_elements' to a file with a unique name."""
    elements = root.findall('.//element')
    counter = 0
    temp = []
    for element in elements:
        temp.append(element)
        # a full batch has been collected
        if len(temp) == num_of_elements:
            # save the elements to a 'small' file
            counter += 1
            file_name = '{}_{}'.format(dest_file_name_prefix, counter)
            # TODO: I assume you know how to save the elements from temp to a file
            temp = []
    # TODO: any elements still left in temp belong in one last, shorter file
Example of the large XML:
<root>
    <element id="0"></element>
    <element id="1"></element>
    <element id="2"></element>
    ...
    <element id="10000"></element>
</root>
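To fill in the saving step left as a TODO above, here is a minimal sketch of one way it could be done. It uses the standard library's xml.etree.ElementTree rather than lxml, works on the direct children of the root instead of searching for a fixed 'element' tag, and names the files prefix + number + '.xml' to match the Test1.xml pattern from the question. Those choices, and the returned list of file names, are assumptions made for illustration and not part of the original answer.

```python
import xml.etree.ElementTree as ET


def split_xml(root, dest_file_name_prefix, num_of_elements):
    """Write the direct children of root to numbered files, num_of_elements per file.

    Returns the list of file names that were written.
    """
    children = list(root)
    file_names = []
    for start in range(0, len(children), num_of_elements):
        # Give every small file its own root with the same tag and attributes,
        # so each output has a proper opening and closing tag.
        small_root = ET.Element(root.tag, dict(root.attrib))
        small_root.extend(children[start:start + num_of_elements])
        # Hypothetical naming scheme: Test1.xml, Test2.xml, ...
        file_name = '{}{}.xml'.format(dest_file_name_prefix, len(file_names) + 1)
        ET.ElementTree(small_root).write(file_name, encoding='utf-8', xml_declaration=True)
        file_names.append(file_name)
    return file_names
```

It could be called as split_xml(ET.parse('Test.xml').getroot(), 'Test', 2500). The last file simply holds whatever elements remain. With xml.etree.ElementTree the same element object can sit in both the original and the new tree; with lxml, appending an element to a new parent moves it out of the old one. For the original question of getting one element as text by its position, ET.tostring(root[i], encoding='unicode') returns the serialized i-th child.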