Problem writing edited XML content to another file - python

Time: 2015-08-24 09:29:03

Tags: python xml file

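I have two XML files, shown below, and I want to use File A to check the order of File B (File B should follow the order of File A). I have written a program, shown below, that does the reordering; the only problem is that I cannot correctly write the output to another XML file. Before asking here I did research how to write an edited XML file back to the source or to another file, but perhaps I am missing something very small.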

File A

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<p1:sample1 xmlns:p1="http://www.example.org/eHorizon">
<p1:time nTimestamp="1">
   <p1:location hours = "1" path = '1'>       
      <p1:feature color="6" type="a">560</p1:feature>
   </p1:location>
</p1:time>
<p1:time nTimestamp="2">
   <p1:location hours = "1" path = '1'>
      <p1:feature color="2" type="a">564</p1:feature>         
   </p1:location>
</p1:time>
<p1:time nTimestamp="3">
   <p1:location hours = "1" path = '1'>       
      <p1:feature color="6" type="a">560</p1:feature>          
   </p1:location>
</p1:time>
</p1:sample1>

File B

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<p1:sample1 xmlns:p1="http://www.example.org/eHorizon">
<p1:time nTimestamp="1">
   <p1:location hours = "1" path = '1'>       
      <p1:feature color="6" type="a">560</p1:feature>     
   </p1:location>
</p1:time>
<p1:time nTimestamp="3">
   <p1:location hours = "1" path = '1'>
      <p1:feature color="6" type="a">560</p1:feature>     
   </p1:location>
</p1:time>
<p1:time nTimestamp="2">
   <p1:location hours = "1" path = '1'>       
      <p1:feature color="2" type="a">564</p1:feature>      
   </p1:location>
</p1:time>
</p1:sample1>

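For your information, the only difference here is the order of the whole p1:time elements, identified by their nTimestamp attribute, together with their child elements location and feature. In File A the order is 1, 2, 3, ... and in File B it is 1, 3, 2, ... (I am talking about each whole p1:time element and everything inside it).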
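To make the difference concrete, the order can be read straight from the nTimestamp attributes. This is a minimal sketch, assuming shortened inline copies of the two files and the standard library's xml.etree.ElementTree (the code in this question uses lxml); timestamp_order is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the sample documents, in the {uri}tag form that findall expects.
NS = "{http://www.example.org/eHorizon}"

def timestamp_order(xml_text):
    """Return the nTimestamp values of the top-level time elements, in document order."""
    root = ET.fromstring(xml_text)
    return [t.get("nTimestamp") for t in root.findall(NS + "time")]

# Shortened stand-ins for File A and File B (assumption: children omitted for brevity).
FILE_A = """<p1:sample1 xmlns:p1="http://www.example.org/eHorizon">
<p1:time nTimestamp="1"/><p1:time nTimestamp="2"/><p1:time nTimestamp="3"/>
</p1:sample1>"""
FILE_B = """<p1:sample1 xmlns:p1="http://www.example.org/eHorizon">
<p1:time nTimestamp="1"/><p1:time nTimestamp="3"/><p1:time nTimestamp="2"/>
</p1:sample1>"""

order_a = timestamp_order(FILE_A)
order_b = timestamp_order(FILE_B)
print(order_a)             # ['1', '2', '3']
print(order_b)             # ['1', '3', '2']
print(order_a == order_b)  # False: File B does not follow File A
```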

What I want

from lxml import etree



recovering_parser = etree.XMLParser(recover=True)

Reference = etree.parse("C:/Users/your_location/Desktop/sample1.xml", parser=recovering_parser)
Copy = etree.parse("C:/Users/your_location/Desktop/sample2.xml", parser=recovering_parser)


ReferenceTest = Reference.findall("{http://www.example.org/eHorizon}time") #find all time elements in sample1
CopyTest = Copy.findall("{http://www.example.org/eHorizon}time") #find all time elements in sample2

a=[] #list for storing sample1's Time elements
b=[] #list for storing sample2's Time elements
new_list=[] #for storing sorted data

for i,j in zip(ReferenceTest,CopyTest):

    a.append((i, i.attrib.get("nTimestamp"))) # store data in format [(<Element {http://www.example.org/eHorizon}time at 0x213d738>, '1')  
                                              # where 1,2 or 3 is ntimestamp attribute and corresponding parent 'time' element of that attribute
    b.append((j, j.attrib.get("nTimestamp"))) # same as above 

def sortTimestamps(a,b):   #use this function to sort elements in 'b' list in such a manner that they follow sequence of 'a' list 

    for i in a:
        for j in b:
            if i[1]==j[1]:
                s = a.index(i)
                t = b.index(j)
                b[t],b[s]=b[s],b[t]     



sortTimestamps(a, b)  # call sort function 

for i in b:
    new_list.append(i[0]) # store the sorted timestamps in new_list


CopyTest = new_list # assign new sorted list of time elements to old list

Copy.write("C:/Users/your_location/Desktop/output_data.xml") # write data to another file and check results 

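The code above does the job of sorting File B according to the order of File A. But when the program writes to another file, it writes File B's data as is, that is, in the same order shown in File B above. After sorting, I expect the order of File B's data to be modified, and the data should be written in the order given in File A.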
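A likely explanation for the unchanged output, offered as a sketch rather than a confirmed diagnosis: findall returns an ordinary Python list, so sorting that list or assigning a new list to the same variable name never touches the parsed tree that write serializes. The example below assumes a tiny made-up document and uses the standard library's xml.etree.ElementTree instead of lxml.

```python
import xml.etree.ElementTree as ET

# A tiny made-up document whose children are out of order (illustrative only).
root = ET.fromstring("<root><item id='1'/><item id='3'/><item id='2'/></root>")

# findall returns a new list that merely references the child elements.
items = root.findall("item")

# Sorting and rebinding the name changes the list, not the tree.
items = sorted(items, key=lambda e: e.get("id"))

list_order = [e.get("id") for e in items]
tree_order = [e.get("id") for e in root]  # iterating an element yields its children

print(list_order)  # ['1', '2', '3']
print(tree_order)  # ['1', '3', '2']
```

Anything written from the tree at this point would still show the original order, which matches the behavior described above.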

What I tried

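Apart from the program above, I tried reading more about writing files, but it got me nowhere. I checked the format of my XML, and I believe it is completely fine. Finally, I also followed the tutorial here to see how it demonstrates writing, but that approach did not work either. Maybe you can help me.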
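One possible direction, offered as a sketch rather than a confirmed fix: reorder the elements inside the parsed tree itself and only then write it. The example uses the standard library's xml.etree.ElementTree instead of lxml (an assumption made so that it runs without extra packages; findall, remove, extend and write are also available on lxml trees). The helper name reorder_like, the shortened inline XML, and the choice to place timestamps missing from File A at the end are all illustrative assumptions.

```python
import io
import xml.etree.ElementTree as ET

URI = "http://www.example.org/eHorizon"
TIME = "{%s}time" % URI

def reorder_like(reference_root, copy_root):
    """Reorder copy_root's time children in place so they follow reference_root."""
    # Position of each timestamp in the reference document.
    rank = {t.get("nTimestamp"): i
            for i, t in enumerate(reference_root.findall(TIME))}
    times = copy_root.findall(TIME)
    # Assumption: timestamps absent from the reference sort to the end.
    ordered = sorted(times, key=lambda t: rank.get(t.get("nTimestamp"), len(rank)))
    # Detach the time elements, then re-attach them in the new order.
    # Any non-time siblings would stay in place, ahead of the re-attached elements.
    for t in times:
        copy_root.remove(t)
    copy_root.extend(ordered)

# Shortened stand-ins for File A and File B (assumption).
FILE_A = """<p1:sample1 xmlns:p1="http://www.example.org/eHorizon">
<p1:time nTimestamp="1"/><p1:time nTimestamp="2"/><p1:time nTimestamp="3"/>
</p1:sample1>"""
FILE_B = """<p1:sample1 xmlns:p1="http://www.example.org/eHorizon">
<p1:time nTimestamp="1"><p1:location hours="1" path="1"/></p1:time>
<p1:time nTimestamp="3"><p1:location hours="1" path="1"/></p1:time>
<p1:time nTimestamp="2"><p1:location hours="1" path="1"/></p1:time>
</p1:sample1>"""

ET.register_namespace("p1", URI)  # keep the p1: prefix when serializing
reference_root = ET.fromstring(FILE_A)
copy_tree = ET.ElementTree(ET.fromstring(FILE_B))

reorder_like(reference_root, copy_tree.getroot())

# An in-memory buffer stands in for the output file; a file path works the same way.
buffer = io.BytesIO()
copy_tree.write(buffer, encoding="UTF-8", xml_declaration=True)

print([t.get("nTimestamp") for t in copy_tree.getroot().findall(TIME)])  # ['1', '2', '3']
```

Because each time element is moved as a whole, its location and feature children travel with it.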

Edit: I removed the code from the link and added it here. I had originally put it behind a link to keep the post short.

0 answers:

No answers yet