Generating XML documents with lxml and changing element text and attributes based on logic

Time: 2014-02-17 17:18:00

Tags: lxml

I have lxml code like this:

from lxml import etree
import sys
fd = open('D:\\text.xml', 'wb')
xmlns = "http://www.fpml.org/FpML-5/confirmation"
xsi  = "http://www.w3.org/2001/XMLSchema-instance"
fpmlVersion="http://www.fpml.org/FpML-5/confirmation ../../fpml-main-5-6.xsd http://www.w3.org/2000/09/xmldsig# ../../xmldsig-core-schema.xsd"
page = etree.Element("{"+xmlns+"}dataDocument",nsmap={None:xmlns,'xsi':xsi })
doc = etree.ElementTree(page)
page.set("fpmlVersion", fpmlVersion)
trade = etree.SubElement(page,'trade')
tradeheader = etree.SubElement(trade,'tradeheader')
partyTradeIdentifier = etree.SubElement(tradeheader,'partyTradeIdentifier')
partyReference = etree.SubElement(partyTradeIdentifier,'partyReference',href='party1')
tradeId = etree.SubElement(partyTradeIdentifier,'tradeId',tradeIdScheme='http://www.partyA.com/swaps/trade-id')
tradeId.text = 'TW9235'
swap = etree.SubElement(trade,'swap')
party = etree.SubElement(page,'party',id='party1')
partyID = etree.SubElement(party,'partyID')
partyID.text = 'PARTYAUS33'
partyName = etree.SubElement(party,'partyName')
partyName.text = 'Party A'
party = etree.SubElement(page,'party',id='party2')
partyID = etree.SubElement(party,'partyID')
partyID.text = 'BARCGB2L'
partyName = etree.SubElement(party,'partyName')
partyName.text = 'Party B'
s = etree.tostring(doc, xml_declaration=True,encoding="UTF-8",pretty_print=True)
print(s)
fd.write(s)
fd.close()  # close the file so the output is flushed to disk

I need to generate an XML file like this:

<?xml version='1.0' encoding='UTF-8'?>
<dataDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.fpml.org/FpML-5/confirmation" fpmlVersion="http://www.fpml.org/FpML-5/confirmation ../../fpml-main-5-6.xsd http://www.w3.org/2000/09/xmldsig# ../../xmldsig-core-schema.xsd">
  <trade>
    <tradeheader>
      <partyTradeIdentifier>
        <partyReference href="party1"/>
        <tradeId tradeIdScheme="http://www.partyA.com/swaps/trade-id">TW9235</tradeId>
      </partyTradeIdentifier>
    </tradeheader>
    <swap/>
  </trade>
  <party id="party1">
    <partyID>PARTYAUS33</partyID>
    <partyName>Party A</partyName>
  </party>
  <party id="party2">
    <partyID>BARCGB2L</partyID>
    <partyName>Party B</partyName>
  </party>
</dataDocument>

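The code above works. However, I need to generate 10k such files in which the element text or attributes differ. For example, partyID might be PARTYGER45 instead of PARTYAUS33. Is there a clean way to do this rather than hard-coding the values? Likewise, I need to change many other things, such as the tradeId TW9235.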

1 answer:

Answer 0: (score: 1)

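One approach is to load the output XML, without the values, into lxml objectify, then loop over the values, setting the relevant ones and writing each result to a file. For example: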

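from lxml import etree, objectify

# Read the template (the target XML without the values) once.
with open('in.xml', 'rb') as f_in:
    template = f_in.read()

for pId in ['PARTYGER45', ...]:
    # Parse a fresh copy of the template for every output file.
    dataDocument = objectify.fromstring(template)
    dataDocument.party.partyID._setText(pId)
    ...
    obj_xml = etree.tostring(dataDocument)
    # tostring() returns bytes, so open the output file in binary mode.
    with open('out_%s.xml' % pId, 'wb') as f_out:
        f_out.write(obj_xml)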
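
To make the objectify idea concrete, here is a minimal self-contained sketch. It is only an illustration under some assumptions: the template is a cut-down copy of the document from the question kept in memory rather than in in.xml, and the function name render, the rows list, and the extra values in it are hypothetical.

```python
from lxml import etree, objectify

# Assumed template: a cut-down copy of the target document with placeholder values.
TEMPLATE = b"""<dataDocument xmlns="http://www.fpml.org/FpML-5/confirmation">
  <trade>
    <tradeheader>
      <partyTradeIdentifier>
        <partyReference href="party1"/>
        <tradeId tradeIdScheme="http://www.partyA.com/swaps/trade-id">TW9235</tradeId>
      </partyTradeIdentifier>
    </tradeheader>
  </trade>
  <party id="party1">
    <partyID>PARTYAUS33</partyID>
    <partyName>Party A</partyName>
  </party>
</dataDocument>"""


def render(party_id, trade_id):
    """Fill one copy of the template and return it as UTF-8 bytes (hypothetical helper)."""
    root = objectify.fromstring(TEMPLATE)  # parse a fresh tree for every document
    # Attribute-style access looks children up in the parent's namespace.
    root.party.partyID._setText(party_id)
    root.trade.tradeheader.partyTradeIdentifier.tradeId._setText(trade_id)
    return etree.tostring(root, xml_declaration=True, encoding="UTF-8")


# Hypothetical input rows; in practice these might come from a CSV file or a database.
rows = [("PARTYGER45", "TW9236"), ("PARTYFRA12", "TW9237")]
documents = {party_id: render(party_id, trade_id) for party_id, trade_id in rows}
```

Each value in documents is a bytes object that can be written to a file opened in 'wb' mode. Attributes can be changed in the same loop with the usual set() call, for example root.party.set('id', ...).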

Another option might be to use lxml with XSLT, again starting from an empty structured XML document and transforming it as needed.
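
The XSLT route could look roughly like the sketch below. It is not taken from the answer: the stylesheet, the parameter names partyId and tradeId, the in-memory template, and the helper render are all assumptions made for illustration. The stylesheet is an identity transform with one override per element whose text varies, and the values are passed in as stylesheet parameters.

```python
from lxml import etree

# Assumed stylesheet: copy everything unchanged, except the two elements overridden below.
STYLESHEET = etree.XML(b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="http://www.fpml.org/FpML-5/confirmation">
  <xsl:param name="partyId"/>
  <xsl:param name="tradeId"/>
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="f:party[@id='party1']/f:partyID">
    <xsl:copy><xsl:apply-templates select="@*"/><xsl:value-of select="$partyId"/></xsl:copy>
  </xsl:template>
  <xsl:template match="f:tradeId">
    <xsl:copy><xsl:apply-templates select="@*"/><xsl:value-of select="$tradeId"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>""")

# Assumed template: the target structure with the variable values left empty.
TEMPLATE = etree.XML(b"""<dataDocument xmlns="http://www.fpml.org/FpML-5/confirmation">
  <trade><tradeheader><partyTradeIdentifier>
    <partyReference href="party1"/>
    <tradeId tradeIdScheme="http://www.partyA.com/swaps/trade-id"/>
  </partyTradeIdentifier></tradeheader></trade>
  <party id="party1"><partyID/><partyName>Party A</partyName></party>
</dataDocument>""")

transform = etree.XSLT(STYLESHEET)  # compile the stylesheet once and reuse it


def render(party_id, trade_id):
    """Apply the stylesheet with the given values and return UTF-8 bytes (hypothetical helper)."""
    result = transform(
        TEMPLATE,
        # strparam() quotes arbitrary strings safely as XSLT string parameters.
        partyId=etree.XSLT.strparam(party_id),
        tradeId=etree.XSLT.strparam(trade_id),
    )
    return etree.tostring(result, xml_declaration=True, encoding="UTF-8")
```

Because the stylesheet is compiled once, calling render in a loop over 10k value sets only repeats the transform and serialization steps.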