我有以下XML文档,我想写入csv文件。
<items>
<item>
<attribute type="set" identifier="naadloos">
<name locale="nl_NL">Naadloos</name>
<value locale="nl_NL" identifier="nee">Nee</value>
</attribute>
<attribute type="asset" identifier="short_description">
<value locale="nl_NL">Tom beugel bh</value>
</attribute>
<attribute type="text" identifier="name">
<name locale="nl_NL">Naam</name>
<value>Marie Jo L'Aventure Tom beugel bh</value>
</attribute>
<attribute type="int" identifier="is_backorder">
<name locale="nl_NL">Backorder</name>
<value>2</value>
</attribute>
</item>
</items>
如何从此格式检索数据?我需要以下输出
naadloos, short_description, name, is_Backorder
Nee, Tom beugel bh, Marie Jo L'Adventure Tom beugel bh, 2
所以我需要属性行中的标识符和值行中的文本。
有什么想法吗?
非常感谢
答案 0 :(得分:0)
这是我elements
attribute
的{{1}}尝试,并dictwriter
将其写入指定的文件!
import lxml.etree as et
import csv
#headers={}
xml= """<items>
<item>
<attribute type="set" identifier="naadloos">
<name locale="nl_NL">Naadloos</name>
<value locale="nl_NL" identifier="nee">Nee</value>
</attribute>
<attribute type="asset" identifier="short_description">
<value locale="nl_NL">Tom beugel bh</value>
</attribute>
<attribute type="text" identifier="name">
<name locale="nl_NL">Naam</name>
<value>Marie Jo L'Aventure Tom beugel bh</value>
</attribute>
<attribute type="int" identifier="is_backorder">
<name locale="nl_NL">Backorder</name>
<value>2</value>
</attribute>
</item>
</items>
"""
tree = et.fromstring(xml)
header = []
for i in tree.xpath("//attribute/@identifier"):
header.append(i)
def dicter(x):
exp = r"//attribute[@identifier='%s']/value/text()"%x
tmp = ''.join(tree.xpath(exp))
d = [x,tmp]
return d
data = dict(dicter(i) for i in header)
#Now write data into file
with open(r"C:\Users\User_Name\Desktop\output.txt",'wb') as wrt:
writer = csv.DictWriter(wrt,header)
writer.writeheader()
writer.writerow(data)
书面文件内容 -
naadloos,short_description,name,is_backorder
Nee,Tom beugel bh,Marie Jo L'Aventure Tom beugel bh,2