I'm trying to flatten an XML file and write it to CSV so that an ETL process can consume it.
<Answers>
  <AnswersList>
    <Entry key="qs_location_name" type="System.String">
      <value>Location Name</value>
    </Entry>
    <Entry key="qs_location_riskAddress1" type="System.String">
      <value>Risk Address 1</value>
    </Entry>
    <Entry key="qs_location_riskAddress2" type="System.String">
      <value>Risk Address 2</value>
    </Entry>
  </AnswersList>
</Answers>
My code is as follows:
from io import StringIO

from lxml import etree

# xml_file holds the XML document as a string
tree = etree.parse(StringIO(xml_file))
root = tree.getroot().tag
for node in tree.iter():
    for child in node:  # iterating an element yields its direct children
        # Skip elements whose text is missing or whitespace-only
        if child.text and child.text.strip():
            path = ".".join(tree.getelementpath(child).split("/"))
            print("{}.{} = {}".format(root, path, child.text.strip()))
The code above produces the following output:
AustraliaBizPackProposal.Answers.AnswersList.Entry[1].value = Location Name
AustraliaBizPackProposal.Answers.AnswersList.Entry[2].value = Risk Address 1
AustraliaBizPackProposal.Answers.AnswersList.Entry[3].value = Risk Address 2
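What I expect is the output below, with each Entry identified by its key attribute instead of its position. Please advise.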
AustraliaBizPackProposal.Answers.AnswersList.qs_location_name.value = Location Name
AustraliaBizPackProposal.Answers.AnswersList.qs_location_riskAddress1.value = Risk Address 1
AustraliaBizPackProposal.Answers.AnswersList.qs_location_riskAddress2.value = Risk Address 2
Answer 0 (score: 0)
This code works for this specific file:
root = tree.getroot().tag
for node in tree.iter():
    for child in node:
        if child.tag == 'Entry':
            # Path of the Entry's parent, e.g. "Answers.AnswersList"
            path = ".".join(tree.getelementpath(child).split("/")[:-1])
            key = child.attrib['key']
            for val in child:
                try:
                    print("{}.{}.{}.{} = {}".format(root, path, key, val.tag, val.text.strip()))
                except AttributeError:
                    # val.text is None: fall back to the 'Text' attribute
                    print("{}.{}.{}.{} = {}".format(root, path, key, val.tag, val.attrib['Text']))
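Since the stated goal is a CSV file for ETL, here is a rough sketch of how the flattening and the CSV step could fit together. It is not the code from the question: it swaps in the standard-library xml.etree.ElementTree for lxml, and the names XML_TEXT, flatten and to_csv are made up for illustration. The AustraliaBizPackProposal root is only inferred from the output shown in the question, and the "use the key attribute when present, otherwise the tag" rule is an assumption about what the desired naming is.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample input. The root element is inferred from the question's output,
# not taken from the real file.
XML_TEXT = """<AustraliaBizPackProposal>
  <Answers>
    <AnswersList>
      <Entry key="qs_location_name" type="System.String">
        <value>Location Name</value>
      </Entry>
      <Entry key="qs_location_riskAddress1" type="System.String">
        <value>Risk Address 1</value>
      </Entry>
      <Entry key="qs_location_riskAddress2" type="System.String">
        <value>Risk Address 2</value>
      </Entry>
    </AnswersList>
  </Answers>
</AustraliaBizPackProposal>"""


def flatten(element, prefix=""):
    """Yield (dotted_path, text) for every element with non-blank text."""
    # Assumed naming rule: prefer the "key" attribute over the tag name.
    name = element.get("key", element.tag)
    path = f"{prefix}.{name}" if prefix else name
    text = (element.text or "").strip()
    if text:
        yield path, text
    for child in element:
        yield from flatten(child, path)


def to_csv(xml_text):
    """Return the flattened document as CSV text with a header row."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["path", "value"])
    writer.writerows(flatten(root))
    return buffer.getvalue()


rows = list(flatten(ET.fromstring(XML_TEXT)))
csv_text = to_csv(XML_TEXT)
print(csv_text)
```

To write a real file, the StringIO buffer could be replaced with open("out.csv", "w", newline=""). Building the path recursively avoids parsing the Entry[1]-style indexes that getelementpath returns, at the cost of not disambiguating repeated siblings that have no key attribute.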