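I'm trying to take data from an API call that returns XML and parse a handful of data points into a CSV file, with each data point in its own column.
The XML looks like this: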
<?xml version="1.0" encoding="utf-8" ?>
<YourMembership_Response>
<Items>
<Item>
<ItemID></ItemID>
<ID>92304823A-2932</ID>
<WebsiteID>0987</WebsiteID>
<NamePrefix></NamePrefix>
<FirstName>John</FirstName>
<MiddleName></MiddleName>
<LastName>Smith</LastName>
<Suffix></Suffix>
<Nickname></Nickname>
<EmployerName>abc company</EmployerName>
<WorkTitle>manager</WorkTitle>
<Date>3/14/2013 2:12:39 PM</Date>
<Description>Removed from group by Administration.</Description>
</Item>
<Item>
<ItemID></ItemID>
<ID>92304823A-2932</ID>
<WebsiteID>0987</WebsiteID>
<NamePrefix></NamePrefix>
<FirstName>John</FirstName>
<MiddleName></MiddleName>
<LastName>Smith</LastName>
<Suffix></Suffix>
<Nickname></Nickname>
<EmployerName>abc company</EmployerName>
<WorkTitle>manager</WorkTitle>
<Date>3/14/2013 2:12:39 PM</Date>
<Description>Removed from group by Administration.</Description>
</Item>
</Items>
</YourMembership_Response>
I've written this code, which writes only the ID to the CSV, and it works fine.
with open("output1.csv", "wb") as f:
    writer = csv.writer(f)
    for node in tree.findall('.//ID'):
        writer.writerow([node.text])
Now, when I try to write multiple data points to the CSV, they all just get appended into a single column. This is the code I've been trying:
with open("test1.csv", "wb") as f:
    writer = csv.writer(f)
    for node in tree.findall('.//ID'):
        writer.writerow([node.text])
    for node in tree.findall('.//FirstName'):
        writer.writerow([node.text])
    for node in tree.findall('.//LastName'):
        writer.writerow([node.text])
I need the data to look like this in the CSV, along with other data points I'll choose later. What am I doing wrong?
ID FirstName LastName
92304823A-2932 John Smith
Thanks in advance.
Answer 0 (score: 1)
This is essentially how to gather the data.
>>> from xml.etree import ElementTree
>>> tree = ElementTree.parse('api.xml')
>>> tree.findall('.//Item')
[<Element 'Item' at 0x0000000006679EA8>, <Element 'Item' at 0x0000000006681318>]
>>> for item in tree.findall('.//Item'):
...     item.find('ID').text, item.find('FirstName').text, item.find('LastName').text
...
('92304823A-2932', 'John', 'Smith')
('92304823A-2932', 'John', 'Smith')
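By contrast, when you use a construct like tree.findall('.//ID'), you are asking the XPath engine to start at tree (that's the '.' part) and look downward for every occurrence of 'ID' at once. With your sample XML that gives you a flat list of two ID elements, with nothing linking each one to the FirstName and LastName of the same Item. And because each writerow([node.text]) call writes a one-cell row, every value ends up in the first column. What you need to do is first find all the Item entries, then find the three pieces of data of interest within each Item.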
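To make the difference concrete, here is a minimal sketch comparing the two searches. The inline XML is a trimmed-down, hypothetical stand-in for the real response (the IDs and the second person are made up), and it uses findtext as a shorthand for find(...).text:

```python
from xml.etree import ElementTree

# Hypothetical, trimmed-down stand-in for the API response in the question.
SAMPLE = """<YourMembership_Response>
  <Items>
    <Item><ID>A-1</ID><FirstName>John</FirstName><LastName>Smith</LastName></Item>
    <Item><ID>B-2</ID><FirstName>Jane</FirstName><LastName>Doe</LastName></Item>
  </Items>
</YourMembership_Response>"""

root = ElementTree.fromstring(SAMPLE)

# Flat search: one list per tag, with no link back to the owning Item.
flat_ids = [node.text for node in root.findall('.//ID')]

# Per-Item search: each tuple holds the fields of exactly one record.
rows = [
    (item.findtext('ID'), item.findtext('FirstName'), item.findtext('LastName'))
    for item in root.findall('.//Item')
]

print(flat_ids)
print(rows)
```

Each tuple in rows maps directly onto one CSV row, which is what the flat per-tag lists cannot give you without stitching them back together.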
Addendum:
>>> import csv
>>> with open('api.csv', 'w', newline='') as csvfile:
...     fieldnames = ['ID', 'FirstName', 'LastName']
...     writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
...     writer.writeheader()
...     for item in tree.findall('.//Item'):
...         writer.writerow({
...             'ID': item.find('ID').text,
...             'FirstName': item.find('FirstName').text,
...             'LastName': item.find('LastName').text})
The resulting output file:
ID,FirstName,LastName
92304823A-2932,John,Smith
92304823A-2932,John,Smith
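
Since you mentioned adding other data points later, one possible way to generalize this is to drive the columns from a list of tag names. This is only a sketch: items_to_csv is a hypothetical helper name, the inline XML is a cut-down stand-in for your response, and it writes to an in-memory buffer so it can run without touching the disk. It uses findtext with a default, because item.find(name).text raises AttributeError when a tag is missing from an Item altogether.

```python
import csv
import io
from xml.etree import ElementTree


def items_to_csv(root, fieldnames, out):
    """Write one CSV row per <Item>, one column per name in fieldnames."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    for item in root.findall('.//Item'):
        # findtext gives '' for an empty element and the default for a missing one.
        writer.writerow({name: item.findtext(name, default='') for name in fieldnames})


# Hypothetical, cut-down stand-in for the API response.
SAMPLE = """<Items>
  <Item><ID>92304823A-2932</ID><FirstName>John</FirstName><MiddleName></MiddleName><LastName>Smith</LastName></Item>
</Items>"""

buffer = io.StringIO()
items_to_csv(ElementTree.fromstring(SAMPLE),
             ['ID', 'FirstName', 'MiddleName', 'LastName'], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file on Python 3, pass open('api.csv', 'w', newline='') as the out argument instead of the buffer, as in the addendum above.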