How to handle missing elements in XML to CSV conversion?

Time: 2019-01-18 00:58:21

Tags: python xml csv

In the XML below, "currentAddress" is an optional element. My Python code works fine where the "currentAddress" element is present and raises an error where it is missing.

XML:

<?xml version = '1.0' encoding = 'UTF-8'?>
<ns2:exportEmpData xmlns:ns2="http://webservice.example.com/">
<emplist>
  <empId>6029</empId>
  <fullName>Justin Clark</fullName>
  <currentAddress houseNumber="14" street="Lepanto" city="Barcelona"/>
</emplist>
<emplist>
  <empId>6078</empId>
  <fullName>Jose Domingo</fullName>
</emplist>
</ns2:exportEmpData>

My Python code:

import xml
import csv
import xml.etree.ElementTree as ET

tree = ET.parse('C:/emp/emplist.xml')
root = tree.getroot()

# open a file for writing

Emp_data = open('C:/emp/emplist.csv', 'wb')

# create the csv writer object

csvwriter = csv.writer(Emp_data)
emp_head = []

count = 0
for member in root.findall('emplist'):
    emp_nodes = []
    if count == 0:
        empId = member.find('empId').tag
        emp_head.append(empId)
        fullName = member.find('fullName').tag
        emp_head.append(fullName)
        currentAddress = member.find('currentAddress').tag
        emp_head.append(currentAddress)
        csvwriter.writerow(emp_head)
        count = count + 1

    empId = member.find('empId').text
    emp_nodes.append(empId)
    fullName = member.find('fullName').text
    emp_nodes.append(fullName)
    currentAddress = member.find('currentAddress').attrib.get('city')
    emp_nodes.append(currentAddress)
    csvwriter.writerow(emp_nodes)
Emp_data.close()

Error message:

AttributeError: 'NoneType' object has no attribute 'attrib'

If the "currentAddress" element is not available for an employee, I would like to write a string instead (for example: "Unknown").

2 answers:

Answer 0 (score: 0)

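One "Pythonic" way to handle this is to use try/except, as shown below. This covers the case where the "currentAddress" element is absent: member.find('currentAddress') returns None, and accessing .attrib on None raises the AttributeError that is caught. If the element is present but has no city attribute, attrib.get('city') returns None without raising, so that case would need a default passed to get.

Note that I also removed the code in the for loop that handled the count variable, since it is unnecessary; I see no reason to treat the first record differently. If it is needed, however, the code that handles it would have to do something similar.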

import csv
import xml
import xml.etree.ElementTree as ET


xml_filename = 'emplist.xml'
csv_filename = 'emplist.csv'

tree = ET.parse(xml_filename)
root = tree.getroot()

with open(csv_filename, 'w', newline='') as Emp_data:
    csvwriter = csv.writer(Emp_data)

    emp_head = []
    for member in root.findall('emplist'):
        emp_nodes = []

        empId = member.find('empId').text
        emp_nodes.append(empId)
        fullName = member.find('fullName').text
        emp_nodes.append(fullName)

        try:
            currentAddress = member.find('currentAddress').attrib.get('city')
        except AttributeError:
            currentAddress = 'Unknown'

        emp_nodes.append(currentAddress)
        csvwriter.writerow(emp_nodes)
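
If the header row from the question is still needed, one option is to write a fixed list of column names before the loop instead of reading tags from the first record, which would fail in the same way whenever that record lacks "currentAddress". The following is only a sketch under assumptions: the function name write_emp_csv is hypothetical, the XML is the question's sample inlined as a string (declaration omitted), and io.StringIO stands in for the output file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample data copied from the question (XML declaration omitted).
SAMPLE_XML = """<ns2:exportEmpData xmlns:ns2="http://webservice.example.com/">
<emplist>
  <empId>6029</empId>
  <fullName>Justin Clark</fullName>
  <currentAddress houseNumber="14" street="Lepanto" city="Barcelona"/>
</emplist>
<emplist>
  <empId>6078</empId>
  <fullName>Jose Domingo</fullName>
</emplist>
</ns2:exportEmpData>
"""


def write_emp_csv(xml_text, out_file):
    """Write a header row plus one row per <emplist> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out_file)
    # Fixed header: does not depend on which elements the first record has.
    writer.writerow(['empId', 'fullName', 'currentAddress'])
    for member in root.findall('emplist'):
        try:
            city = member.find('currentAddress').attrib.get('city')
        except AttributeError:
            # find() returned None because <currentAddress> is absent.
            city = 'Unknown'
        writer.writerow([member.find('empId').text,
                         member.find('fullName').text,
                         city])


buffer = io.StringIO()
write_emp_csv(SAMPLE_XML, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With a real file, the same function could be passed the handle from open(csv_filename, 'w', newline='') in place of the in-memory buffer.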

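Answer 1 (score: -1)

One way to do this is to first check whether member.find('currentAddress') returns None. If it does, simply set the city to 'Unknown'. If it does not, use address_tag.attrib.get('city') to extract the city. You can also check whether 'city' is one of the existing attributes.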

from xml.etree import ElementTree

myxml = """<?xml version = '1.0' encoding = 'UTF-8'?>
<ns2:exportEmpData xmlns:ns2="http://webservice.example.com/">
<emplist>
  <empId>6029</empId>
  <fullName>Justin Clark</fullName>
  <currentAddress houseNumber="14" street="Lepanto" city="Barcelona"/>
</emplist>
<emplist>
  <empId>6078</empId>
  <fullName>Jose Domingo</fullName>
</emplist>
</ns2:exportEmpData>
"""

tree = ElementTree.ElementTree(ElementTree.fromstring(myxml))

for member in tree.findall('emplist'):
    city = 'Unknown' # Default value if we don't find a city
    address_tag = member.find('currentAddress')
    if address_tag is not None:
        city = address_tag.attrib.get('city')
    print("City is %s" % city)
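
The attribute check mentioned above can be folded into the same lookup. This is a minimal sketch, not part of the original answer: the helper name city_of is hypothetical, and the sample records (including one with a "currentAddress" that has no city) are invented for illustration. It relies on Element.get accepting a default value, which is returned when the attribute is missing.

```python
import xml.etree.ElementTree as ET


def city_of(member, default='Unknown'):
    """Return the city for one <emplist> element, or default (hypothetical helper)."""
    address_tag = member.find('currentAddress')
    if address_tag is None:
        # The optional <currentAddress> element is absent.
        return default
    # Element.get returns default when the 'city' attribute is missing.
    return address_tag.get('city', default)


# Invented sample: address with city, address without city, no address.
sample = ET.fromstring("""<root>
<emplist><empId>1</empId><currentAddress city="Barcelona"/></emplist>
<emplist><empId>2</empId><currentAddress street="Lepanto"/></emplist>
<emplist><empId>3</empId></emplist>
</root>""")

cities = [city_of(member) for member in sample.findall('emplist')]
print(cities)
```

Passing the default to get keeps both "missing" cases on the same fallback value, so the CSV column never ends up with an empty field for a record that lacks a city.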