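I am trying to parse XML using ElementTree, and I can't figure out how to deal with empty tags such as <tag/>. If the tag doesn't exist at all, .find() returns None and all is well. For <tag/>, however, .find() returns an element, so the subsequent attempt to use its text fails with the error:
TypeError: must be str, not NoneType
A failing example is below. It fails on the line <tl><mpa/></tl>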
from xml.etree import ElementTree

def getStuff(xml_message):
    message_tree = ElementTree.fromstring(xml_message)
    ns = {'a': 'http://www.example.org/a',
          'b': 'http://www.example.org/b'}
    tls = message_tree.findall('.//b:tl', namespaces = ns)
    result, i = (0,)*2
    for tl in tls:
        i += 1
        print("Item: " + str(i))
        mpa = tl.find("b:mpa", namespaces = ns)
        if mpa is None:
            result = result + 0
            print(" |--> Is None, assigned 0.")
        else:
            print(" |--> Is Something")
            # This is where things go terribly wrong
            print(" |--> Tag Value: " + mpa.text)
            result = result + int(mpa.text)
    return result

instr = """<?xml version="1.0" standalone='no'?>
<ncr xmlns="http://www.example.org/a">
    <x xmlns="http://www.example.org/b">
        <tl><ec code="N">e1</ec></tl>
        <tl><mpa>0010</mpa></tl>
        <tl><mpa/></tl>
    </x>
</ncr>
"""
getStuff(instr)
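Answer 0 (score: 1)
With the empty tag <mpa/>, your mpa variable is a valid node and therefore not None, but mpa.text is None because the element contains no text. String concatenation only works between two strings, so your attempt to concatenate the string " |--> Tag Value: " with None fails. Instead, you can use the formatting operator, which formats None as 'None', and add a condition on the following line so that mpa.text is not converted to an integer when it is None: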
print(" |--> Tag Value: %s" % mpa.text)
if mpa.text is not None:
    result = result + int(mpa.text)
With the above changes, the output becomes:
Item: 1
|--> Is None, assigned 0.
Item: 2
|--> Is Something
|--> Tag Value: 0010
Item: 3
|--> Is Something
|--> Tag Value: None
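As an alternative to checking mpa.text by hand, the missing and empty cases can be folded together with findtext(), which returns the supplied default when the element is absent and an empty string when the element exists but has no text. The following is only a sketch under that approach, not the original poster's code: the function name sum_mpa, the constant NS, and the variable sample are made up for illustration, and it assumes that a missing or empty mpa should count as 0.

```python
from xml.etree import ElementTree

# Namespace prefixes taken from the question; NS is a hypothetical name.
NS = {"a": "http://www.example.org/a", "b": "http://www.example.org/b"}


def sum_mpa(xml_message):
    """Sum the integer values of all b:mpa elements (hypothetical helper).

    Assumption: a missing or empty <mpa> contributes 0 to the total.
    """
    root = ElementTree.fromstring(xml_message)
    total = 0
    for tl in root.findall(".//b:tl", namespaces=NS):
        # findtext() gives the default ("") when <mpa> is absent,
        # and "" when the element is present but empty (<mpa/>).
        text = tl.findtext("b:mpa", default="", namespaces=NS).strip()
        if text:
            total += int(text)
    return total


sample = """<?xml version="1.0" standalone='no'?>
<ncr xmlns="http://www.example.org/a">
    <x xmlns="http://www.example.org/b">
        <tl><ec code="N">e1</ec></tl>
        <tl><mpa>0010</mpa></tl>
        <tl><mpa/></tl>
    </x>
</ncr>
"""

print(sum_mpa(sample))
```

With the sample document from the question, only the second tl contributes a value, so the call prints 10. If you need to tell a missing tag apart from an empty one, the find() plus "is not None" check shown above remains the better fit.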