Question

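I'm trying to parse XML with ElementTree, and I can't figure out how to handle empty tags such as <tag/>. If the tag is absent altogether, .find() returns None and everything works. For <tag/>, however, .find() returns an element, so the subsequent attempt to use its text fails with this error: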

TypeError: must be str, not NoneType

A failing example is below. It fails on the line <tl><mpa/></tl>.

from xml.etree import ElementTree

def getStuff(xml_message):
    message_tree = ElementTree.fromstring(xml_message)
    ns = {'a': 'http://www.example.org/a',
          'b': 'http://www.example.org/b'}          
    tls = message_tree.findall('.//b:tl', namespaces = ns)

    result, i = (0,)*2

    for tl in tls:
        i += 1     
        print("Item: " + str(i))
        mpa = tl.find("b:mpa", namespaces = ns)
        if mpa is None:
            result = result + 0
            print(" |--> Is None, assigned 0.")
        else:
            print(" |--> Is Something")
            # This is where things go terribly wrong
            print(" |--> Tag Value: " + mpa.text)
            result = result  + int(mpa.text)    
    return result

instr = """<?xml version="1.0" standalone='no'?>
<ncr xmlns="http://www.example.org/a">
  <x xmlns="http://www.example.org/b">
      <tl><ec code="N">e1</ec></tl>
      <tl><mpa>0010</mpa></tl>
      <tl><mpa/></tl>
  </x>
</ncr>
"""
getStuff(instr)

Answer 1

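With the empty tag <mpa/>, your mpa variable is a valid node and therefore not None, but mpa.text is None because the element contains no text. Concatenation with + works only between two strings, so the attempt to join " |--> Tag Value: " with None fails. Instead, you can use the formatting operator, which renders None as 'None', and add a condition on the following line so that mpa.text is converted to an integer only when it is not None: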

print(" |--> Tag Value: %s" % mpa.text)
if mpa.text is not None:
    result = result  + int(mpa.text)

With these changes, the output becomes:

Item: 1
 |--> Is None, assigned 0.
Item: 2
 |--> Is Something
 |--> Tag Value: 0010
Item: 3
 |--> Is Something
 |--> Tag Value: None
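
If the missing-tag and empty-tag cases should both count as 0, the two checks can be folded into one small helper. The following is only a sketch under that assumption: the names int_or_default, sum_mpa, NS, and SAMPLE are hypothetical and not part of the original code, and treating whitespace-only text as empty is an extra choice made here.

```python
from xml.etree import ElementTree

# Namespace map from the question; only the 'b' prefix is needed here.
NS = {'b': 'http://www.example.org/b'}

def int_or_default(element, default=0):
    """Return int(element.text), or default for a missing or empty element."""
    if element is None:          # tag absent: find() returned None
        return default
    text = (element.text or '').strip()   # <mpa/> has text None
    return int(text) if text else default

def sum_mpa(xml_message):
    """Sum the integer values of all b:mpa children of b:tl elements."""
    root = ElementTree.fromstring(xml_message)
    return sum(int_or_default(tl.find('b:mpa', namespaces=NS))
               for tl in root.findall('.//b:tl', namespaces=NS))

# Sample data modelled on the question's input.
SAMPLE = """<ncr xmlns="http://www.example.org/a">
  <x xmlns="http://www.example.org/b">
      <tl><ec code="N">e1</ec></tl>
      <tl><mpa>0010</mpa></tl>
      <tl><mpa/></tl>
  </x>
</ncr>
"""

if __name__ == '__main__':
    print(sum_mpa(SAMPLE))
```

The helper compares against None explicitly instead of writing "if element:", because an Element with no children evaluates as false even when it was found.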

Parsing empty tags with ElementTree
