Question

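I'm currently trying to parse an XML file, and the code below works except for one problem. The full_names list is built from the only tag that does not always appear in every subtree: user_names, profiles, and authentication_types each return, say, 30 items, while full_names returns 27.

As a result, creating the dataframe fails with an error because the lists have different lengths.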

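# Assumed imports: .xpath(..., namespaces=...) is lxml's API, not the standard library's
from lxml import etree as ET
import pandas as pd

xmltree = ET.parse(xml)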
namespaces = {'ns5':'urn:swift:saa:xsd:operator',
              'profile':'urn:swift:saa:xsd:operatorprofile'
             }

user_names = xmltree.xpath('//ns5:Identifier/ns5:Name/text()', namespaces=namespaces)
full_names = xmltree.xpath('//ns5:Description/text()', namespaces=namespaces)
profiles = xmltree.xpath('//profile:Name/text()', namespaces=namespaces)
authentication_types = xmltree.xpath('//ns5:AuthenticationType/text()', namespaces=namespaces)

xml = pd.DataFrame({'User_Name': user_names, 'Full_Name': full_names,
                    'Profile': profiles, 'Authentication_Type': authentication_types})

Is there a way to create the dataframe with a null (or blank) value in the Full_Name column wherever the tag is missing?
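To make the mismatch concrete, here is a minimal sketch on a hypothetical three-operator document (the element layout and the sample values are assumptions, not taken from the real file). It uses the standard library's ElementTree rather than lxml so it runs without extra packages. A document-wide search returns one shorter list, and the missing entry cannot be located afterwards, so the remaining values shift onto the wrong user.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second Operator has no ns5:Description.
SAMPLE = """\
<impex:Operators xmlns:impex="urn:swift:saa:xsd:impex:operator"
                 xmlns:ns5="urn:swift:saa:xsd:operator">
  <impex:Operator>
    <ns5:Identifier><ns5:Name>alice</ns5:Name></ns5:Identifier>
    <ns5:Description>Alice Example</ns5:Description>
  </impex:Operator>
  <impex:Operator>
    <ns5:Identifier><ns5:Name>bob</ns5:Name></ns5:Identifier>
  </impex:Operator>
  <impex:Operator>
    <ns5:Identifier><ns5:Name>carol</ns5:Name></ns5:Identifier>
    <ns5:Description>Carol Example</ns5:Description>
  </impex:Operator>
</impex:Operators>
"""

ns = {'ns5': 'urn:swift:saa:xsd:operator'}
root = ET.fromstring(SAMPLE)

# Document-wide searches: each list is built independently of the others.
user_names = [e.text for e in root.findall('.//ns5:Identifier/ns5:Name', ns)]
full_names = [e.text for e in root.findall('.//ns5:Description', ns)]

print(len(user_names), len(full_names))   # 3 2
# Pairing by position silently attaches Carol's description to bob.
print(list(zip(user_names, full_names)))
```

Because the lists carry no record of which operator each value came from, padding the short list at the end would not restore the correct alignment.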

Answer 1

Try this. I haven't tested it against your code, since there isn't enough to reproduce it.

full_names = xmltree.xpath('//ns5:Description/text()', namespaces=namespaces) if xmltree.xpath('//ns5:Description/text()', namespaces=namespaces) else ""

Or the same code on multiple lines:

full_names = xmltree.xpath('//ns5:Description/text()', namespaces=namespaces)
if not full_names:
    full_names = ""

Answer 2

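I was thinking about this last night and realized I need to identify the parent tag and loop over it, looking for the child tags. I changed my approach, and it now works the way I want.

For anyone who runs into the same problem, here is the workaround I used.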

root = xmltree.getroot()

user_names = []
full_names = []
profiles = []
authentication_types = []

for i in root.findall('.//{urn:swift:saa:xsd:impex:operator}Operator'):
    usernames = i.find('.//{urn:swift:saa:xsd:operator}Name')
    usernames2 = "" if usernames is None else usernames.text
    user_names.append(usernames2)

    fullnames = i.find('{urn:swift:saa:xsd:operator}Description')
    fullnames2 = "" if fullnames is None else fullnames.text
    full_names.append(fullnames2)

    user_profiles = i.find('.//{urn:swift:saa:xsd:operatorprofile}Name')
    user_profiles2 = "" if user_profiles is None else user_profiles.text
    profiles.append(user_profiles2)

    authenications = i.find('{urn:swift:saa:xsd:operator}AuthenticationType')
    authenications2 = "" if authenications is None else authenications.text
    authentication_types.append(authenications2)

# Build the DataFrame once, after the loop has filled all four lists
xml = pd.DataFrame({'User_Name': user_names,
                    'Full_Name': full_names,
                    'Profile': profiles,
                    'Authentication_Type': authentication_types})
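The same per-parent idea can be written more compactly. The following is only a sketch under assumptions: the sample document, its wrapper elements (Identifier, Profile) and its values are hypothetical, and it uses the standard library's ElementTree with findtext and a default value instead of the explicit None checks above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second Operator has no ns5:Description.
SAMPLE = """\
<impex:Operators xmlns:impex="urn:swift:saa:xsd:impex:operator"
                 xmlns:ns5="urn:swift:saa:xsd:operator"
                 xmlns:profile="urn:swift:saa:xsd:operatorprofile">
  <impex:Operator>
    <ns5:Identifier><ns5:Name>alice</ns5:Name></ns5:Identifier>
    <ns5:Description>Alice Example</ns5:Description>
    <ns5:Profile><profile:Name>Admin</profile:Name></ns5:Profile>
    <ns5:AuthenticationType>Password</ns5:AuthenticationType>
  </impex:Operator>
  <impex:Operator>
    <ns5:Identifier><ns5:Name>bob</ns5:Name></ns5:Identifier>
    <ns5:Profile><profile:Name>Reader</profile:Name></ns5:Profile>
    <ns5:AuthenticationType>LDAP</ns5:AuthenticationType>
  </impex:Operator>
</impex:Operators>
"""

ns = {
    'impex': 'urn:swift:saa:xsd:impex:operator',
    'ns5': 'urn:swift:saa:xsd:operator',
    'profile': 'urn:swift:saa:xsd:operatorprofile',
}

root = ET.fromstring(SAMPLE)

rows = []
for op in root.findall('.//impex:Operator', ns):
    # findtext returns the default ('') when the child element is absent,
    # so every operator contributes exactly one value per column.
    rows.append({
        'User_Name': op.findtext('ns5:Identifier/ns5:Name', default='', namespaces=ns),
        'Full_Name': op.findtext('ns5:Description', default='', namespaces=ns),
        'Profile': op.findtext('.//profile:Name', default='', namespaces=ns),
        'Authentication_Type': op.findtext('ns5:AuthenticationType', default='', namespaces=ns),
    })

for row in rows:
    print(row)
```

Since rows is a list of dictionaries with identical keys, passing it to pd.DataFrame(rows) should give one row per operator, with a blank Full_Name where the Description tag is missing.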

XML XPath: pass a value if an element is not found
