Retrieving data from XML

Time: 2016-07-17 10:56:49

Tags: python xml python-3.x xmltodict

I have an XML file from which I need to retrieve data. Below is my XML document.

<orcid-message>
   <orcid-profile type="user">
      <orcid-activities>
         <orcid-works>
            <orcid-work put-code="23938140" visibility="public">
               <work-contributors>
                  <contributor>
                       <credit-name visibility="public">Tania Maes</credit-name>
                  <contributor>
                       <credit-name visibility="public">Francisco Avila Cobos</credit-name>
                  <contributor>
                       <credit-name visibility="public">Franco Liala Manus</credit-name>

I want to retrieve the contributor names. So far I have tried:

contributors_name = (doc['orcid-message']['orcid-profile']
                        ['orcid-activities']['orcid-works']
                        ['orcid-work']['work-contributors']
                        ['contributor']['credit-name']  )

print(contributors_name)

Please tell me where I am going wrong. Thanks.

1 answer:

Answer 0 (score: 0)

  

"TypeError: list indices must be integers, not str: I am getting this error"

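The error message indicates that the XML contains multiple contributor elements, so your code up to the ['contributor'] part returns a list, and a list cannot be indexed by a key (that is, ['credit-name']) the way a dictionary can. You need to pick the list item whose credit-name you want, for example the first one: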

contributors = doc['orcid-message']['orcid-profile'] \
    ['orcid-activities']['orcid-works'] \
    ['orcid-work']['work-contributors'] \
    ['contributor']
# credit-name carries a visibility attribute, so its text is stored under '#text'
contributor_name = contributors[0]['credit-name']['#text']

Or you can use a list comprehension to get the credit-name of every contributor:

contributors_name = [contrib['credit-name']['#text'] for contrib in contributors]
print(contributors_name)

Output:

[u'Tania Maes', u'Francisco Avila Cobos', u'Franco Liala Manus']
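
One thing to watch for with this approach: a repeated element comes back as a list, but a work with only one contributor would most likely come back as a single dictionary, and the list comprehension would then fail. The sketch below is one way to handle both cases. It does not call xmltodict. The two dictionaries are hand-built stand-ins for the structure the parser is assumed to return, and the helper names as_list and credit_names are made up for illustration.

```python
# Hand-built stand-ins for what xmltodict.parse() is assumed to return for the
# <work-contributors> element; they are not the result of a real parse.
work_contributors_many = {
    'contributor': [
        {'credit-name': {'@visibility': 'public', '#text': 'Tania Maes'}},
        {'credit-name': {'@visibility': 'public', '#text': 'Francisco Avila Cobos'}},
        {'credit-name': {'@visibility': 'public', '#text': 'Franco Liala Manus'}},
    ]
}
work_contributors_one = {
    'contributor': {'credit-name': {'@visibility': 'public', '#text': 'Tania Maes'}}
}


def as_list(value):
    """Wrap a lone item in a list so single and repeated elements iterate alike."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def credit_names(work_contributors):
    """Return the text of every credit-name under a work-contributors mapping."""
    names = []
    for contrib in as_list(work_contributors.get('contributor')):
        credit = contrib['credit-name']
        # With attributes present the text sits under '#text'; without
        # attributes the value is assumed to be a plain string.
        names.append(credit['#text'] if isinstance(credit, dict) else credit)
    return names


print(credit_names(work_contributors_many))
print(credit_names(work_contributors_one))
```

If your installed xmltodict supports the force_list argument to parse(), passing something like force_list=('contributor',) should make that element always a list, which would make the as_list helper unnecessary.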