TypeError: list indices must be integers, not str with xmltodict

Asked: 2015-07-31 12:06:32

Tags: python xmltodict

I have this XML file:

<?xml version="1.0"?>
<toolbox tool_path="/galaxy/main/shed_tools">
<section id="snpeff" name="snpEff" version="">
  <tool file="toolshed.g2.bx.psu.edu/repos/pcingola/snpeff/c052639fa666/snpeff/snpEff_2_1a/snpEff_2_1a/galaxy/snpSift_filter.xml" guid="toolshed.g2.bx.psu.edu/repos/pcingola/snpeff/snpSift_filter/1.0">
      <tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
        <repository_name>snpeff</repository_name>
        <repository_owner>pcingola</repository_owner>
        <installed_changeset_revision>c052639fa666</installed_changeset_revision>
        <id>toolshed.g2.bx.psu.edu/repos/pcingola/snpeff/snpSift_filter/1.0</id>
        <version>1.0</version>
    </tool>
    <tool file="toolshed.g2.bx.psu.edu/repos/pcingola/snpeff/c052639fa666/snpeff/snpEff_2_1a/snpEff_2_1a/galaxy/snpEff.xml" guid="toolshed.g2.bx.psu.edu/repos/pcingola/snpeff/snpEff/1.0">
      <tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
        <repository_name>snpeff</repository_name>
        <repository_owner>pcingola</repository_owner>
        <installed_changeset_revision>c052639fa666</installed_changeset_revision>
        <id>toolshed.g2.bx.psu.edu/repos/pcingola/snpeff/snpEff/1.0</id>
        <version>1.0</version>
    </tool>
    <tool file="toolshed.g2.bx.psu.edu/repos/gregory-minevich/check_snpeff_candidates/22c8c4f8d11c/check_snpeff_candidates/checkSnpEffCandidates.xml" guid="toolshed.g2.bx.psu.edu/repos/gregory-minevich/check_snpeff_candidates/check_snpeff_candidates/1.0.0">
      <tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
        <repository_name>check_snpeff_candidates</repository_name>
        <repository_owner>gregory-minevich</repository_owner>
        <installed_changeset_revision>22c8c4f8d11c</installed_changeset_revision>
        <id>toolshed.g2.bx.psu.edu/repos/gregory-minevich/check_snpeff_candidates/check_snpeff_candidates/1.0.0</id>
        <version>1.0.0</version>
    </tool>
</section>
...

I am trying to parse the file above like this:

import xmltodict

# wget -c https://raw.githubusercontent.com/galaxyproject/usegalaxy-playbook/c55aa042825fe02ef4a02d958eb811adba8ea45f/files/galaxy/usegalaxy.org/var/shed_tool_conf.xml

if __name__ == '__main__':

    with open('tests/shed_tool_conf.xml') as fd:
        doc = xmltodict.parse(fd.read())
        tools_section = doc['toolbox']['section']['@name']
        print tools_section

However, I get the following error:

Traceback (most recent call last):
  File "importTools2Galaxy.py", line 15, in <module>
    tools_section = doc['toolbox']['section']['@name']
TypeError: list indices must be integers, not str

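What am I doing wrong?

2 Answers:

Answer 0 (score: 3)

This happens because doc['toolbox']['section'] returns a list of sections, so you need to iterate over the sections to get each @name value. You may also want to check whether a given section has @name at all. For that, use .get('@name') rather than ['@name']: it returns None instead of raising KeyError when the attribute is missing.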

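with open('tests/shed_tool_conf.xml') as fd:
    doc = xmltodict.parse(fd.read())

# doc['toolbox']['section'] is a list, so handle one section at a time
for section in doc['toolbox']['section']:
    tools_section = section.get('@name')  # None if the attribute is missing
    print(tools_section)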
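
One thing to watch for: xmltodict returns a list for section only when the element repeats. With a single <section> it returns a dict, and a for loop over it would then walk that dict's keys instead of the sections. Below is a minimal sketch of a helper that handles both cases. The names as_list and section_names are hypothetical, and the sample dicts are hand-written to mimic the shape of xmltodict's output; they are not produced by parsing the real file.

```python
def as_list(value):
    """Wrap a lone item in a list; pass lists through; map None to []."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def section_names(doc):
    """Collect the name attribute of every <section> under <toolbox>."""
    sections = as_list(doc['toolbox'].get('section'))
    # .get('@name') yields None for a section without a name attribute.
    return [section.get('@name') for section in sections]


# Hand-written stand-ins for xmltodict.parse(...) results (assumed shapes).
many = {'toolbox': {'section': [{'@id': 'snpeff', '@name': 'snpEff'},
                                {'@id': 'other'}]}}
one = {'toolbox': {'section': {'@id': 'snpeff', '@name': 'snpEff'}}}

print(section_names(many))
print(section_names(one))
```

With a real file you would pass the result of xmltodict.parse(fd.read()) as doc. Newer xmltodict releases also accept a force_list argument to parse, which is another way to get a list consistently; check the version you have installed before relying on it.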

Answer 1 (score: 2)

Your XML has several section elements, so you should do something like:

tools_section = doc['toolbox']['section'][0]

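where 0 is the index of the section you want to read. If the index is not fixed, you can iterate over the sections with for section in doc['toolbox']['section']: ... and stop at the one whose content matches your criteria, or simply do something with each section.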
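
The "stop at the section that matches" idea can be sketched with next() over a generator expression. The helper name find_section and the sample data are hypothetical; the list only mimics what doc['toolbox']['section'] looks like when the file contains several sections.

```python
def find_section(sections, section_id):
    """Return the first section dict whose id attribute matches, else None."""
    return next((s for s in sections if s.get('@id') == section_id), None)


# Hand-written stand-in for doc['toolbox']['section'] (assumed shape).
sections = [
    {'@id': 'snpeff', '@name': 'snpEff'},
    {'@id': 'example', '@name': 'Example'},  # hypothetical second section
]

match = find_section(sections, 'snpeff')
print(match['@name'] if match else 'not found')
```

Returning None for "no match" keeps the caller from having to catch an exception, at the cost of one explicit check before indexing into the result.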