Parsing an XML file from the ENA archive with Python

Time: 2017-04-23 10:20:23

Tags: xml-parsing python-2.6

This is part of an XML file from ENA; the full file contains several ROOT elements:

<?xml version="1.0" encoding="UTF-8"?>
<ROOT request="Taxon:5671&amp;display=xml">
    <taxon scientificName="Leishmania infantum" taxId="5671" parentTaxId="38574" rank="species" hidden="true" taxonomicDivision="INV" geneticCode="1" mitochondrialGeneticCode="4" plastIdGeneticCode="11">
        <lineage>
            <taxon scientificName="Leishmania donovani species complex" taxId="38574" rank="species group" hidden="true"></taxon>
            <taxon scientificName="Leishmania" taxId="38568" rank="subgenus" hidden="true"></taxon>
            <taxon scientificName="Leishmania" taxId="5658" rank="genus" hidden="false"></taxon>
            <taxon scientificName="Leishmaniinae" taxId="1286322" rank="subfamily" hidden="false"></taxon>
            <taxon scientificName="Trypanosomatidae" taxId="5654" rank="family" hidden="false"></taxon>
            <taxon scientificName="Kinetoplastida" commonName="kinetoplasts" taxId="5653" rank="order" hidden="false"></taxon>
            <taxon scientificName="Euglenozoa" taxId="33682" hidden="false"></taxon>
            <taxon scientificName="Eukaryota" commonName="eucaryotes" taxId="2759" rank="superkingdom" hidden="false"></taxon>
            <taxon scientificName="cellular organisms" taxId="131567" hidden="true"></taxon>
            <taxon scientificName="root" taxId="1" hidden="true"></taxon>
        </lineage>
        <children>
            <taxon scientificName="Leishmania infantum JPCM5" taxId="435258">    </taxon>
        </children>
        <synonym type="synonym" name="Leishmania (Leishmania) infantum"></synonym>
        <synonym type="synonym" name="Leishmania donovani infantum"></synonym>
    </taxon>
</ROOT>

I parse it in Python 2.6 as follows:

import xml.etree.ElementTree as ET
tree = ET.parse('parsing_ena.xml')

I can get all the taxon names in the lineage of the first child using:

root = tree.getroot()
taxa = root.findall("./ROOT/taxon")
first_taxa = [x.attrib["scientificName"] for x in taxa[0].findall("./lineage/taxon")]  # index 0 is the first taxon

How can I iterate over all the children in the XML file?
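
For illustration, here is a minimal sketch of one way to loop over every top-level taxon instead of indexing a single one. It assumes the ROOT elements sit inside a single wrapper element, as the "./ROOT/taxon" path implies. The wrapper name `ROOTS`, the shortened sample data, and the helper `lineages_by_taxon` are all hypothetical and not taken from the real ENA file. Only `findall` is used, so it should also run on Python 2.6.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample: a wrapper element (named "ROOTS" here as an
# assumption) holding several ROOT elements, as the findall path implies.
SAMPLE = """<ROOTS>
<ROOT request="Taxon:5671&amp;display=xml">
  <taxon scientificName="Leishmania infantum" taxId="5671">
    <lineage>
      <taxon scientificName="Leishmania" taxId="5658" rank="genus"></taxon>
      <taxon scientificName="Eukaryota" taxId="2759" rank="superkingdom"></taxon>
    </lineage>
  </taxon>
</ROOT>
<ROOT request="Taxon:5658&amp;display=xml">
  <taxon scientificName="Leishmania" taxId="5658">
    <lineage>
      <taxon scientificName="Trypanosomatidae" taxId="5654" rank="family"></taxon>
    </lineage>
  </taxon>
</ROOT>
</ROOTS>"""


def lineages_by_taxon(root):
    """Map each top-level taxon's scientificName to its lineage names."""
    result = {}
    # "./ROOT/taxon" selects the taxon directly under every ROOT child.
    for taxon in root.findall("./ROOT/taxon"):
        names = [t.attrib["scientificName"]
                 for t in taxon.findall("./lineage/taxon")]
        result[taxon.attrib["scientificName"]] = names
    return result


root = ET.fromstring(SAMPLE)
lineages = lineages_by_taxon(root)
```

With the real file, `root` would come from `ET.parse('parsing_ena.xml').getroot()` rather than `ET.fromstring`. Keying the dictionary by `scientificName` is only a convenience for this sketch; `taxId` may be a safer key if names can repeat.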

0 answers:

No answers yet