Question

I have this sample XML:

<pathway>
    <relation entry1="62" entry2="64" type="PPrel">
        <subtype name="activation" value="--&gt;"/>
    </relation>
    <relation entry1="54" entry2="55" type="PPrel">
        <subtype name="activation" value="--&gt;"/>
        <subtype name="phosphorylation" value="+p"/>
    </relation>
    <relation entry1="55" entry2="82" type="PPrel">
        <subtype name="activation" value="--&gt;"/>
        <subtype name="phosphorylation" value="+p"/>
    </relation>
</pathway>

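I'm trying to collect the subtypes into a list, but if a relation has more than one subtype, they should be combined into a single string.

Example output: ['activation', 'activation;phosphorylation', 'activation;phosphorylation']

My current code is: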

import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()
relation = []
for son in root:
    for step_son in son:
        if len(son.getchildren()) > 1:
            relation.append(step_son.get('name'))
        if len(son.getchildren()) < 2:
            relation.append(step_son.get('name'))

My output for relation is:

['activation', 'activation', 'phosphorylation', 'activation', 'phosphorylation']

Any help would be great, thanks!

Answer 1

Use findall and iterate over each matching element, joining the subtype names of each relation:

In [35]: from xml.etree import ElementTree
In [36]: xml_string = """
    ...: <pathway>
    ...:     <relation entry1="62" entry2="64" type="PPrel">
    ...:         <subtype name="activation" value="--&gt;"/>
    ...:     </relation>
    ...:     <relation entry1="54" entry2="55" type="PPrel">
    ...:         <subtype name="activation" value="--&gt;"/>
    ...:         <subtype name="phosphorylation" value="+p"/>
    ...:     </relation>
    ...:     <relation entry1="55" entry2="82" type="PPrel">
    ...:         <subtype name="activation" value="--&gt;"/>
    ...:         <subtype name="phosphorylation" value="+p"/>
    ...:     </relation>
    ...: </pathway>"""

In [37]: p_element = ElementTree.fromstring(xml_string)

In [38]: result = []

In [39]: for relation in p_element.findall('.//relation'):
    ...:    result.append(';'.join(x.attrib['name'] for x in relation.findall('.//subtype')))
    ...:

In [40]: result
Out[40]: ['activation', 'activation;phosphorylation', 'activation;phosphorylation']
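
The loop in the question appends one item per subtype in both branches, so the length check has no effect and the list comes out flat. Joining has to happen once per relation instead. Here is the same idea as a standalone script rather than an IPython session. It is only a sketch: the helper name joined_subtypes and its sep parameter are made up for illustration, and the XML is shortened to two relations. Note also that getchildren() was removed in Python 3.9, so this version avoids it.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample XML from the question (two relations only).
XML_STRING = """
<pathway>
    <relation entry1="62" entry2="64" type="PPrel">
        <subtype name="activation" value="--&gt;"/>
    </relation>
    <relation entry1="54" entry2="55" type="PPrel">
        <subtype name="activation" value="--&gt;"/>
        <subtype name="phosphorylation" value="+p"/>
    </relation>
</pathway>
"""


def joined_subtypes(root, sep=";"):
    """Return one string per <relation>, with its subtype names joined by sep.

    Hypothetical helper, not part of ElementTree.
    """
    return [
        # findall("subtype") only looks at direct children of the relation.
        sep.join(sub.get("name", "") for sub in rel.findall("subtype"))
        for rel in root.iter("relation")
    ]


root = ET.fromstring(XML_STRING)
result = joined_subtypes(root)
print(result)
```

To read from a file as in the question, replace ET.fromstring(XML_STRING) with ET.parse('file.xml').getroot().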

How to make a list with child elements joined (Python ElementTree)
