
Question

I have the following XML:

    <?xml version="1.0" encoding="utf-8"?>
    <autostart version="2.0">
    <FileState>0</FileState>
    <FileTemplate clsid="{6F6FBFC1-3F14-46CA-A269}">
    <properties>
        <obj name="TypeSettings" clsid="{6F6FBFC1-3F14-46CA-A269}">
            <properties>
                <prop name="Enumerator" type="8">en0</prop>
                <prop name="Name" type="8">en0</prop>
                <prop name="Type" type="3">1</prop>
            </properties>
        </obj>
        <obj name="GeneralSettings" clsid="{6F6FBFC1-3F14-46CA-A269}">
            <properties>
                <prop name="BufferSize" type="21">524288000</prop>
                <prop name="FilePattern" type="8">auto_eth0</prop>
                <prop name="FileSize" type="21">1048576</prop>
                <prop name="MaxFileAge" type="11">-1</prop>
                <prop name="MaxTotalFileSize" type="11">0</prop>
                <prop name="Name" type="8">auto-en0</prop>
                <prop name="Owner" type="8">root</prop>
            </properties>
        </obj>
    </properties>
    </FileTemplate>
    </autostart>

I want to get the values of the properties under 'GeneralSettings'. I tried the following code, but it prints nothing. Is there a parser that is easier to use?

    >>> import xml.etree.ElementTree as ET
    >>> tree = ET.parse("test.xml")
    >>> doc = tree.getroot()
    >>> 
    >>> for elem in doc.findall('autostart/FileTemplate/properties/obj/properties/prop'):
    ...     print elem.get('name="BufferSize"'), elem.text
    ... 
    >>>

Answer 1

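I'm most familiar with BeautifulSoup, which makes this simple:

    from bs4 import BeautifulSoup

    # The 'xml' parser requires lxml to be installed
    with open('test.xml') as f:
        soup = BeautifulSoup(f, 'xml')

    # Find the element whose name attribute is "GeneralSettings"
    gs = soup.find(attrs={'name': 'GeneralSettings'})
    for prop in gs.find_all('prop'):
        print(prop.text)

Output:

    524288000
    auto_eth0
    1048576
    -1
    0
    auto-en0
    root

That's not to say bs4 is the be-all and end-all of XML parsing in Python; it just has a very friendly API.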

Answer 2

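BeautifulSoup is a great fit for this!

    import bs4

    wanted = "BufferSize"

    # doc = your XML string
    with open('test.xml') as f:
        doc = f.read()

    # The 'xml' parser requires lxml to be installed
    soup = bs4.BeautifulSoup(doc, 'xml')

    # Get every prop element in the document
    props = soup.find_all('prop')
    for prop in props:
        name = prop["name"]
        if name == wanted:
            print(prop["name"], prop.text)

Let me know if you need any explanation.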

Answer 3

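Your problem is that you have autostart in the search path, but autostart is your root element, and paths are resolved relative to it. If you change that line to:

    for elem in doc.findall('FileTemplate/properties/obj[@name="GeneralSettings"]/properties/prop[@name="BufferSize"]'):
        print(elem.text)

it should work as expected.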
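To see the effect of the root element on the path, here is a minimal sketch. It assumes a trimmed-down copy of the question's XML inlined as a string (the `XML` constant and the variable names are made up for illustration), so it runs without `test.xml`:

```python
import xml.etree.ElementTree as ET

# Trimmed-down, inlined copy of the question's XML (illustrative only).
XML = """<autostart version="2.0">
  <FileTemplate>
    <properties>
      <obj name="GeneralSettings">
        <properties>
          <prop name="BufferSize" type="21">524288000</prop>
          <prop name="Owner" type="8">root</prop>
        </properties>
      </obj>
    </properties>
  </FileTemplate>
</autostart>"""

root = ET.fromstring(XML)  # root is the <autostart> element itself

# Paths are relative to root, so this looks for an <autostart> child
# inside <autostart> and finds nothing.
with_root = root.findall('autostart/FileTemplate/properties/obj/properties/prop')

# Dropping the root tag from the path matches the <prop> elements.
without_root = root.findall('FileTemplate/properties/obj/properties/prop')

print(len(with_root))     # 0
print(len(without_root))  # 2
```

The empty result from the first call is why the loop in the question prints nothing.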

Answer 4

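The problem is in the path expression. Here is my attempt:

    import xml.etree.ElementTree as ET

    doc = ET.parse('test.xml')
    for prop_node in doc.iterfind('FileTemplate/properties/obj[@name="GeneralSettings"]/properties/prop[@name="BufferSize"]'):
        print('Name:', prop_node.get('name'),
              'Type:', prop_node.get('type'),
              'Text:', prop_node.text)

Output

    Name: BufferSize Type: 21 Text: 524288000

Notes

The expression above selects the obj node whose name attribute is GeneralSettings, and within it the prop node whose name attribute is BufferSize.
For each node found, use the .get() method to read a single attribute, or use the .items() method to get all of its attributes at once.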
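Since the question asks for all the values under GeneralSettings, the same approach can be sketched without the final `[@name="BufferSize"]` predicate. This assumes a shortened inline copy of the XML (the `XML` constant, `path`, and `settings` are names chosen for illustration, not part of the original code):

```python
import xml.etree.ElementTree as ET

# Shortened, inlined copy of the question's XML (illustrative only).
XML = """<autostart version="2.0">
  <FileTemplate>
    <properties>
      <obj name="TypeSettings">
        <properties>
          <prop name="Name" type="8">en0</prop>
        </properties>
      </obj>
      <obj name="GeneralSettings">
        <properties>
          <prop name="BufferSize" type="21">524288000</prop>
          <prop name="Owner" type="8">root</prop>
        </properties>
      </obj>
    </properties>
  </FileTemplate>
</autostart>"""

root = ET.fromstring(XML)
path = 'FileTemplate/properties/obj[@name="GeneralSettings"]/properties/prop'

# Map each prop's name attribute to its text, for GeneralSettings only.
settings = {prop.get('name'): prop.text for prop in root.iterfind(path)}
print(settings)  # {'BufferSize': '524288000', 'Owner': 'root'}

# .items() returns every attribute of a node as (name, value) pairs.
first = root.find(path)
print(sorted(first.items()))  # [('name', 'BufferSize'), ('type', '21')]
```

Note that the values come back as strings; converting them (for example with `int()`) is left to the caller.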

Python - How to parse 'prop name' elements in XML
