Question

I need to extract the value of an attribute from an XML document using Python.

For example, if I have an XML document like this:

<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>

How can I get the text 'smallHuman' or 'largeHuman' and store it in a variable?

Edit: I'm very new to Python and may need a lot of help.

This is what I have tried so far:

#! /usr/bin/python

import xml.etree.ElementTree as ET


def walkTree(node):
    print node.tag
    print node.keys()
    print node.attributes[]
    for cn in list(node):
        walkTree(cn)

treeOne = ET.parse('tm1.xml')
treeTwo = ET.parse('tm3.xml')

walkTree(treeOne.getroot())

Because of how this script will be used, I can't hard-code the XML into the .py file.

Answer 1

With ElementTree, you can use the find method together with attrib.

Example:

import xml.etree.ElementTree as ET

# Sample document from the question
z = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>"""

treeOne = ET.fromstring(z)

# find() returns the first matching child element; attrib is its attribute dict
print(treeOne.find('./child').attrib['type'])
print(treeOne.find('./adult').attrib['type'])

Output:

smallHuman
largeHuman
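One thing to watch for with this approach: find returns None when no element matches, and indexing attrib raises KeyError when the attribute is absent. A minimal sketch of a more forgiving lookup follows; the helper name attribute_of and the './pet' path are made up for illustration and are not part of ElementTree.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>"""


def attribute_of(root, path, name, default=None):
    """Return attribute `name` of the first element matching `path`, else `default`.

    Hypothetical helper for illustration; not an ElementTree function.
    """
    element = root.find(path)
    if element is None:  # find() returns None when nothing matches
        return default
    # Element.get() returns `default` instead of raising when the attribute is missing
    return element.get(name, default)


root = ET.fromstring(XML_TEXT)
child_type = attribute_of(root, "./child", "type")
adult_type = attribute_of(root, "./adult", "type")
missing_type = attribute_of(root, "./pet", "type", default="unknown")

print(child_type)
print(adult_type)
print(missing_type)
```

The same helper should work on a root obtained from a file, e.g. ET.parse('tm1.xml').getroot() as in the question.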

Answer 2

To get an attribute value from XML, you can read it from the element with ElementTree.

You can find more details and examples at the following link: https://docs.python.org/3.5/library/xml.etree.elementtree.html
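Since the question says the XML can't be hard-coded, here is a rough sketch of reading it from a file and collecting the attribute from every element that has it. The function name collect_attribute is hypothetical, and the temporary file only stands in for a real file such as tm1.xml so that the snippet runs on its own.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

SAMPLE = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>"""


def collect_attribute(path, name):
    """Return (tag, value) pairs for every element in the file that has attribute `name`.

    Hypothetical helper for illustration.
    """
    root = ET.parse(path).getroot()
    # iter() walks the root and all descendants in document order
    return [(el.tag, el.get(name)) for el in root.iter() if name in el.attrib]


# Stand-in for a real input file: write the sample to a temporary .xml file
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(SAMPLE)
    sample_path = handle.name

try:
    types = collect_attribute(sample_path, "type")
finally:
    os.remove(sample_path)  # clean up the temporary file

print(types)
```

In a real script you would presumably pass the actual path (or take it from sys.argv) instead of creating a temporary file.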

Answer 3

Another option is the lxml library.
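As a minimal sketch of what the lxml approach might look like, assuming lxml is installed: the snippet below uses only fromstring, find and get, which lxml.etree shares with the standard library's ElementTree, so the import falls back to the standard library when lxml is not available.

```python
try:
    from lxml import etree  # assumption: lxml is installed (pip install lxml)
except ImportError:
    # Fallback: the calls used below exist in the standard library as well
    import xml.etree.ElementTree as etree

XML_TEXT = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>"""

root = etree.fromstring(XML_TEXT)

# find() locates the first matching child; get() reads one attribute from it
child_type = root.find("child").get("type")
adult_type = root.find("adult").get("type")

print(child_type)
print(adult_type)
```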

Answer 4

Another example, using the SimplifiedDoc library:

from simplified_scrapy import SimplifiedDoc, utils
xml = '''<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>'''
doc = SimplifiedDoc(xml).select('xml')

# first
child_type = doc.child['type']
print(child_type)

adult_type = doc.adult['type']
print(adult_type)

# second
child_type = doc.select('child').get('type')
adult_type = doc.select('adult').get('type')

print(child_type)
print(adult_type)

# third
child_type = doc.select('child>type()')
adult_type = doc.select('adult>type()')

print(child_type)
print(adult_type)

# fourth
nodes = doc.selects('child|adult>type()')
print(nodes)

# fifth
nodes = doc.children
print([node['type'] for node in nodes])

How do I extract the value of an XML attribute in Python?