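I have a very large XML file and want to fetch some records based on the text of a child node. Say I have the XML below, and I want to get the price value if the food's taste is good. I tried minidom and ET.ElementTree but couldn't find a suitable way.
I would like to do something like this: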
from xml.dom.minidom import parse, parseString
dom = parse("file.xml")
for node in dom.getElementsByTagName('food'):
    node_child = node.getAttribute('description')
    taste = node_child.getAttribute('taste')
    if taste == 'good':
        price = node.getAttribute('price')
Any ideas?
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>
<taste>good</taste>
<sight>bad</sight>
</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>
<taste>bad</taste>
<sight>bad</sight>
</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>
<taste>good</taste>
<sight>good</sight>
</description>
<calories>900</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description>
<taste>good</taste>
<sight>bad</sight>
</description>
<calories>600</calories>
</food>
Answer 0 (score: 1)
You can use lxml to parse it.
Code:
from lxml import html
data = """
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>
<taste>good</taste>
<sight>bad</sight>
</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>
<taste>bad</taste>
<sight>bad</sight>
</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>
<taste>good</taste>
<sight>good</sight>
</description>
<calories>900</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description>
<taste>good</taste>
<sight>bad</sight>
</description>
<calories>600</calories>
</food>
"""
tree = html.fromstring(data)
tastes = tree.xpath("//taste")
for taste in tastes:
    # Walk up two levels: taste -> description -> food
    foodparent = taste.getparent().getparent()
    name = foodparent.xpath("name")[0].text
    if taste.text == "good":
        price = foodparent.xpath("price")[0].text
        print("%s: %s" % (name, price))
    else:
        print("%s: %s" % (name, "Taste is bad, yuck."))
Result:
Belgian Waffles: $5.95
Strawberry Belgian Waffles: Taste is bad, yuck.
Berry-Berry Belgian Waffles: $8.95
French Toast: $4.50
Let us know if this helps.
Answer 1 (score: 0)
Here is a solution using ElementTree:
import xml.etree.ElementTree as et

tree = et.parse('breakfast.xml')
root = tree.getroot()
for food in root.findall('food'):
    # taste is a child element of description, not an attribute
    if food.find('description').find('taste').text == 'good':
        price = food.find('price').text
        print("found good food:{0} at price {1}".format(food.find('name').text, price))
Result:
found good food:Belgian Waffles at price $5.95
found good food:Berry-Berry Belgian Waffles at price $8.95
found good food:French Toast at price $4.50
Edit: I also had to fix your XML, because it was missing the closing tag.
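As a side note on the minidom attempt in the question: description, taste and price are child elements, not attributes, so getAttribute() just returns an empty string for them. A minimal sketch of how the same loop might look with minidom follows. The child_text helper and the shortened sample XML are only illustrative assumptions, not part of the original post, and minidom still loads the whole document into memory.

```python
from xml.dom.minidom import parseString


def child_text(node, tag):
    """Return the text of the first <tag> element below node (illustrative helper)."""
    element = node.getElementsByTagName(tag)[0]
    return element.firstChild.data.strip()


# Shortened sample, assumed for illustration; note the closing root tag.
xml_string = """<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description><taste>good</taste><sight>bad</sight></description>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description><taste>bad</taste><sight>bad</sight></description>
</food>
</breakfast_menu>"""

dom = parseString(xml_string)
good = []
for food in dom.getElementsByTagName("food"):
    # Look the values up as child elements rather than attributes
    if child_text(food, "taste") == "good":
        good.append((child_text(food, "name"), child_text(food, "price")))
print(good)
```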
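Answer 2 (score: 0)
Assuming your XML is stored in a string variable named xml_string, you can use ElementTree and XPath to select all food elements that contain a description element with a taste element whose value is "good". You can then extract whatever information you need from each food element.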
from xml.etree import ElementTree
tree = ElementTree.fromstring(xml_string)
food_elements = tree.findall('.//food/description[taste="good"]/..')
prices = [(food.find('name').text, food.find('price').text) for food in food_elements]
print(prices)
This prints:
[('Belgian Waffles', '$5.95'), ('Berry-Berry Belgian Waffles', '$8.95'), ('French Toast', '$4.50')]
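Since the question mentions a very large file, a streaming variant might also be worth a look. The sketch below uses ElementTree's iterparse so that each food element can be discarded once it has been checked. The good_prices function name and the shortened in-memory sample are assumptions made for illustration; with a real file you would pass its path instead of the BytesIO object.

```python
import io
import xml.etree.ElementTree as ET


def good_prices(source):
    """Yield (name, price) for every <food> whose <taste> text is 'good'."""
    # "end" events fire once an element and all of its children are parsed
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "food":
            continue
        if elem.findtext("description/taste") == "good":
            yield elem.findtext("name"), elem.findtext("price")
        # Drop the children of the processed <food> to keep memory use low
        # (the emptied <food> element itself stays attached to the root)
        elem.clear()


# Shortened sample, assumed for illustration
sample = b"""<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description><taste>good</taste><sight>bad</sight></description>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description><taste>bad</taste><sight>bad</sight></description>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description><taste>good</taste><sight>bad</sight></description>
</food>
</breakfast_menu>"""

for name, price in good_prices(io.BytesIO(sample)):
    print(name, price)
```

Because good_prices is a generator, results come out one at a time while the file is still being read, so nothing forces the full list of matches into memory.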