Python使用ElementTree提取包含标记的节点

时间:2013-09-05 09:57:57

标签: python xml xpath elementtree

我需要从XML中提取几个节点,如果其中一个节点包含关键字。最后,我得指出,如果找到,我将打印关键字。现在是棘手的部分(至少对我而言;-))。我将在下面详细解释。 XML:

<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://url">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1200</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>

我的脚本打印出Amount值,如果找到并且== 1853.我真正需要的是:当找到1853时 - 脚本应该将整个<Offers>提取到新文件。我让脚本运行并卡住了。我真的不知道如何从<Amount>返回并复制整个<Offers>组。

脚本1:

import xml.etree.ElementTree as ET
import sys

name = str.strip(sys.argv[1])
filename = str.strip(sys.argv[2])

fp = open("sample.xml","r")
element = ET.parse(fp)

for elem in element.iter():
    if elem.tag == '{http://url}Price':
        output = {}
        for elem1 in list(elem):
            if elem1.tag == '{http://url}Amount':
                if elem1.text == name:
                    output['Amount'] = elem1.text
                    print output

我的输出:

python sample1.py '1853' x
{'Amount': '1853'}
{'Amount': '1853'}

这里的'x'是不相关的。

如何从<Amount>返回并将整个<Offers>群组复制到新文件或只打印出来。需要使用ElementTree完成。

1 个答案:

答案 0 :(得分:3)

这个怎么样:

import xml.etree.ElementTree as ET
import sys

name = str.strip(sys.argv[1])
filename = str.strip(sys.argv[2])

fp = open("sample.xml","r")
tree = ET.parse(fp)
root = tree.getroot()

for offers in root.findall('.//{http://url}Offers'):
    value_found = False
    for amount in offers.findall('.//{http://url}Amount'):
        if amount.text == name:
            value_found = True
            break
    if value_found:
        print ET.tostring(offers)

打印

<url:Offers xmlns:url="http://url">
    <url:Offer>
      <url:OfferListing>
        <url:Price>
          <url:Amount>1853</url:Amount>
        </url:Price>
      </url:OfferListing>
    </url:Offer>
  </url:Offers>

<url:Offers xmlns:url="http://url">
    <url:Offer>
      <url:OfferListing>
        <url:Price>
          <url:Amount>1853</url:Amount>
        </url:Price>
      </url:OfferListing>
    </url:Offer>
  </url:Offers>

要写入文件,您可以执行以下操作:(借用this answer

for i, offers in enumerate(root.findall('.//{http://url}Offers'), start=1):
    value_found = False
    for amount in offers.findall('.//{http://url}Amount'):
        if amount.text == name:
            value_found = True
            break
    if value_found:
        tree = ET.ElementTree(offers)
        tree.write("offers%d.xml" % i,
           xml_declaration=True, encoding='utf-8',
           method="xml", default_namespace='http://url')

写文件如:

<?xml version='1.0' encoding='utf-8'?>
<Offers xmlns="http://url">
    <Offer>
      <OfferListing>
        <Price>
          <Amount>1853</Amount>
        </Price>
      </OfferListing>
    </Offer>
  </Offers>