Python and a product XML file

Time: 2018-02-05 16:10:45

Tags: python xml

I am using Python, and for each occurrence of sku I need to find its min-order-quantity and step-quantity.

The input file is:

<product sku="1235997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>
  </quantity>
....
</product>
....

I tried using lxml, but I could not get min-order-quantity and step-quantity:

from lxml import etree
tree = etree.parse('./ST2CleanCourt.xml')
elem = tree.getroot()
for child in elem:
    print(child.attrib["sku"])

I tried the 2 solutions below. They work, but I need to read the data from a file, so I wrote:

from lxml import etree
import codecs
f=codecs.open('./ST2CleanCourt.xml','r','utf-8')
fichier = f.read()
tree = etree.fromstring(fichier)

for child in tree:
    print ('sku:', child.attrib['sku'])
    print ('min:', child.find('.//min-order-quantity').text)

I always get this error:

    print('min:', child.find('.//min-order-quantity').text)
AttributeError: 'NoneType' object has no attribute 'text'

What is wrong?

2 answers:

Answer 0 (score: 2)

You can use the xpath method to get the values you need.

Example:

from lxml import etree

a = """<product sku="1235997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>
  </quantity>
</product>
"""

tree = etree.fromstring(a)
tags = tree.xpath('/product')

for b in tags:
    print b.attrib["sku"]
    min_order = b.xpath("//quantity/min-order-quantity")
    print min_order[0].text
    step_quality = b.xpath("//quantity/step-quantity")
    print step_quality[0].text

Output:

1235997403
1
1
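
One thing to watch with xpath() when the file holds several products: an expression that starts with "//" is evaluated from the document root even when it is called on a child element, while ".//" stays below that element. The following is a minimal sketch of the difference. The two trimmed-down products and their sku values "A" and "B" are made up for illustration.

```python
from lxml import etree

# Two hypothetical products, trimmed to the field of interest.
xml = """<products>
  <product sku="A"><quantity><min-order-quantity>1</min-order-quantity></quantity></product>
  <product sku="B"><quantity><min-order-quantity>5</min-order-quantity></quantity></product>
</products>"""

root = etree.fromstring(xml)
second = root[1]  # the product with sku="B"

# "//" is absolute: it searches the whole document, so product A's value comes first.
absolute = second.xpath("//quantity/min-order-quantity")
# ".//" is relative: it searches only below the element xpath() was called on.
relative = second.xpath(".//quantity/min-order-quantity")

print(len(absolute), absolute[0].text)
print(len(relative), relative[0].text)
```

With a single product the two forms return the same result, which is why the difference is easy to miss.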

Answer 1 (score: 1)

With more than one product and a products root node, you can find the values like this:

x = """
<products>
<product sku="1235997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>
  </quantity>
</product>
<product sku="997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>5</min-order-quantity>
    <step-quantity>7</step-quantity>
  </quantity>
</product>
</products>
"""

from lxml import etree

tree = etree.fromstring(x)
for child in tree:
    print("sku:", child.attrib["sku"])
    # find(".//name") returns the first node with that name anywhere below child
    print("min:", child.find(".//min-order-quantity").text)
    print("step:", child.find(".//step-quantity").text)

Basically, for each child you look for any node below it that has the right name, and print its text.

Output:

sku: 1235997403
min: 1
step: 1
sku: 997403
min: 5
step: 7

Docs: http://lxml.de/tutorial.html#elementpath
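
find() returns None when nothing matches, so calling .text on the result raises the AttributeError from the question. One possible cause, which is only an assumption because the full file is not shown, is that some element under the root has no min-order-quantity. The sketch below uses findtext() with a default value so that such elements do not stop the loop. The sample data and the "n/a" placeholder are made up for illustration.

```python
from lxml import etree

# Hypothetical data: the second product has no <quantity> block.
xml = """<products>
  <product sku="1235997403">
    <quantity unit="pcs">
      <min-order-quantity>1</min-order-quantity>
      <step-quantity>1</step-quantity>
    </quantity>
  </product>
  <product sku="997403"/>
</products>"""

root = etree.fromstring(xml)

rows = []
# iter("product") visits only <product> elements and skips comments and other tags.
for product in root.iter("product"):
    sku = product.get("sku")
    # findtext() returns the default when no matching element exists.
    min_qty = product.findtext(".//min-order-quantity", default="n/a")
    step = product.findtext(".//step-quantity", default="n/a")
    rows.append((sku, min_qty, step))
    print("sku:", sku, "min:", min_qty, "step:", step)
```

To read from the file instead of a string, etree.parse('./ST2CleanCourt.xml').getroot() can take the place of etree.fromstring(xml); parse() works out the file's encoding on its own.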