Question

I am using Python, and for each occurrence of sku I need to find the min-order-qty, the step-quantity, and the sku.

The input file is:

<product sku="1235997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>
  </quantity>
....
</product>
....

I tried lxml but could not get the min-order-qty and the step-quantity:

from lxml import etree
tree = etree.parse('./ST2CleanCourt.xml')
elem = tree.getroot()
for child in elem:
    print(child.attrib["sku"])

I tried the two solutions below. They work, but I need to read the data from a file, so I wrote:

from lxml import etree
import codecs
f=codecs.open('./ST2CleanCourt.xml','r','utf-8')
fichier = f.read()
tree = etree.fromstring(fichier)

for child in tree:
    print ('sku:', child.attrib['sku'])
    print ('min:', child.find('.//min-order-quantity').text)

I always get this error:

print('min:', child.find('.//min-order-quantity').text)
AttributeError: 'NoneType' object has no attribute 'text'

What is wrong?

Answer 1

You can use the xpath method to get the values you need.

Example:

from lxml import etree

a = """<product sku="1235997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>
  </quantity>
</product>
"""

tree = etree.fromstring(a)
tags = tree.xpath('/product')
for b in tags:
    print(b.attrib["sku"])
    # the leading "." keeps the search below the current product
    min_order = b.xpath(".//quantity/min-order-quantity")
    print(min_order[0].text)
    step_quantity = b.xpath(".//quantity/step-quantity")
    print(step_quantity[0].text)

Output:

1235997403
1
1
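
When xpath is called on an element, it matters whether the path starts with `//` or `.//` once the document holds more than one product. The following is a minimal sketch with a made-up two-product document (the `<products>` wrapper and the skus "A" and "B" are assumptions for illustration): a path starting with `//` is evaluated against the whole document, while `.//` only searches below the current element.

```python
from lxml import etree

# Hypothetical two-product document, just for illustration
xml = """
<products>
  <product sku="A"><quantity><min-order-quantity>1</min-order-quantity></quantity></product>
  <product sku="B"><quantity><min-order-quantity>5</min-order-quantity></quantity></product>
</products>
"""

root = etree.fromstring(xml)
absolute = []
relative = []
for product in root.xpath("/products/product"):
    # "//..." searches the whole document, so [0] is always the first product's value
    absolute.append(product.xpath("//quantity/min-order-quantity")[0].text)
    # ".//..." searches only below the current <product>
    relative.append(product.xpath(".//quantity/min-order-quantity")[0].text)

print(absolute)  # ['1', '1']
print(relative)  # ['1', '5']
```

With a single product both forms give the same result, which is why the difference is easy to miss.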

Answer 2

With more than one product, and a root node wrapping the products, you can find the values like this:

x = """
<products>
<product sku="1235997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>
  </quantity>
</product>
<product sku="997403">
  <sku>1235997403</sku>
  <name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
  <short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
  <category-links>
    <category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
  </category-links>
  <online>1</online>
  <quantity unit="pcs">
    <min-order-quantity>5</min-order-quantity>
    <step-quantity>7</step-quantity>
  </quantity>
</product>
</products>
"""

from lxml import etree

tree = etree.fromstring(x)
for child in tree:
    print("sku:", child.attrib["sku"])
    # ".//name" looks for a node with the given name anywhere below child
    print("min:", child.find(".//min-order-quantity").text)
    print("step:", child.find(".//step-quantity").text)

Basically, for each child you look for any node below it that has the right name, and print its text.

Output:

sku: 1235997403
min: 1
step: 1
sku: 997403
min: 5
step: 7

Docs: http://lxml.de/tutorial.html#elementpath
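
The AttributeError in the question means that find returned None for at least one child, that is, some direct child of the root has no min-order-quantity below it. The post does not show which element that is, so the following is only a sketch under that assumption. It uses findtext, which returns None instead of raising when nothing matches, and iter("product") so that only product elements are visited. The sample XML and the read_quantities helper are made up for illustration.

```python
try:
    from lxml import etree
except ImportError:
    # iter() and findtext() also exist in the standard library's ElementTree
    import xml.etree.ElementTree as etree

# Hypothetical input: the second product has no <quantity> block
xml = """
<products>
  <product sku="1235997403">
    <quantity unit="pcs">
      <min-order-quantity>1</min-order-quantity>
      <step-quantity>1</step-quantity>
    </quantity>
  </product>
  <product sku="997403"/>
</products>
"""

def read_quantities(root):
    """Return (sku, min, step) per product; missing values come back as None."""
    rows = []
    for product in root.iter("product"):  # skips elements that are not <product>
        rows.append((
            product.get("sku"),
            product.findtext(".//min-order-quantity"),
            product.findtext(".//step-quantity"),
        ))
    return rows

rows = read_quantities(etree.fromstring(xml))
for sku, minimum, step in rows:
    print("sku:", sku, "min:", minimum, "step:", step)
```

Any row that contains None points at a product worth inspecting in the real file.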
