我正在使用python,我需要为sku
的每次出现找到min-order-qty
,step-quantity
和sku
。
输入文件是:
<product sku="1235997403">
<sku>1235997403</sku>
<name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
<short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
<category-links>
<category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
</category-links>
<online>1</online>
<quantity unit="pcs">
<min-order-quantity>1</min-order-quantity>
<step-quantity>1</step-quantity>
</quantity>
....
</product>
....
我尝试使用lxml但未能获得min-order-qty和step-quantity
from lxml import etree
tree = etree.parse('./ST2CleanCourt.xml')
elem = tree.getroot()
for child in elem:
print (child.attrib["sku"])
我尝试使用以下2种解决方案。它工作但我需要读取文件,所以我写
from lxml import etree
import codecs
f=codecs.open('./ST2CleanCourt.xml','r','utf-8')
fichier = f.read()
tree = etree.fromstring(fichier)
for child in tree:
print ('sku:', child.attrib['sku'])
print ('min:', child.find('.//min-order-quantity').text)
我总是得到这个错误 print('min:',child.find('.// min-order-quantity')。text) AttributeError:'NoneType'对象没有属性'text'
有什么不对?
答案 0 :(得分:2)
您可以使用 xpath 方法获取所需的值。
示例:强>
from lxml import etree
a = """<product sku="1235997403">
<sku>1235997403</sku>
<name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
<short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
<category-links>
<category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
</category-links>
<online>1</online>
<quantity unit="pcs">
<min-order-quantity>1</min-order-quantity>
<step-quantity>1</step-quantity>
</quantity>
</product>
"""
tree = etree.fromstring(a)
tags = tree.xpath('/product')
for b in tags:
print b.attrib["sku"]
min_order = b.xpath("//quantity/min-order-quantity")
print min_order[0].text
step_quality = b.xpath("//quantity/step-quantity")
print step_quality[0].text
<强>输出:强>
1235997403
1
1
答案 1 :(得分:1)
使用超过1个产品和产品的根节点,您可以找到:
x = """
<products>
<product sku="1235997403">
<sku>1235997403</sku>
<name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
<short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
<category-links>
<category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
</category-links>
<online>1</online>
<quantity unit="pcs">
<min-order-quantity>1</min-order-quantity>
<step-quantity>1</step-quantity>
</quantity>
</product>
<product sku="997403">
<sku>1235997403</sku>
<name xml:lang="fr-FR">Huile pour entretien des destructeurs de documents HSM</name>
<short-description xml:lang="fr-FR">Flacon 250 ml. Colis de 1 flacon.</short-description>
<category-links>
<category-link name="20319647o.rjpf_20320074o.rjpf" domain="RAJA-FR-WEB-0092-21" default = "1" hotdeal = "0"/>
</category-links>
<online>1</online>
<quantity unit="pcs">
<min-order-quantity>5</min-order-quantity>
<step-quantity>7</step-quantity>
</quantity>
</product>
</products>
"""
from lxml import etree
tree = etree.fromstring(x)
for child in tree:
print ("sku:", child.attrib["sku"])
print ("min:", child.find(".//min-order-quantity").text) # looks for node below
print ("step:" ,child.find(".//step-quantity").text) # child with the given name
基本上,您会查找具有正确名称的子项下面的任何节点并打印其文本。
输出:
sku:1235997403
min:1
step:1
sku:997403
min:5
step:7