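I have a deeply nested XML file that I need to iterate through to extract records. I followed some examples for reading XML and assumed every record had a fixed number of fields, but after a few extractions I found that it does not. Here is my code: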
import xml.etree.ElementTree as ET
tree = ET.parse('EcommProdotti.xml')
root = tree.getroot()
print("Printing on file...")
with open("prodotti.txt", "w") as f:
    for child in root:
        for element in child.iter('Products'):
            for sub_element in element.iter('Product'):
                length = len(sub_element) + 1
                my_string = sub_element[1].text + " " + sub_element[2].text + " " + sub_element[9].text + "\n"
                f.write(my_string)
As you can see from the sample XML file below, each record is a sub_element (Product) node, and the number of child elements it contains can vary:
<?xml version="1.0" encoding="UTF-8"?>
<!-- File in formato Easyfatt-XML creato con Danea Easyfatt - www.danea.it/software/easyfatt -->
<!-- Per importare o creare un file in formato Easyfatt-Xml, consultare la documentazione tecnica: www.danea.it/software/easyfatt/xml -->
<EasyfattProducts AppVersion="2" Creator="Danea Easyfatt Enterprise One 2019.45d" CreatorUrl="http://www.danea.it/software/easyfatt" Mode="full" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://www.danea.it/public/prodotti.xsd">
<Products>
<Product>
<InternalID>35</InternalID>
<Code>00035</Code>
<Description>12 PEZZI ROTOLO SACCHETTO IGIENICO PER CANI</Description>
<DescriptionHtml></DescriptionHtml>
<Category>ACCESSORI PER ANIMALI</Category>
<Subcategory>PRODOTTI IGIENE E PULIZIA</Subcategory>
<Vat Perc="22" Class="Imponibile" Description="Imponibile 22%">22</Vat>
<Um>PZ</Um>
<NetPrice1>2.46</NetPrice1>
<GrossPrice1>3</GrossPrice1>
<Barcode>8805786177364</Barcode>
<SupplierCode>0004</SupplierCode>
<SupplierName>LED STORM SRLS</SupplierName>
<SupplierNetPrice>1.898</SupplierNetPrice>
<SupplierGrossPrice>2.3156</SupplierGrossPrice>
<SizeUm>cm</SizeUm>
<WeightUm>kg</WeightUm>
<GrossWeight>0.2</GrossWeight>
<ManageWarehouse>true</ManageWarehouse>
<OrderWaitDays>10</OrderWaitDays>
<AvailableQty>6</AvailableQty>
<Notes></Notes>
</Product>
<Product>
<InternalID>1155</InternalID>
<Code>01144</Code>
<Description>ADVANCE CANE ATOPIC MEDIUM/MAXI 3 KG</Description>
<DescriptionHtml></DescriptionHtml>
<Category>ALIMENTI PER ANIMALI</Category>
<Subcategory>ALIMENTI CURATIVI (DIETETICI)</Subcategory>
<Vat Perc="22" Class="Imponibile" Description="Imponibile 22%">22</Vat>
<Um>PZ</Um>
<NetPrice1>20.48</NetPrice1>
<GrossPrice1>24.99</GrossPrice1>
<Barcode>8410650170695</Barcode>
<ProducerName>Affinity</ProducerName>
<SupplierCode>0033</SupplierCode>
<SupplierName>LOCONTE VITO &amp; C. S.A.S.</SupplierName>
<SupplierProductCode>ADV924483</SupplierProductCode>
<SupplierNetPrice>13.0438</SupplierNetPrice>
<SupplierGrossPrice>15.9134</SupplierGrossPrice>
<SizeUm>cm</SizeUm>
<WeightUm>kg</WeightUm>
<GrossWeight>3</GrossWeight>
<ManageWarehouse>true</ManageWarehouse>
<AvailableQty>8</AvailableQty>
<Notes></Notes>
</Product>
<Product>
<InternalID>203</InternalID>
<Code>00198</Code>
<Description>ADVANTIX CANE FINO A 4 KG</Description>
<DescriptionHtml></DescriptionHtml>
<Category>ACCESSORI PER ANIMALI</Category>
<Subcategory>ANTIPARASSITARI</Subcategory>
<Vat Perc="10" Class="Imponibile" Description="Imponibile 10%">10</Vat>
<Um>PZ</Um>
<NetPrice1>19.82</NetPrice1>
<GrossPrice1>21.8</GrossPrice1>
<Barcode>4007221009597</Barcode>
<ProducerName>Bayer</ProducerName>
<SupplierCode>0033</SupplierCode>
<SupplierName>LOCONTE VITO &amp; C. S.A.S.</SupplierName>
<SupplierProductCode>BYR03382048</SupplierProductCode>
<SupplierNetPrice>16.25</SupplierNetPrice>
<SupplierGrossPrice>17.875</SupplierGrossPrice>
<SizeUm>cm</SizeUm>
<WeightUm>kg</WeightUm>
<GrossWeight>0.04</GrossWeight>
<ManageWarehouse>true</ManageWarehouse>
<OrderWaitDays>10</OrderWaitDays>
<AvailableQty>10</AvailableQty>
<Notes></Notes>
<ExtraBarcodes>
<Barcode>103629046</Barcode>
<Barcode>4007221046424</Barcode>
</ExtraBarcodes>
</Product>
</Products>
</EasyfattProducts>
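So, how can I iterate over this file? Thanks in advance for your time and help.
Answer 0 (score: 1)
You can use XPath queries to look up each field by tag name instead of indexing children by position: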
import xml.etree.ElementTree as ET
tree = ET.parse('EcommProdotti.xml')
root = tree.getroot()
# Look up each field by tag name, so missing or extra children do not shift positions.
for product in root.findall(".//Products/Product"):
    for field in ['Code', 'Description', 'GrossPrice1', 'SupplierProductCode']:
        value = product.find(field)
        if value is not None:
            print(value.text, end=' ')
        else:
            print('Not defined', end=' ')
    print()
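If you want the result in a file, as in the question, the same by-name lookup can be wrapped in a small function. The following is only a minimal sketch under a few assumptions: the helper names `product_line` and `export_products`, the `FIELDS` list, and the "Not defined" placeholder are illustrative choices, not something the question specifies. It relies on `Element.findtext`, which returns the `default` value when a child is absent and an empty string when the child exists but has no text.

```python
import xml.etree.ElementTree as ET

# Fields to export; an illustrative selection, adjust to your needs.
FIELDS = ["Code", "Description", "GrossPrice1", "SupplierProductCode"]


def product_line(product, fields=FIELDS, missing="Not defined"):
    """Build one output line for a <Product>, looking children up by tag name."""
    values = []
    for field in fields:
        # findtext returns `missing` only when the child is absent;
        # an empty element such as <Notes></Notes> yields "" instead,
        # so empty text is replaced with the placeholder as well.
        text = product.findtext(field, default=missing)
        values.append(text.strip() or missing)
    # Optional nested children are reached with a relative path.
    extra = [b.text for b in product.findall("ExtraBarcodes/Barcode") if b.text]
    if extra:
        values.append(",".join(extra))
    return " ".join(values)


def export_products(xml_path, out_path):
    """Write one line per product to out_path and return the number of lines."""
    root = ET.parse(xml_path).getroot()
    count = 0
    with open(out_path, "w", encoding="utf-8") as f:
        # iter() walks all descendants, so the nesting depth does not matter.
        for product in root.iter("Product"):
            f.write(product_line(product) + "\n")
            count += 1
    return count
```

Calling `export_products('EcommProdotti.xml', 'prodotti.txt')` should then write one line per product, with the placeholder wherever a field such as SupplierProductCode is missing. Because nothing is indexed by position, optional elements like ProducerName or ExtraBarcodes no longer shift the fields you read.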