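I am reading an XML file and trying to extract data from it. I need every instance of "Description" together with the specific ID (its DescriptionCode) associated with that description, and then I need the value or text of each "Description".
des_id = get all description IDs for the item
des_value = get all values for those IDs
des_ids dictionary = associate each des_id with its des_value in a dictionary
descriptions dictionary = associate the part number with the des_ids dictionary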
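For illustration, the structure this pseudocode describes can be sketched as follows. The sample rows and the helper name `build_descriptions` are assumptions made for the example and are not taken from the original script; the point is only that each part number maps to its own inner dictionary.

```python
def build_descriptions(rows):
    """Group (part_number, des_id, des_value) tuples into a nested dict."""
    descriptions = {}
    for part_num, des_id, des_value in rows:
        # setdefault creates a fresh inner dict the first time a part number is seen,
        # so two part numbers never share the same dictionary object.
        descriptions.setdefault(part_num, {})[des_id] = des_value
    return descriptions


# Hypothetical rows, with values borrowed from the sample XML.
rows = [
    ("2000207", "ASC", "Here's a sample ASC description for 2000207"),
    ("2000207", "DEF", "2000207 Product"),
    ("2000408", "DEF", "2000208 Product"),
]
descriptions = build_descriptions(rows)
```

If a single dictionary is reused for every item instead, each part number ends up pointing at the same object, which may explain seeing only the last item's data.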
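When I export to .csv and iterate over all the parts in the list, every part appears to get the description data of the last item number in the XML file. I need the individual description codes with their values for each item number in the file.
Here is the piece of code where I know my intent gets messed up: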
for part in soup.find_all('Item'):
    for x in part.find_all('PartNumber'):
        partNum = x.get_text()

    for description in part.find_all('Description'):
        des_id = description.get('DescriptionCode')  # key
        des_value = description.get_text()  # value
        des_ids[des_id] = des_value

    descriptions[partNum] = [des_ids]
Thank you very much for your help!
Sample XML:
<Items>
    <Item MaintenanceType="A">
        <PartNumber>2000207</PartNumber>
        <Descriptions>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="ASC">Here's a sample ASC description for 2000207</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="ASM">2000207 sample ASM description</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="DEF">2000207 Product</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="DES">This is some text for 2000207</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="EXT">Modified text kit for 2000207</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="INV">Invoice desc for 2000207</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="KEY">THE KEY for 2000207</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="SHO">Pretty Short description for 2000207</Description>
        </Descriptions>
    </Item>
    <Item MaintenanceType="A">
        <HazardousMaterialCode>N</HazardousMaterialCode>
        <PartNumber>2000408</PartNumber>
        <Descriptions>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="ASC">Here's a sample ASC description for 2000208</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="ASM">2000208 sample ASM description</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="DEF">2000208 Product</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="DES">This is some text for 2000208</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="EXT">Modified text kit for 2000208</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="INV">Invoice desc for 2000208</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="KEY">THE KEY for 2000208</Description>
            <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="SHO">Pretty Short description for 2000208</Description>
        </Descriptions>
    </Item>
</Items>
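The output should be a 3-column CSV: PartNum | des_id | des_value
Answer 0 (score: 0)
Question: I need the individual description codes with their values for each item number in the file.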
for part in soup.find_all('Item'):
    for x in part.find_all('PartNumber'):
        partNum = x.get_text()
        print("partNum{}".format(partNum))

    for description in part.find_all('Description'):
        des_id = description.get('DescriptionCode')  # key
        des_value = description.get_text()  # value
        csv_data = {'PartNumber': partNum, 'DescriptionCode': des_id, 'Description': des_value}
        print("\t{}".format(csv_data))
Output:
partNum2000207
    {'DescriptionCode': 'ASC', 'PartNumber': '2000207', 'Description': "Here's a sample ASC description for 2000207"}
    {'DescriptionCode': 'ASM', 'PartNumber': '2000207', 'Description': '2000207 sample ASM description'}
    {'DescriptionCode': 'DEF', 'PartNumber': '2000207', 'Description': '2000207 Product'}
    ... (omitted for brevity)
partNum2000408
    {'DescriptionCode': 'ASC', 'PartNumber': '2000408', 'Description': "Here's a sample ASC description for 2000208"}
    {'DescriptionCode': 'ASM', 'PartNumber': '2000408', 'Description': '2000208 sample ASM description'}
    {'DescriptionCode': 'DEF', 'PartNumber': '2000408', 'Description': '2000208 Product'}
    ... (omitted for brevity)
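To get from these per-description records to the 3-column CSV the question asks for, something along the following lines might work. This is only a sketch: it swaps BeautifulSoup for the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies, the XML is a shortened copy of the sample, and the helper names `extract_rows` and `write_csv` are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the sample XML from the question.
SAMPLE_XML = """<Items>
  <Item MaintenanceType="A">
    <PartNumber>2000207</PartNumber>
    <Descriptions>
      <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="DEF">2000207 Product</Description>
      <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="KEY">THE KEY for 2000207</Description>
    </Descriptions>
  </Item>
  <Item MaintenanceType="A">
    <PartNumber>2000408</PartNumber>
    <Descriptions>
      <Description MaintenanceType="A" LanguageCode="EN" DescriptionCode="DEF">2000208 Product</Description>
    </Descriptions>
  </Item>
</Items>"""


def extract_rows(xml_text):
    """Yield one (PartNumber, DescriptionCode, text) tuple per Description element."""
    root = ET.fromstring(xml_text)
    for item in root.iter("Item"):
        part_num = item.findtext("PartNumber")
        # iter("Description") matches the exact tag name, so the
        # <Descriptions> wrapper element itself is skipped.
        for description in item.iter("Description"):
            yield part_num, description.get("DescriptionCode"), description.text


def write_csv(rows, out_file):
    """Write the rows as a 3-column CSV with a header line."""
    writer = csv.writer(out_file)
    writer.writerow(["PartNum", "des_id", "des_value"])
    writer.writerows(rows)


rows = list(extract_rows(SAMPLE_XML))
buffer = io.StringIO()  # in-memory stand-in for a real output file
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

To write an actual file, the buffer could be replaced with something like `open("descriptions.csv", "w", newline="")` (the filename is hypothetical); the `csv` module documentation recommends passing `newline=""` when opening the file.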