For validation purposes: how do I search an entire XML document by node (including child nodes), given XML like the following:
XML file:
<Summary>
  <Hardware_Info>
    <HardwareType>FlashDrive</HardwareType>
    <ManufacturerDetail>
      <ManufacturerCompany>Company1</ManufacturerCompany>
      <ManufacturerDate>2017-07-20T12:26:04-04:00</ManufacturerDate>
      <ModelCode>4BR6282</ModelCode>
    </ManufacturerDetail>
    <ActivationDate>2017-07-20T12:26:04-04:00</ActivationDate>
  </Hardware_Info>
  <DeviceConnectionInfo>
    <Device>
      <Index>0</Index>
      <Name>Laptop1</Name>
      <Status>Installed</Status>
    </Device>
    <Device>
      <Index>1</Index>
      <Name>Laptop2</Name>
      <Status>Installed</Status>
    </Device>
  </DeviceConnectionInfo>
</Summary>
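and look up the values based on the matching columns of a specific table. For the sake of example, the table looks like this:
Table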
HardwareType  ManufacturerCompany  ManufacturerDate           ActivationDate             Device.Index  Name
FlashDrive    Company1             2017-07-20T12:26:04-04:00  2017-07-20T12:26:04-04:00  0             Laptop1
FlashDrive    Company2             2017-07-20T12:26:04-04:00  2017-07-20T12:26:04-04:00  1             Laptop2
In this case, I would have a list of columns:
HardwareType, ManufacturerCompany, ManufacturerDate, ActivationDate, Device.Index, Name
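For my final result, I would like to print the table's column names together with the values found in the XML for those columns. For example, something like the original table (assuming validation passes):
Output: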
HardwareType  ManufacturerCompany  ManufacturerDate           ActivationDate             Device.Index  Name
FlashDrive    Company1             2017-07-20T12:26:04-04:00  2017-07-20T12:26:04-04:00  0             Laptop1
FlashDrive    Company2             2017-07-20T12:26:04-04:00  2017-07-20T12:26:04-04:00  1             Laptop2
Current implementation:
I am able to get the list of the table's column names, but so far the best I have managed is:
import xml.etree.ElementTree as ET
import csv

tree = ET.parse("/test.xml")
root = tree.getroot()
f = open('/test.csv', 'w')
csvwriter = csv.writer(f)
count = 0
head = ['ManufacturerCompany','ManufacturerDate',...]
csvwriter.writerow(head)
for time in root.findall('Summary'):
    row = []
    job_name = time.find('ManufacturerDetail').find('ManufacturerCompany').text
    row.append(job_name)
    job_name = time.find('ManufacturerDetail').find('ManufacturerDate').text
    row.append(job_name)
    csvwriter.writerow(row)
f.close()
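However, this implementation does not loop over every field I would like to output. Any guidance or suggestions on the implementation would be great.
Thanks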
Answer 0 (score: 1)
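Consider XSLT, the special-purpose language designed to transform XML files into other XML, into HTML (its most common use), and even into text files (TXT / CSV) with method="text". Specifically, walk down to the Device node level and pull in the ancestor items from there.
Python's third-party lxml module can run XSLT 1.0 scripts. XSLT is also portable, so any XSLT processor can run such code, including xsltproc, which is available on Unix (Linux / Mac) machines.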
XSLT (save as an .xsl file, which is a special .xml file; &#xa; is the line break entity)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="delimiter">,</xsl:param>

  <xsl:template match="/Summary">
    <xsl:text>HardwareType,ManufacturerCompany,ManufacturerDate,ActivationDate,Device.Index,Name&#xa;</xsl:text>
    <xsl:apply-templates select="DeviceConnectionInfo"/>
  </xsl:template>

  <xsl:template match="DeviceConnectionInfo">
    <xsl:apply-templates select="Device"/>
  </xsl:template>

  <xsl:template match="Device">
    <xsl:value-of select="concat(ancestor::Summary/Hardware_Info/HardwareType, $delimiter,
                                 ancestor::Summary/Hardware_Info/ManufacturerDetail/ManufacturerCompany, $delimiter,
                                 ancestor::Summary/Hardware_Info/ManufacturerDetail/ManufacturerDate, $delimiter,
                                 ancestor::Summary/Hardware_Info/ActivationDate, $delimiter,
                                 Index, $delimiter,
                                 Name)"/><xsl:text>&#xa;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
Python (using lxml)
import lxml.etree as et

# LOAD XML AND XSL
doc = et.parse('input.xml')
xsl = et.parse('xslt_script.xsl')

# TRANSFORM INPUT TO STRING
transform = et.XSLT(xsl)
result = str(transform(doc))

# SAVE TO FILE
with open('output.csv', 'w') as f:
    f.write(result)
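For comparison, when installing lxml is not an option, the same "walk down to Device and pull in the shared values" idea can be sketched with the standard library's xml.etree.ElementTree. This is only an illustration under stated assumptions: the helper name summary_to_rows and the inlined SAMPLE_XML (a copy of the question's XML) are not part of the answer above, and the column order simply mirrors the XSLT header.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Copy of the question's XML, inlined so the sketch is self-contained.
SAMPLE_XML = """<Summary>
  <Hardware_Info>
    <HardwareType>FlashDrive</HardwareType>
    <ManufacturerDetail>
      <ManufacturerCompany>Company1</ManufacturerCompany>
      <ManufacturerDate>2017-07-20T12:26:04-04:00</ManufacturerDate>
      <ModelCode>4BR6282</ModelCode>
    </ManufacturerDetail>
    <ActivationDate>2017-07-20T12:26:04-04:00</ActivationDate>
  </Hardware_Info>
  <DeviceConnectionInfo>
    <Device><Index>0</Index><Name>Laptop1</Name><Status>Installed</Status></Device>
    <Device><Index>1</Index><Name>Laptop2</Name><Status>Installed</Status></Device>
  </DeviceConnectionInfo>
</Summary>"""

HEADER = ["HardwareType", "ManufacturerCompany", "ManufacturerDate",
          "ActivationDate", "Device.Index", "Name"]


def summary_to_rows(root):
    """Return one row per Device, repeating the shared Hardware_Info values.

    Hypothetical helper: `root` is expected to be the <Summary> element.
    """
    shared = [
        root.findtext("Hardware_Info/HardwareType", default=""),
        root.findtext("Hardware_Info/ManufacturerDetail/ManufacturerCompany", default=""),
        root.findtext("Hardware_Info/ManufacturerDetail/ManufacturerDate", default=""),
        root.findtext("Hardware_Info/ActivationDate", default=""),
    ]
    rows = []
    for device in root.findall("DeviceConnectionInfo/Device"):
        rows.append(shared + [device.findtext("Index", default=""),
                              device.findtext("Name", default="")])
    return rows


# The parsed root is the <Summary> element itself, so paths are relative to it.
root = ET.fromstring(SAMPLE_XML)
rows = summary_to_rows(root)

# Write header and rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(HEADER)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Because the root element returned by the parser is already Summary, a call such as root.findall('Summary') finds nothing; the paths above start from the children of Summary instead.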
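Python (one-line command call to xsltproc)
import subprocess

# Pass the command as an argument list (no shell needed) and raise if xsltproc fails
subprocess.run(['xsltproc', '-o', 'output.csv', 'xslt_script.xsl', 'input.xml'],
               cwd='/path/to/working/directory', check=True)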
Output
# HardwareType,ManufacturerCompany,ManufacturerDate,ActivationDate,Device.Index,Name
# FlashDrive,Company1,2017-07-20T12:26:04-04:00,2017-07-20T12:26:04-04:00,0,Laptop1
# FlashDrive,Company1,2017-07-20T12:26:04-04:00,2017-07-20T12:26:04-04:00,1,Laptop2
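Since the question's goal is validation, the rows extracted from the XML can then be compared against the table, column by column. What follows is a minimal sketch, assuming the table rows are available as dictionaries keyed by column name and that Device.Index identifies a row; the find_mismatches helper and the inlined CSV_TEXT and TABLE_ROWS data are hypothetical stand-ins. With the question's sample data, the second table row lists Company2 while the XML contains only Company1, so that cell is reported.

```python
import csv
import io

# CSV text for the sample XML, inlined here so the sketch is self-contained.
CSV_TEXT = """HardwareType,ManufacturerCompany,ManufacturerDate,ActivationDate,Device.Index,Name
FlashDrive,Company1,2017-07-20T12:26:04-04:00,2017-07-20T12:26:04-04:00,0,Laptop1
FlashDrive,Company1,2017-07-20T12:26:04-04:00,2017-07-20T12:26:04-04:00,1,Laptop2
"""

# The table rows from the question, as dictionaries keyed by column name.
TABLE_ROWS = [
    {"HardwareType": "FlashDrive", "ManufacturerCompany": "Company1",
     "ManufacturerDate": "2017-07-20T12:26:04-04:00",
     "ActivationDate": "2017-07-20T12:26:04-04:00",
     "Device.Index": "0", "Name": "Laptop1"},
    {"HardwareType": "FlashDrive", "ManufacturerCompany": "Company2",
     "ManufacturerDate": "2017-07-20T12:26:04-04:00",
     "ActivationDate": "2017-07-20T12:26:04-04:00",
     "Device.Index": "1", "Name": "Laptop2"},
]


def find_mismatches(xml_rows, table_rows, key="Device.Index"):
    """Return (key value, column, XML value, table value) for each differing cell.

    A table row with no XML counterpart is reported as (key value, None, None, None).
    """
    by_key = {row[key]: row for row in xml_rows}
    mismatches = []
    for expected in table_rows:
        actual = by_key.get(expected[key])
        if actual is None:
            mismatches.append((expected[key], None, None, None))
            continue
        for column, value in expected.items():
            if actual.get(column) != value:
                mismatches.append((expected[key], column, actual.get(column), value))
    return mismatches


# Read the CSV rows back as dictionaries and compare them with the table.
xml_rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
mismatches = find_mismatches(xml_rows, TABLE_ROWS)
for mismatch in mismatches:
    print(mismatch)
```

An empty list means every table value was found in the XML; keying on Device.Index is a design choice for this sample and would need to change if another column identifies rows in the real table.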