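I want to parse an XML file and write certain parts of it to a CSV file. I will do this with Python. I am new to both programming and XML. I have read a lot, but could not find a useful example for this problem.
My XML file looks like this: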
<Host name="1.1.1.1">
  <Properties>
    <tag name="id">1</tag>
    <tag name="os">windows</tag>
    <tag name="ip">1.11.111.1</tag>
  </Properties>
  <Report id="123">
    <output>
      Host is configured to get updates from another server.
      Update status:
      last detected: 2015-12-02 18:48:28
      last downloaded: 2015-11-17 12:34:22
      last installed: 2015-11-23 01:05:32
      Automatic settings:.....
    </output>
  </Report>
  <Report id="123">
    <output>
      Host is configured to get updates from another server.
      Environment Options:
      Automatic settings:.....
    </output>
  </Report>
</Host>
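My XML file contains 500 entries! I only want to parse the XML blocks whose output contains "Update status", because I want to write the 3 dates (last detected, last downloaded and last installed) to a CSV file. I also want to add the id, os and ip.
I tried this with the ElementTree library, but I am not able to filter on the element.text where the output contains "Update status". At the moment I can extract all text and attributes from the whole file, but I cannot filter for the blocks whose output contains "Update status", "last detected", "last downloaded" or "last installed".
Can anyone give some advice on how to achieve this?
Desired output: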
id:1
os:windows
ip:1.11.111.1
last detected: 2015-12-02 18:48:28
last downloaded: 2015-11-17 12:34:22
last installed:2015-11-23 01:05:32
All of this information should be written in .csv file format.
At the moment my code looks like this:
#!/usr/bin/env python
import xml.etree.ElementTree as ET
import csv

tree = ET.parse("file.xml")
root = tree.getroot()

# open csv file for writing
data = open('test.csv', 'w')
# create csv writer object
csvwriter = csv.writer(data)

# filter xml file
for tag in root.findall("./Host/Properties/tag[@name='ip']"):
    print(tag.text)  # gives all ip's from whole xml
for output in root.iter('output'):
    print(output.text)  # gives all outputs from whole xml

data.close()
Best regards
Answer 0 (score: 0)
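This is relatively simple when you start at the <Host> element and work your way down.
Iterate over all nodes, but only produce output when the substring "Update status:" occurs in the text of <output>: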
import xml.etree.ElementTree as ET

tree = ET.parse("file.xml")  # same file as in the question

for host in tree.iter("Host"):
    # look up the three properties once per host
    host_id = host.find('./Properties/tag[@name="id"]')
    host_os = host.find('./Properties/tag[@name="os"]')
    host_ip = host.find('./Properties/tag[@name="ip"]')
    for output in host.iter("output"):
        # skip empty outputs and outputs without an update status
        if output.text is not None and "Update status:" in output.text:
            print("id:" + host_id.text)
            print("os:" + host_os.text)
            print("ip:" + host_ip.text)
            for line in output.text.splitlines():
                if ("last detected:" in line or
                        "last downloaded" in line or
                        "last installed" in line):
                    print(line.strip())
For your sample XML this prints:
id:1
os:windows
ip:1.11.111.1
last detected: 2015-12-02 18:48:28
last downloaded: 2015-11-17 12:34:22
last installed: 2015-11-23 01:05:32
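If the dates are needed as values rather than as printed lines, each matching line can be split at its first colon. The following is only a rough sketch: the function name parse_update_status is made up for illustration, and it assumes every timestamp is formatted like the ones in the sample output.

```python
from datetime import datetime

# Labels taken from the sample <output> text in the question.
LABELS = ("last detected", "last downloaded", "last installed")


def parse_update_status(text):
    """Return a dict mapping each label found in `text` to a datetime.

    Hypothetical helper, not part of ElementTree.
    """
    found = {}
    for line in text.splitlines():
        # Split at the first colon only, so the time part stays intact.
        label, sep, value = line.strip().partition(":")
        if sep and label in LABELS:
            # Assumption: timestamps look like 2015-12-02 18:48:28.
            found[label] = datetime.strptime(value.strip(), "%Y-%m-%d %H:%M:%S")
    return found


sample = """
Host is configured to get updates from another server.
Update status:
last detected: 2015-12-02 18:48:28
last downloaded: 2015-11-17 12:34:22
last installed: 2015-11-23 01:05:32
Automatic settings:.....
"""

result = parse_update_status(sample)
for label in LABELS:
    print(label, "->", result[label])
```

Lines such as "Update status:" are ignored because their label is not in LABELS. A timestamp in a different format would raise a ValueError here, which may or may not be what you want.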
Minor point: this is not really CSV, so writing it to a *.csv file as-is would not be very clean.
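To get an actual CSV file, one option is to collect one row per matching <output> and hand the rows to csv.DictWriter. This is a minimal sketch, not a definitive solution: the names FIELDS, DATE_LABELS, extract_rows and write_csv are made up for illustration, and the column layout (one column per property and per date) is an assumption about what the CSV should look like.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed column layout: three host properties, then the three dates.
DATE_LABELS = ("last detected", "last downloaded", "last installed")
FIELDS = ["id", "os", "ip", *DATE_LABELS]


def extract_rows(root):
    """Yield one dict per <output> that contains "Update status:"."""
    for host in root.iter("Host"):
        # Collect every <tag name="..."> under <Properties> into a dict.
        props = {t.get("name"): t.text for t in host.findall("./Properties/tag")}
        for output in host.iter("output"):
            text = output.text or ""
            if "Update status:" not in text:
                continue
            row = {key: props.get(key, "") for key in ("id", "os", "ip")}
            for line in text.splitlines():
                label, sep, value = line.strip().partition(":")
                if sep and label in DATE_LABELS:
                    row[label] = value.strip()
            yield row


def write_csv(root, stream):
    """Write a header line plus one line per extracted row to `stream`."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(extract_rows(root))


# Shortened copy of the sample XML from the question.
SAMPLE = """<Host name="1.1.1.1">
  <Properties>
    <tag name="id">1</tag>
    <tag name="os">windows</tag>
    <tag name="ip">1.11.111.1</tag>
  </Properties>
  <Report id="123"><output>
    Update status:
    last detected: 2015-12-02 18:48:28
    last downloaded: 2015-11-17 12:34:22
    last installed: 2015-11-23 01:05:32
  </output></Report>
  <Report id="123"><output>
    Environment Options:
  </output></Report>
</Host>"""

root = ET.fromstring(SAMPLE)
buffer = io.StringIO()  # stands in for a real file in this sketch
write_csv(root, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, replace the StringIO buffer with open("test.csv", "w", newline="") and pass ET.parse("file.xml").getroot() as the root. The second <Report> produces no row because its output lacks "Update status:".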