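I wrote this code to create a .csv report from an .xml file, but when I open the resulting .csv it is blank. Feel free to tear my code apart; by the way, I'm new to this and want to learn!
The XML contains multiple "SubjectKeys", but only some of them have an "AuditRecord". I only want to extract the data that has an audit record, and for those I want to pull the information from "SubjectData", "FormData", and "AuditRecord".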
import csv
import xml.etree.cElementTree as ET
tree = ET.parse("response.xml")
root = tree.getroot()
xml_data_to_csv =open("query.csv", 'w')
AuditRecord_head = []
SubjectData_head = []
FormData_head = []
csvwriter=csv.writer(xml_data_to_csv)
count=0
for member in root.findall("AuditRecord"):
    AuditRecord = []
    Subjectdata = []
    FormData = []
    if count == 0:
        Subject = member.find("SubjectKey").tag
        Subjectdata_head.append(Subject)
        Form = member.find("p1Name").tag
        FormData_head.append(Form)
        Action = member.find("Action").tag
        AuditRecord_head.append(Action)
        csvwriter.writerow(Auditrecord_head)
        count = count + 1
    Subject = member.find('SubjectKey').text
    Subjectdata.append(Subject)
    Form = member.find('p1Name').text
    FormData.append(Form)
    Action = member.find("Action").text
    AuditRecord.append(Action)
    csvwriter.writerow(Subjectdata)
xml_data_to_csv.close()
I expect the output to be a table with the column headers: Subject, Form, Action.
Here is a sample .xml:
</ClinicalData>
<ClinicalData StudyOID="SMK-869-002" MetaDataVersionOID="2.0">
<SubjectData SubjectKey="865-015">
</AuditRecord>
</FormData>
<FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="0"/>
<FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="1">
<p1:QueryAction InitialComment="Please enter start date for condition" UserType="User" UserOID="bailey@protocolfirst.com" Action="query" DateTimeStamp="2019-07-12T14:08:43.893Z"/>
</AuditRecord>
Answer 0 (score: 0)
First, your XML file has a lot of errors; as I see it, it needs to look like this:
<?xml version="1.0"?>
<root xmlns:p1="http://some-url.com">
<ClinicalData StudyOID="SMK-869-002" MetaDataVersionOID="2.0"></ClinicalData>
<SubjectData SubjectKey="865-015"></SubjectData>
<AuditRecord>
<FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="0"/>
<FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="1"/>
<p1:QueryAction InitialComment="Please enter start date for condition" UserType="User" UserOID="bailey@protocolfirst.com" Action="query" DateTimeStamp="2019-07-12T14:08:43.893Z"/>
</AuditRecord>
</root>
ElementTree always expects exactly one root node and a well-formed document.
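To illustrate that point, here is a small sketch (not part of the original answer) that hands ElementTree a few fragments and reports whether each one parses. The fragment strings and the `is_well_formed` helper are made up for this demonstration; only `ET.fromstring` and `ET.ParseError` come from the standard library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if ElementTree can parse the text, False on a ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Two top-level elements: there is no single root node.
two_roots = '<SubjectData SubjectKey="865-015"/><AuditRecord/>'
# The same elements wrapped in one root element.
one_root = "<root>" + two_roots + "</root>"
# A closing tag that appears before any opening tag.
stray_close = '</ClinicalData><ClinicalData StudyOID="SMK-869-002"/>'
# The p1 prefix is used without an xmlns:p1 declaration.
unbound_prefix = '<FormData p1:Name="Medical History"/>'

for label, text in [
    ("two roots", two_roots),
    ("one root", one_root),
    ("stray closing tag", stray_close),
    ("unbound prefix", unbound_prefix),
]:
    print(label, "->", "ok" if is_well_formed(text) else "ParseError")
```

Only the variant wrapped in a single root parses. That is also why the corrected file above declares `xmlns:p1` on the root element.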
I don't fully understand what you are trying to do, but I hope this helps:
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

tree = ET.parse("response.xml")
root = tree.getroot()
xml_data_to_csv = open("query.csv", 'w')
count = 0
for member in root.findall("AuditRecord"):
    AuditRecord = []
    Subjectdata = []
    FormData = []
    if count == 0:
        # SubjectKey is an attribute of <SubjectData>, not a child element
        Subjectdata.append(root.find('./SubjectData').attrib['SubjectKey'])
        # Namespaced attributes are keyed as '{namespace-uri}local-name'
        for formData in root.findall('./AuditRecord/FormData'):
            FormData.append(formData.attrib['{http://some-url.com}Name'])
        AuditRecord.append(root.find('./AuditRecord/{http://some-url.com}QueryAction').attrib['Action'])
        # Write one line: subject, both form names, action
        xml_data_to_csv.write(Subjectdata[0] + "," + FormData[0] + "," + FormData[1] + "," + AuditRecord[0])
        count = count + 1
xml_data_to_csv.close()
This produces a csv file with the following content:
865-015,Medical History,Medical History,query
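If you want the Subject, Form, Action header from the question and one row per form, a variation using the `csv` module might look like the sketch below. It assumes the corrected XML layout shown earlier, the placeholder namespace `http://some-url.com`, and that every `FormData` inside an `AuditRecord` shares that record's `QueryAction`. The names `audit_rows` and `write_csv` are hypothetical, and the XML is inlined only so the example runs on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption); replace it with the one in your file.
NS = {"p1": "http://some-url.com"}
P1_NAME = "{http://some-url.com}Name"

SAMPLE = """<root xmlns:p1="http://some-url.com">
  <ClinicalData StudyOID="SMK-869-002" MetaDataVersionOID="2.0"></ClinicalData>
  <SubjectData SubjectKey="865-015"></SubjectData>
  <AuditRecord>
    <FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="0"/>
    <FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="1"/>
    <p1:QueryAction UserType="User" Action="query" DateTimeStamp="2019-07-12T14:08:43.893Z"/>
  </AuditRecord>
</root>"""


def audit_rows(root):
    """Yield (subject, form, action) for every FormData inside an AuditRecord."""
    subject = root.find("SubjectData").get("SubjectKey")
    for record in root.findall("AuditRecord"):
        query = record.find("p1:QueryAction", NS)
        if query is None:
            continue  # skip audit records that carry no query action
        for form in record.findall("FormData"):
            yield subject, form.get(P1_NAME), query.get("Action")


def write_csv(root, out):
    """Write a header row followed by one row per form to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["Subject", "Form", "Action"])
    writer.writerows(audit_rows(root))


root = ET.fromstring(SAMPLE)
buffer = io.StringIO()
write_csv(root, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, parse it with `ET.parse("response.xml").getroot()` and pass `open("query.csv", "w", newline="")` to `write_csv`, ideally inside a `with` block so the file is closed even if parsing fails.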