Question

I have an XML file that I want to convert to CSV, but I am getting an error.

I only want selected columns from the XML file to be written to the CSV.

import xml.etree.ElementTree as ET
import pandas as pd

root = ET.parse('D:\\Task\\09_ActionRecorder_0.XML').getroot()

tags =[]
for elem in root:
    for child in elem:
        try:
            tag = {}
            tag["TL"] = child.attrib['TL']
            tag["CN"] = child.attrib['CN']
            tag["DT"] = child.attrib['DT']
            tag["AN"] = child.attrib['AN']
            tags.append(tag)

        except KeyError:
            tags.append(tag)
print(tags)
df_users = pd.DataFrame(tags)
#df_users.head(20)


column_name_update = df_users.rename(columns = {"TL": "Title", 
                                  "CN":"Control Name", 
                                  "DT": "Date Time",
                                  "AN": "Application Name"}) 


#new_data.head(20)

column_name_update.to_csv("D:\\Tasks\\Sample.csv",index=False, columns=["Title", 'Control Name', 'Date Time', 'Application Name'])

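From the given XML file I only want to write a limited set of columns (as shown in the code). But whenever I run the code above I get a KeyError, and only one column is written to the CSV file. Does anyone know how to fix this?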

Answer 1

Iterate over a list of XML files and convert each one to CSV:

import xml.etree.ElementTree as ET

ATTRIBUTES = ['TL', 'CN', 'DT', 'AN']
# TODO populate the list - https://docs.python.org/2/library/os.html#os.listdir
list_of_files = []
for file_name in list_of_files:
    # Start with an empty list for every file so rows do not leak between files
    data = []
    root = ET.parse(file_name)
    # Find every <Rec> element, however deeply it is nested
    recs = root.findall('.//Rec')
    for rec in recs:
        # .get() returns 'N/A' instead of raising KeyError when an attribute is missing
        data.append([rec.attrib.get(attr, 'N/A') for attr in ATTRIBUTES])
    # Write one CSV per input file: <file_name>.csv
    with open('{}.csv'.format(file_name), 'w') as f:
        f.write('Title,Control Name,Date Time,Application Name' + '\n')
        for entry in data:
            f.write(','.join(entry) + '\n')
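
One caveat with joining values on ',' by hand is that a title containing a comma would shift the columns. Below is a minimal sketch of the same idea using the standard csv module, which quotes such values. The sample XML is made up for illustration (the real file's layout is not shown in the question), and the helper names rows_from_xml and to_csv_text are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

ATTRIBUTES = ['TL', 'CN', 'DT', 'AN']
HEADER = ['Title', 'Control Name', 'Date Time', 'Application Name']

# Made-up sample data; the structure of the real file is an assumption.
SAMPLE_XML = """
<Root>
  <Session>
    <Rec TL="Save, then close" CN="btnSave" DT="2020-01-01 10:00:00" AN="Notepad"/>
    <Rec TL="Open" DT="2020-01-01 10:05:00"/>
  </Session>
</Root>
""".strip()


def rows_from_xml(xml_text):
    """Return one row per <Rec>, using 'N/A' for any missing attribute."""
    root = ET.fromstring(xml_text)
    return [[rec.attrib.get(attr, 'N/A') for attr in ATTRIBUTES]
            for rec in root.findall('.//Rec')]


def to_csv_text(rows):
    """Serialize rows with csv.writer so commas inside values get quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()


rows = rows_from_xml(SAMPLE_XML)
csv_text = to_csv_text(rows)
print(csv_text)
```

To write to disk instead, pass a file opened with open(path, 'w', newline='') to csv.writer in place of the StringIO buffer.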

Answer 2

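I ran into what was probably a similar problem a few months ago and ended up just using Excel to save the file as CSV, though I realize that may not be practical in your case. I suggest having your Python file first call a bash script that converts the files to CSV (this can also be done in PowerShell), and then iterating over the resulting CSV files.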

This is how to create the bash script

This is how you can run the script from your python file
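
As a rough illustration of the second step, here is one way an external script could be launched from Python with the standard subprocess module. The conversion script itself is not shown in this answer, so the sketch runs the Python interpreter as a stand-in command; the helper name run_script and the script name in the comment are hypothetical.

```python
import subprocess
import sys


def run_script(command):
    """Run an external command and return its standard output as text.

    check=True raises CalledProcessError if the command exits with an error.
    """
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout


# Stand-in command so the sketch runs anywhere. For a real conversion script
# this might look like ["bash", "convert.sh", "input.xml"] (hypothetical names).
output = run_script([sys.executable, "-c", "print('converted')"])
print(output.strip())
```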

Hope this helps.

How to convert an XML file to a CSV file using Python
