Python script to convert CSV to XML

Time: 2018-06-19 10:51:02

Tags: python

Please help me correct this Python script so that it produces the desired output.

I wrote the code below to convert CSV to XML. The input file has columns 1 through 278. The output file needs tags named A1 through A278.

Code:

#!/usr/bin/python
import sys
import os
import csv
if len(sys.argv) != 2:
    os._exit(1)
path=sys.argv[1] # get folder as a command line argument
os.chdir(path)
csvFiles = [f for f in os.listdir('.') if f.endswith('.csv') or f.endswith('.CSV')]
for csvFile in csvFiles:
    xmlFile = csvFile[:-4] + '.xml'
    csvData = csv.reader(open(csvFile))
    xmlData = open(xmlFile, 'w')
    xmlData.write('<?xml version="1.0"?>' + "\n")
    # there must be only one top-level tag
    xmlData.write('<TariffRecords>' + "\n")
    rowNum = 0
    for row in csvData:
        if rowNum == 0:
            tags = Tariff
            # replace spaces w/ underscores in tag names
            for i in range(len(tags)):
                tags[i] = tags[i].replace(' ', '_')
        else:
            xmlData.write('<Tariff>' + "\n")
            for i in range(len(tags)):
                xmlData.write('    ' + '<' + tags[i] + '>' \
                              + row[i] + '</' + tags[i] + '>' + "\n")
            xmlData.write('</Tariff>' + "\n")
        rowNum +=1
    xmlData.write('</TariffRecords>' + "\n")
    xmlData.close()

The script fails with the following error:

Traceback (most recent call last):
  File "ctox.py", line 20, in ?
    tags = Tariff
NameError: name 'Tariff' is not defined

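Sample input file. (These are sample records from the actual input file, which will contain 278 columns.) If the input file has two or three records, they all need to be appended to the same XML file.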

name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,

Sample output file for the two tariff records above. Tariff will be hard-coded at the beginning and end of the XML file.

<TariffRecords>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 2p/s</A2>
<A3>TT07PMPV0188</A3>
<A4>Ta Te</A4>
<A5>Gu</A5>
<A6></A6>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 3p/s</A2>
<A3>TT07PMPV0189</A3>
<A4>Ta Te</A4>
<A5>HR</A5>
<A6></A6>
</Tariff>
</TariffRecords>

2 answers:

Answer 0: (score: 0)

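First, replace tags = Tariff with tags = row. Tariff is never defined as a name, which is what raises the NameError; on the first pass through the loop, row holds the CSV header.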

Second, change the write call so that it emits A1, A2, and so on instead of the header names.

Complete code:

import sys
import os
import csv
if len(sys.argv) != 2:
    os._exit(1)
path=sys.argv[1] # get folder as a command line argument
os.chdir(path)
csvFiles = [f for f in os.listdir('.') if f.endswith('.csv') or f.endswith('.CSV')]
for csvFile in csvFiles:
    xmlFile = csvFile[:-4] + '.xml'
    csvData = csv.reader(open(csvFile))
    xmlData = open(xmlFile, 'w')
    xmlData.write('<?xml version="1.0"?>' + "\n")
    # there must be only one top-level tag
    xmlData.write('<TariffRecords>' + "\n")
    rowNum = 0
    for row in csvData:
        if rowNum == 0:
            tags = row
            # replace spaces w/ underscores in tag names
            for i in range(len(tags)):
                tags[i] = tags[i].replace(' ', '_')
        else:
            xmlData.write('<Tariff>' + "\n")
            for i in range(len(tags)):
                # positional tag names A1, A2, ... instead of the header text
                tag = 'A%s' % (i + 1)
                xmlData.write('    ' + '<' + tag + '>' \
                              + row[i] + '</' + tag + '>' + "\n")
            xmlData.write('</Tariff>' + "\n")
        rowNum +=1
    xmlData.write('</TariffRecords>' + "\n")
    xmlData.close()

Output:

<?xml version="1.0"?>
<TariffRecords>
<Tariff>
    <A1>Prepaid Plan Voucher</A1>
    <A2>test_All calls 2p/s</A2>
    <A3>TT07PMPV0188</A3>
    <A4>Ta Te</A4>
    <A5>Gu</A5>
    <A6></A6>
</Tariff>
<Tariff>
    <A1>Prepaid Plan Voucher</A1>
    <A2>test_All calls 3p/s</A2>
    <A3>TT07PMPV0189</A3>
    <A4>Ta Te</A4>
    <A5>HR</A5>
    <A6></A6>
</Tariff>
</TariffRecords>
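
One caveat with writing the tags by hand: cell values go into the file unescaped, so a value containing & or < would produce malformed XML. A minimal sketch of the same conversion using the standard library's xml.etree.ElementTree, which escapes text automatically, is shown below. It assumes Python 3; the function name rows_to_tariff_xml and the padding of short rows with empty text are illustrative choices, not part of the original script.

```python
import csv
import io
from xml.etree import ElementTree as ET


def rows_to_tariff_xml(csv_text):
    """Build a <TariffRecords> document from CSV text, tagging cells A1..An.

    Hypothetical helper for illustration; not part of the original script.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)  # the header row only fixes the column count
    root = ET.Element("TariffRecords")
    if header is None:
        return ET.tostring(root, encoding="unicode", short_empty_elements=False)
    for row in reader:
        if not row:
            continue  # skip blank lines
        tariff = ET.SubElement(root, "Tariff")
        for position in range(len(header)):
            cell = ET.SubElement(tariff, "A%d" % (position + 1))
            # assumption: pad short rows with empty text instead of failing
            cell.text = row[position] if position < len(row) else ""
    # short_empty_elements=False keeps empty cells as <A6></A6>
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


if __name__ == "__main__":
    sample = (
        "name,Tariff Summary,Record ID No.\n"
        "Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188\n"
    )
    print(rows_to_tariff_xml(sample))
```

The function returns the document as a string, so the per-file loop from the script above could write that string to the .xml file next to each .csv file.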

Answer 1: (score: -1)

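import pandas as pd
from xml.etree import ElementTree as xml

# read every column as text and keep empty cells as "" instead of NaN
df = pd.read_csv("file_path", dtype=str, keep_default_na=False)
root = xml.Element("TariffRecords")
for record in df.values:
  # one <Tariff> element per CSV record
  tariff = xml.SubElement(root, "Tariff")
  for index, data in enumerate(record, start=1):
    # positional tags A1, A2, ... instead of the header names
    cell = xml.SubElement(tariff, "A" + str(index))
    cell.text = data
# "file_path.xml" is a placeholder output name
xml.ElementTree(root).write("file_path.xml", encoding="utf-8", xml_declaration=True)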