Python - Sort XML derived from a CSV by the original CSV field names

Asked: 2015-06-25 05:31:49

Tags: python xml sorting csv

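I generate XML from a CSV file, and I am having trouble making sure the XML elements appear in the same order as the CSV header. The problem is that DictReader does not preserve the column order; however, if I iterate over fieldnames instead, I get 'str' object has no attribute 'items'.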

My CSV file contains the following:

    FieldA,FieldB,FieldC,FieldD
    1,asdf,2,ghjk
    3,qwer,4,yuio
    5,slslkd,,aldkjslkj

My Python script is as follows:

    import gzip
    import csv
    from xml.etree.ElementTree import Element, SubElement, tostring

    csv_file = 'Workbook1.csv.gz'

    class GZipCSVReader:
        def __init__(self, filename):
            self.gzfile = gzip.open(filename)
            self.reader = csv.DictReader(self.gzfile)
            self.fieldnames = self.reader.fieldnames

        def next(self):
            return self.reader.next()

        def close(self):
            self.gzfile.close()

        def __iter__(self):
            return self.reader.__iter__()


    def to_xml(r):
        for row in r.fieldnames:
            element = Element('event') # parent element is required
            children = [] # reset the list with each new row

            # Iterate through key:value pairs for each row and create a sub-element
            for (k, v) in row.items():
                if v:
                    sub = SubElement(element, k) # adds the column header as the sub
                    sub.text = v # adds row value as sub-element text

            # Create a list of sub-elements, minus the parent.
            for child in list(element):
                children.append(tostring(child))
            event_data = ''.join(children) # this creates a string of data to be passed to the server
            print (event_data + '\n')
        r.close()

    if __name__ == '__main__':
        r = GZipCSVReader(csv_file)
        to_xml(r)

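The code above is meant to print each CSV row as XML SubElements. As you may notice, the SubElements do not come out in the same order as the CSV header, and if I try fieldnames I get the error 'str' object has no attribute 'items'. Is there a workaround so that the resulting XML follows the same order as the CSV header?

Thanks.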

1 Answer:

Answer 0 (score: 0)

The mistake is in the loop for row in r.fieldnames: there, row is actually a field name (a string), not a row of data, which is why calling row.items() fails.
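To see where the error comes from, here is a minimal sketch that reproduces it. It assumes Python 3 and uses a small in-memory CSV (the `sample` data is illustrative and stands in for the gzipped file):

```python
import csv
import io

# In-memory stand-in for the gzipped CSV (illustrative data only).
sample = io.StringIO("FieldA,FieldB\n1,asdf\n")
reader = csv.DictReader(sample)

# fieldnames is a list of header strings, in file order.
first = reader.fieldnames[0]
print(type(first).__name__)  # str

# Strings have no .items() method, which is the error reported in the question.
try:
    first.items()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Looping over `fieldnames` therefore yields header names one at a time, never row dictionaries.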

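What you need to do is iterate over r and, for each row in r, iterate over the field names, using fieldname as the key and row[fieldname] as the value to create each sub-element.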

Example function:

    def to_xml(r):
        for row in r:
            element = Element('event') # parent element is required
            children = [] # reset the list with each new row

            # Walk the field names in header order and create a sub-element for each
            for k in r.fieldnames:
                v = row[k]
                if v: # skip empty cells
                    sub = SubElement(element, k) # adds the column header as the sub
                    sub.text = v # adds row value as sub-element text

            # Create a list of sub-elements, minus the parent.
            for child in list(element):
                children.append(tostring(child))
            event_data = ''.join(children) # this creates a string of data to be passed to the server
            print (event_data + '\n')
        r.close()
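The code in the question is written for Python 2 (it calls `reader.next()`). For Python 3, the same idea might look like the sketch below. The name `rows_to_xml` is hypothetical, and the sketch assumes the gzipped file can be read as text with the default encoding; it is one possible adaptation, not the original poster's code.

```python
import csv
import gzip
from xml.etree.ElementTree import Element, SubElement, tostring


def rows_to_xml(path):
    """Yield one XML string per CSV row, with children in header order."""
    # 'rt' opens the gzip stream in text mode, which csv needs on Python 3.
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle)
        for row in reader:
            event = Element("event")  # throwaway parent element
            for name in reader.fieldnames:  # header order
                value = row[name]
                if value:  # skip empty cells
                    SubElement(event, name).text = value
            # encoding="unicode" makes tostring return str instead of bytes.
            yield "".join(tostring(child, encoding="unicode") for child in event)
```

Usage would be along the lines of `for event_data in rows_to_xml('Workbook1.csv.gz'): print(event_data)`. On recent Python versions the rows returned by DictReader already keep the header order (OrderedDict since 3.6, plain dict since 3.8), but iterating over `fieldnames` makes the ordering explicit regardless of version.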