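Transposing a CSV with rows of different lengths in Python

Asked: 2015-07-21 05:08:35

Tags: python csv transpose

I have a number of CSV files whose rows vary in length. For example, this input: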

Time,0,8,18,46,132,163,224,238,267,303
X,0,14,14,14,15,16,17,15,15,15
Time,0,4,13,22,32,41,50,59,69,78,87,97,106,115,125,127,137,146,155,165,174,183,192,202,211,220,230,239,248,258,267,277,289,298,308
Y,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Time,0,4,13,22,32,41,50,59,69,78,87,97,106,115,125,127,137,146,155,165,174,183,192,202,211,220,230,239,248,258,267,277,289,298,308
Z,0,1,2,1,1,1,1,1,1,2,2,1,0,1,1,2,2,2,2,2,1,1,2,2,2,1,1,1,1,1,2,2,2,2,2
Time,0,308
W,0,0

becomes:

Time,X,Time,Y,Time,Z,Time,W
0,0,0,0,0,0,0,0
8,14,4,0,4,1,308,0

Most of the data is lost: only the first two values of each row survive.

I want to transpose this CSV in Python. I have the following program:

import csv
import os
from itertools import izip
import sys

try:
    filename = sys.argv[1]
except IndexError:
    print 'Please add a filename'
    exit(-1)
with open(os.path.splitext(filename)[0] + '_t.csv', 'wb') as outfile, open(filename, 'rb') as infile:
    a = izip(*csv.reader(infile))
    csv.writer(outfile).writerows(a)

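However, it seems to drop a large amount of data: the file shrinks from 20 KB to 6 KB, and only as many values as the shortest row contains are kept.

How can I transpose the file without discarding any data?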

2 Answers:

Answer 0 (score: 1)

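izip stops at the shortest of its input iterables, so from every row you only get as many values as the shortest row contains.

You should use izip_longest instead. It keeps going until the longest iterable is exhausted and puts None wherever a shorter row has no value; csv.writer writes None as an empty field.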
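The truncation is easy to reproduce on two short lists. This is a minimal sketch that assumes Python 3, where the built-in zip behaves like Python 2's izip and itertools.zip_longest replaces izip_longest; the sample rows are made up for illustration and are not taken from the real file.

```python
from itertools import zip_longest

# Two illustrative rows of different lengths (hypothetical sample data).
rows = [
    ["Time", "0", "8", "18"],
    ["W", "0", "0"],
]

# zip stops at the shortest row, so the trailing "18" is dropped.
truncated = list(zip(*rows))

# zip_longest runs to the longest row and pads the gaps with None.
padded = list(zip_longest(*rows))

print(truncated)  # 3 tuples: the 4th column of the first row is gone
print(padded)     # 4 tuples: the last one is ("18", None)
```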

Example:

import csv
import os
from itertools import izip_longest
import sys

try:
    filename = sys.argv[1]
except IndexError:
    print 'Please add a filename'
    exit(-1)
with open(os.path.splitext(filename)[0] + '_t.csv', 'wb') as outfile, open(filename, 'rb') as infile:
    a = izip_longest(*csv.reader(infile))
    csv.writer(outfile).writerows(a)

This is the result I got from it:

Time,X,Time,Y,Time,Z,Time,W
0,0,0,0,0,0,0,0
8,14,4,0,4,1,308,0
18,14,13,1,13,2,,
46,14,22,1,22,1,,
132,15,32,1,32,1,,
163,16,41,1,41,1,,
224,17,50,1,50,1,,
238,15,59,1,59,1,,
267,15,69,1,69,1,,
303,15,78,1,78,2,,
,,87,1,87,2,,
,,97,1,97,1,,
,,106,1,106,0,,
,,115,1,115,1,,
,,125,1,125,1,,
,,127,1,127,2,,
,,137,1,137,2,,
,,146,1,146,2,,
,,155,1,155,2,,
,,165,1,165,2,,
,,174,1,174,1,,
,,183,1,183,1,,
,,192,1,192,2,,
,,202,1,202,2,,
,,211,1,211,2,,
,,220,1,220,1,,
,,230,1,230,1,,
,,239,1,239,1,,
,,248,1,248,1,,
,,258,1,258,1,,
,,267,1,267,2,,
,,277,1,277,2,,
,,289,1,289,2,,
,,298,1,298,2,,
,,308,1,308,2,,
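The program above is written for Python 2. For reference, here is a rough Python 3 equivalent, sketched under a few assumptions: the helper name transpose_csv is hypothetical, the data is a shortened version of the rows in the question, and in-memory buffers stand in for real files so the snippet runs on its own.

```python
import csv
import io
from itertools import zip_longest


def transpose_csv(infile, outfile):
    """Write the transpose of the CSV read from infile to outfile.

    Short rows are padded with None, which csv.writer emits as an
    empty field. (Hypothetical helper, not part of the original answer.)
    """
    rows = csv.reader(infile)
    # lineterminator="\n" is chosen only to keep the demo output simple;
    # the csv module's default is "\r\n".
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerows(zip_longest(*rows))


# In-memory demonstration with a shortened version of the question's data.
source = io.StringIO("Time,0,8,18\nX,0,14,14\nTime,0,308\nW,0,0\n")
result = io.StringIO()
transpose_csv(source, result)
print(result.getvalue())
```

With real files in Python 3, the csv documentation recommends opening both the input and the output with newline='' rather than the 'rb' and 'wb' modes used in Python 2.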

Answer 1 (score: 0)

Here is an approach that does not use itertools.izip:

import csv

with open('transpose.csv') as infile, \
        open('out.csv', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    while True:
        try:
            index = next(reader)
            data = next(reader)
        except StopIteration:
            break
        writer.writerows(zip(index, data))

With your input, this snippet produces the following out.csv:

Time,X
568,0
573,0
577,1
581,1
585,0
590,2
594,0
599,0
603,0
Time,Y
590,0
594,3
599,3
03,0
Time,Z
599,0
603,1

Is this what you want?

Update

This modified example should match your updated question:

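import csv
from itertools import zip_longest  # izip_longest in Python 2

# Python 3: newline='' lets the csv module handle line endings itself.
# (In Python 2, open the files in 'rb' / 'wb' mode instead.)
with open('transpose.csv', newline='') as infile, \
        open('out.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)

    # Pad short rows with fillvalue so that no column is truncated.
    writer.writerows(zip_longest(*reader, fillvalue=0))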

Change fillvalue to whatever you want to use in place of the missing values.
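To see how the choice of fillvalue affects the written file, here is a small sketch. It assumes Python 3; the helper transposed_text and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from itertools import zip_longest

# Illustrative rows of unequal length (hypothetical sample data).
rows = [["Time", "0", "308", "400"], ["W", "0", "0"]]


def transposed_text(fillvalue):
    """Return the transposed CSV as a string, padding gaps with fillvalue."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(zip_longest(*rows, fillvalue=fillvalue))
    return buffer.getvalue()


print(transposed_text(0))      # last line is "400,0"
print(transposed_text(""))     # last line is "400," (gap left empty)
print(transposed_text("NaN"))  # last line is "400,NaN"
```

One thing to weigh: with fillvalue=0 a padded cell looks exactly like a real measurement of 0, so an empty string or an explicit marker may be safer if the output is analysed later.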