Python: changing the comma separation in a CSV

Time: 2017-12-19 01:21:16

Tags: python csv

Newbie using Python (2.7.9). When I export a gzip-compressed file to CSV with the following code:

import csv
import gzip

myData = gzip.open('file.gz.DONE', 'rb')
myFile = open('output.csv', 'wb')
with myFile:
    writer = csv.writer(myFile)
    writer.writerows(myData)
print("Writing complete")

the CSV it writes has a comma between every single character. For example:

S,V,R,","2,1,4,0,",",2,0,1,6,1,1,3,8,0,4,",",5,0,5,0,1,3,4,2,0,6,4,7,3,6,4,",",",",2,0,0,0,5,6,5,9,2,9,6,7,4,",",2,0,0,7,2,4,5,2,3,5,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,2,1,4,4,9,3,7,0,",":,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
"
S,V,R,",",4,7,3,3,5,5,",",2,0,5,7,",",5,0,5,0,1,4,5,0,1,6,4,8,6,3,7,",",",",2,0,0,0,5,5,3,9,2,9,2,8,0,",",2,0,4,4,1,0,8,3,7,8,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,4,7,3,3,5,4,5,5,",",,:,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","

How can I get rid of these commas so that the export has the correct fields? For example:

  

SVR,2144370,20161804,50501342364,565929674,2007245235,0002,1,PPDAP,PPLUS,DEACTIVE ,,, EN,N / A,214370,:IR_,N / A ,,,,,
SVR,473455,208082557,14501648637,2000553929280,2044108378,0002,1,3G,CODAP,INACTIVE ,,, EN,N / A,35455,:IR_,N / A ,,,,,

3 answers:

Answer 0: (score: 0)

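You are only opening the gzip file. I think you expect the opened file to behave like an iterator automatically. It does, but each item it yields is one line as a single string. The writer expects an iterable in which each item is a sequence of values to be written separated by commas. Since each item you give it is a string, and a string is a sequence of characters, every character becomes its own field, which is exactly the result you are seeing.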
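The effect is easy to reproduce in isolation. The following is only a minimal sketch, assuming Python 3 and an in-memory buffer (`io.StringIO`) in place of a real file; the helper `rows_to_csv` and the sample line are made up for illustration and are not taken from the actual input.

```python
import csv
import io


def rows_to_csv(rows):
    """Write rows with csv.writer and return the resulting text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample line, not taken from the real input file.
line = "SVR,21,0002"

# A bare string is iterated character by character, so every
# character becomes its own field and each comma ends up quoted.
as_string = rows_to_csv([line])

# Splitting the line first yields one field per value.
as_fields = rows_to_csv([line.split(",")])

print(as_string)
print(as_fields)
```

The first result has the same shape as the broken output in the question, the second has one field per value.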

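Since you did not say what the lines of the gzip data actually contain, I cannot guess how to parse a line into a sensible list of values. But assuming a function named 'split_line' that is suitable for that data, you could do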

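import csv
import gzip

# split_line is a placeholder for a parser suited to your data;
# 'wb' is the mode the csv module expects on Python 2
with gzip.open('file.gz.DONE', 'rb') as gzip_f:
  data = [split_line(l) for l in gzip_f]
  with open('output.csv', 'wb') as myFile:
    writer = csv.writer(myFile)
    writer.writerows(data)
    print("Writing complete")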

Of course, at this point it makes sense to process the data line by line and to combine the two with statements.

See https://docs.python.org/2/library/csv.html
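A line-by-line version might look roughly like this. It is a sketch under assumptions, not a drop-in solution: it targets Python 3 (text mode `'rt'` for `gzip.open`), the function name `gzip_lines_to_csv` is invented here, and `split_line` is a hypothetical parser that assumes plain comma-separated fields without quoting.

```python
import csv
import gzip


def split_line(line):
    """Hypothetical parser: assumes plain comma-separated fields, no quoting."""
    return line.rstrip("\r\n").split(",")


def gzip_lines_to_csv(src_path, dst_path):
    """Stream a gzip text file into a CSV file one row at a time."""
    # A single with statement manages both files. Rows are written as
    # they are read, so the whole input never has to sit in memory.
    with gzip.open(src_path, "rt", newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            writer.writerow(split_line(line))
```

It would be called as `gzip_lines_to_csv('file.gz.DONE', 'output.csv')`.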

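Answer 1: (score: 0)

I think this happens simply because gzip.open() gives you a file-like object, whereas csvwriter.writerows() needs a list of lists of strings to do its job.

But I don't understand why you want to use the csv module at all. It seems you only want to extract the contents of the gzip file and save them in an uncompressed output file. You can do that like this: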

import gzip

input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'

# text mode ('rt') for gzip.open requires Python 3.3 or later
with gzip.open(input_file_name, 'rt') as input_file:
    with open(output_file_name, 'wt') as output_file:
        for line in input_file:
            output_file.write(line)

print("Writing complete")

If you want to use the csv module because you are not sure your input data is well-formed (and you want an error message right away), you can do this:

import gzip
import csv

input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'

# newline='' lets the csv module handle line endings itself (Python 3)
with gzip.open(input_file_name, 'rt', newline='') as input_file:
    reader_csv = csv.reader(input_file)
    with open(output_file_name, 'wt', newline='') as output_file:
        writer_csv = csv.writer(output_file)
        writer_csv.writerows(reader_csv)

print("Writing complete")
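Note that csv.reader is fairly lenient and will accept rows of differing widths, so if "well-formed" is meant to include "every row has the same number of fields", an explicit check is needed. The following is just one possible sketch, assuming Python 3; the function name `find_ragged_rows` and the sample data are invented for illustration.

```python
import csv
import io


def find_ragged_rows(lines):
    """Return (line_number, field_count) for rows whose width differs from the first row."""
    reader = csv.reader(lines)
    expected = None
    ragged = []
    for row in reader:
        if expected is None:
            # The first row defines the expected number of fields.
            expected = len(row)
        elif len(row) != expected:
            ragged.append((reader.line_num, len(row)))
    return ragged


# Made-up sample: the third line is one field short.
sample = io.StringIO("a,b,c\n1,2,3\n4,5\n")
problems = find_ragged_rows(sample)
print(problems)
```

The same function could be given the file object returned by gzip.open(..., 'rt', newline='') instead of the in-memory sample.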

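Is this what you are trying to do? It is hard to guess, because we don't have the input file to look at.

If this is not what you want, would you mind clarifying what you do want?

Answer 2: (score: 0)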

Since I now know that the gzip file itself already contains comma-separated values, this simplifies to..

import gzip

with gzip.open('file.gz.DONE', 'rb') as gzip_f, open('output.csv', 'wb') as myFile:
  myFile.write(gzip_f.read())

In other words, it is just a roundabout way of gunzipping into another file.
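For large inputs, reading the whole file with read() may use a lot of memory. One alternative, sketched here with a made-up function name `gunzip_to`, is to let shutil.copyfileobj copy the decompressed bytes in chunks:

```python
import gzip
import shutil


def gunzip_to(src_path, dst_path):
    """Decompress src_path into dst_path without loading it all into memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes in fixed-size chunks.
        shutil.copyfileobj(src, dst)
```

It would be called as `gunzip_to('file.gz.DONE', 'output.csv')`.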