Newbie using Python (2.7.9). When I export a gzip-compressed file to CSV with the following code:
import csv
import gzip

myData = gzip.open('file.gz.DONE', 'rb')
myFile = open('output.csv', 'wb')
with myFile:
    writer = csv.writer(myFile)
    writer.writerows(myData)
print("Writing complete")
the CSV comes out with a comma between every single character. For example:
S,V,R,","2,1,4,0,",",2,0,1,6,1,1,3,8,0,4,",",5,0,5,0,1,3,4,2,0,6,4,7,3,6,4,",",",",2,0,0,0,5,6,5,9,2,9,6,7,4,",",2,0,0,7,2,4,5,2,3,5,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,2,1,4,4,9,3,7,0,",":,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
"
S,V,R,",",4,7,3,3,5,5,",",2,0,5,7,",",5,0,5,0,1,4,5,0,1,6,4,8,6,3,7,",",",",2,0,0,0,5,5,3,9,2,9,2,8,0,",",2,0,4,4,1,0,8,3,7,8,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,4,7,3,3,5,4,5,5,",",,:,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
How can I get rid of the extra commas so the file is exported with the correct fields? For example:
SVR,2144370,20161804,50501342364,565929674,2007245235,0002,1,PPDAP,PPLUS,DEACTIVE ,,, EN,N / A,214370,:IR_,N / A ,,,,, SVR,473455,208082557,14501648637,2000553929280,2044108378,0002,1,3G,CODAP,INACTIVE ,,, EN,N / A,35455,:IR_,N / A ,,,,,
Answer 0 (score: 0)
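You are only opening the gzip file. I think you expected the opened file to behave like an iterator automatically. It does, but each item it yields is one line of text, that is, a string. The writer expects an iterable in which each item is a sequence of values to be written out separated by commas. So when each item is a string, and a string is itself a sequence of characters, you get exactly the result you found: one field per character.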
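To make the mechanism concrete, here is a minimal sketch. It is written for Python 3 and uses an in-memory buffer instead of a real file; the helper name `rows_to_csv` and the sample line are made up for illustration. It shows that passing plain strings to `writerows` yields one field per character, while passing lists of fields yields the intended row.

```python
import csv
import io


def rows_to_csv(rows):
    """Write rows with csv.writer into a string and return it."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# One sample line, as a file iterator would yield it (newline stripped here).
lines = ["SVR,214"]

# Each string is treated as a row, so every character becomes a field.
char_split = rows_to_csv(lines)

# Splitting each line first gives the writer a list of fields per row.
field_split = rows_to_csv([line.split(",") for line in lines])

print(char_split)   # S,V,R,",",2,1,4
print(field_split)  # SVR,214
```

The quoted `","` in the first output is the original comma, written as a field of its own and quoted because it matches the delimiter.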
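Since you didn't say what the lines of the gzip data actually contain, I can't guess how to parse each line into a sensible list of fields. But assuming a function named 'split_line' that suits your data, you could do: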
# split_line is a placeholder for whatever parser fits your data
with gzip.open('file.gz.DONE', 'rb') as gzip_f:
    data = [split_line(l) for l in gzip_f]
with open('output.csv', 'wb') as myFile:
    writer = csv.writer(myFile)
    writer.writerows(data)
print("Writing complete")
Of course, at that point it makes sense to process the file line by line and combine the two `with` statements into one.
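A minimal sketch of that line-by-line variant follows. It assumes Python 3 (text-mode `gzip.open`, which is not how the question's Python 2.7 code opens files), and `split_line` here is a hypothetical stand-in that assumes plain comma-separated fields without quoting; the function name `gzip_lines_to_csv` is also made up.

```python
import csv
import gzip


def split_line(line):
    # Hypothetical parser: assumes simple comma-separated fields, no quoting.
    return line.rstrip("\r\n").split(",")


def gzip_lines_to_csv(src_path, dst_path):
    # One combined with-statement; rows are written as they are read,
    # so the whole file never has to sit in memory.
    # newline="" lets the csv module control the output line endings.
    with gzip.open(src_path, "rt") as gzip_f, \
            open(dst_path, "w", newline="") as out_f:
        writer = csv.writer(out_f)
        for line in gzip_f:
            writer.writerow(split_line(line))
```

It would be called as `gzip_lines_to_csv('file.gz.DONE', 'output.csv')`. If the real data contains quoted fields, `split_line` would need to be replaced with something that understands them.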
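Answer 1 (score: 0)
I think this is simply because gzip.open() gives you a file-like object, whereas csvwriter.writerows() needs a list of lists of strings to do its job.
But I don't understand why you want to use the csv module at all. It looks like you only want to extract the contents of the gzip file and save them to an uncompressed output file. You could do that like this: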
# Requires Python 3.3+ (text mode in gzip.open)
import gzip

input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'

with gzip.open(input_file_name, 'rt') as input_file:
    with open(output_file_name, 'wt') as output_file:
        for line in input_file:
            output_file.write(line)
print("Writing complete")
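If you do want to use the csv module because you are not sure your input data is well formed (and you want an error message right away), then you could do this: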
# Requires Python 3.3+ (text mode and newline= in gzip.open)
import gzip
import csv

input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'

with gzip.open(input_file_name, 'rt', newline='') as input_file:
    reader_csv = csv.reader(input_file)
    with open(output_file_name, 'wt', newline='') as output_file:
        writer_csv = csv.writer(output_file)
        writer_csv.writerows(reader_csv)
print("Writing complete")
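Is this what you were trying to do? It is hard to guess, because we don't have the input file to look at.
If it is not what you want, could you clarify what you are after?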
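On the point about getting an error for malformed input: as far as I know, `csv.reader` is fairly lenient by default and accepts many odd-looking lines without complaint. A minimal sketch, assuming Python 3 and assuming that something like a stray character after a closing quote is what counts as "malformed" for your data, passes the `strict=True` format parameter so that such rows raise `csv.Error`. The function name `find_first_bad_line` is made up for this example.

```python
import csv


def find_first_bad_line(lines):
    """Return (line_number, message) for the first row csv rejects, else None."""
    reader = csv.reader(lines, strict=True)
    try:
        for _ in reader:
            pass
    except csv.Error as exc:
        # line_num counts the lines read from the source so far.
        return reader.line_num, str(exc)
    return None


good = ['a,b\n', 'c,"d,e"\n']
bad = ['a,b\n', 'c,"d"e\n']  # stray character after the closing quote

print(find_first_bad_line(good))
print(find_first_bad_line(bad))
```

With a real file this could be called on the object returned by `gzip.open(input_file_name, 'rt', newline='')`. Strict mode only catches certain quoting problems; it does not check, for instance, that every row has the same number of fields.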
Answer 2 (score: 0)
Now that I know the gzip file itself already contains comma-separated values, the whole thing simplifies to:
with gzip.open('file.gz.DONE', 'rb') as gzip_f, open('output.csv', 'wb') as myFile:
    myFile.write(gzip_f.read())
In other words, it is just a roundabout way of gunzipping into another file.
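One caveat: `gzip_f.read()` loads the entire decompressed content into memory at once. If the file is large, a chunked copy may be preferable. A minimal sketch, assuming the standard-library `shutil.copyfileobj` is acceptable here (the function name `gunzip_to` is made up):

```python
import gzip
import shutil


def gunzip_to(src_path, dst_path):
    # Copy the decompressed bytes in chunks rather than all at once.
    with gzip.open(src_path, "rb") as gzip_f, open(dst_path, "wb") as out_f:
        shutil.copyfileobj(gzip_f, out_f)
```

It would be called as `gunzip_to('file.gz.DONE', 'output.csv')`. Because both files are opened in binary mode, the bytes are copied unchanged.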