How do I convert a pipe-delimited text file to CSV?

Asked: 2015-03-23 14:34:22

Tags: python csv

I have a large text file that I want to convert to CSV using Python. My data looks like this:

var1|var2|var3|tonumber|fromnumber|var|coding|udh|var|circle|var|var|var|var15

898980d1-6e5b-40f2-a313-c30f08bf0fe6|49A5919EB0D04EDE9B6CEB5AF932EAA3|sbs1|919899980898|HITECH|1|1|0|VODAFONE|Delhi|2015-02-21 12:08:51|5|3|RBA/6724R # Kailash Ram Panwar (PL) # Rz-410/13 Flat No-09 Iiird Floor Tkd Extn Delhi - 110019-110019 # Tgt Skt #  #

How can I convert this file to CSV? I tried:

In [1]: import csv

In [2]: import pandas as pd

In [3]: piperows = []  

f = open("/home/suri/ValueFirst/MT.txt", "rb")

In [6]: readerpipe = csv.reader(f, delimiter = '|')

In [7]: for row in readerpipe: 
   ...:     piperows.append(row)
   ...:     f.close()  
   ...:  

I get the following error:

----------------------------------------------------
ValueError                      Traceback (most recent call last)  
<ipython-input-7-842b0d42f436> in <module>()  
----> 1 for row in readerpipe:  
      2     piperows.append(row)  
      3     f.close()  
      4   

ValueError: I/O operation on closed file  

2 answers:

Answer 0 (score: 6)

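As @Martijn Pieters suggested, you should not indent f.close() that way, because it is now part of the loop and the file gets closed after the first row is read. I suggest using a with block, which closes the file automatically.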

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter='|')
    piperows = list(readerpipe)

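One caveat here is that we build one big list holding every row, which may be a bad idea if all you want is to convert the file. Instead, you can write the new comma-delimited version while reading the pipe-delimited one.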

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        for row in reader_pipe:
            writer_comma.writerow(row)
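The code above uses the Python 2 idiom of opening csv files in binary mode. As a rough sketch of the same streaming conversion for Python 3 (an assumption on my part, since the question does not say which version is in use), the files would be opened in text mode with newline="". The function name pipe_to_csv and its row counter are made up for illustration.

```python
import csv


def pipe_to_csv(src_path, dst_path):
    """Stream rows from a pipe-delimited file into a comma-delimited one.

    Hypothetical helper: returns the number of rows written.
    """
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="|")
        writer = csv.writer(dst)  # the default delimiter is a comma
        count = 0
        for row in reader:
            # Only one row is held in memory at a time.
            writer.writerow(row)
            count += 1
    return count
```

It would be called as pipe_to_csv("/home/suri/ValueFirst/MT.txt", "/home/suri/ValueFirst/MT.csv"). Fields that happen to contain a comma are quoted by the writer, so they stay in one column.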

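Edit: @Martijn suggested passing the reader directly to the writer's writerows method. If writerows is implemented properly, this has the same effect and avoids loading all rows into memory at once.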

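Edit 2: With writerows the code becomes so simple that you could even inline the reader and writer variables, if you like. The version below keeps them as named variables: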

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        writer_comma.writerows(reader_pipe)
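For reference, the inlined form mentioned in Edit 2 might look something like the sketch below. It is written for Python 3 (text mode with newline=""), which is an assumption, and the wrapper function convert_inline is a made-up name so the snippet can be run against any pair of paths.

```python
import csv


def convert_inline(src_path, dst_path):
    """Hypothetical wrapper: pipe-delimited in, comma-delimited out."""
    with open(src_path, newline="") as file_pipe, \
         open(dst_path, "w", newline="") as file_comma:
        # Reader and writer are created inline; writerows consumes the
        # reader as an iterator, one row at a time.
        csv.writer(file_comma).writerows(csv.reader(file_pipe, delimiter="|"))
```

Whether this reads better than the version with named variables is a matter of taste; the behaviour should be the same.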

Answer 1 (score: 2)

You are closing the file after reading the first row:

for row in readerpipe: 
    piperows.append(row)
    f.close()  

Remove the f.close() line from the loop.
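To see the failure without touching a real file, here is a small sketch that reproduces it with an in-memory io.StringIO standing in for the file. The three sample rows are invented, and the snippet assumes Python 3.

```python
import csv
import io

# In-memory stand-in for the pipe-delimited file (made-up sample data).
f = io.StringIO("a|b\nc|d\ne|f\n")
reader = csv.reader(f, delimiter="|")

rows = []
caught = None
try:
    for row in reader:
        rows.append(row)
        f.close()  # wrongly indented: runs after the first row
except ValueError as exc:
    # The reader asks the closed file for the next line and fails.
    caught = exc

print(rows)    # only the first row was collected
print(caught)
```

Only the first row ends up in rows, and the second pass through the loop raises the ValueError from the question.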

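Better still, use the file as a context manager so that it is closed automatically. You can simply call list() on the reader to produce the list of rows: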

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter = '|')
    piperows = list(readerpipe)

But to convert the file, you can pass readerpipe directly to a writer.writerows() call:

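import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter = '|')
    # Write to a separate .csv file; opening the input path for writing
    # would truncate it while it is still being read.
    with open("/home/suri/ValueFirst/MT.csv", "wb") as outputfile:
        writer = csv.writer(outputfile)
        writer.writerows(readerpipe)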
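One thing worth checking after a conversion like this is what happens to fields that already contain commas, such as address text. The sketch below uses in-memory buffers and an invented two-row sample (not the asker's data), and assumes Python 3. With the csv module's default quoting, such a field is wrapped in double quotes and reads back as a single column.

```python
import csv
import io

# Made-up sample: the address field contains a comma.
source = io.StringIO("id|address\n1|Flat 9, Third Floor\n")
target = io.StringIO()

# Same reader-to-writerows pattern as above, just on in-memory buffers.
csv.writer(target).writerows(csv.reader(source, delimiter="|"))
converted = target.getvalue()

# Read the comma-delimited text back to confirm the columns survived.
round_trip = list(csv.reader(io.StringIO(converted)))
print(converted)
print(round_trip)
```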