Python: converting a large .tsv file to a .csv file

Time: 2018-02-01 16:50:44

Tags: python csv

The code below does successfully convert a .tsv file to a .csv file, but when the file is large (over 1 GB), the read function raises a MemoryError.

import re
tsv = open('tsv.tsv', 'r')
fileContent = tsv.read()  # loads the whole file into memory at once
tsv.close()
fileContent = re.sub("\t", ",", fileContent)  # convert from tab to comma
csv_file = open("csv.csv", "w")
csv_file.write(fileContent)
csv_file.close()

I know that to read a large file, I can use the following code:

with open("data.txt") as myfile:
    for line in myfile:
        pass  # handle one line at a time here

But I don't know how to combine these two snippets into one working version that converts a large .tsv file to a .csv file.

2 answers:

Answer 0: (score: 2)

Just glue the two snippets together:

import re

with open("data.txt", 'r') as myfile:
    with open("csv.csv", 'w') as csv_file:
        for line in myfile:
            # only one line is held in memory at a time
            fileContent = re.sub("\t", ",", line)
            csv_file.write(fileContent)
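
One caveat with a plain tab-to-comma substitution is that a field which itself contains a comma ends up split into two CSV columns. A minimal sketch of the same line-by-line idea using the standard csv module is shown here. The function name tsv_to_csv and its parameters are hypothetical, and the sketch assumes the .tsv file uses no quoting of its own, so each field is taken literally.

```python
import csv

def tsv_to_csv(tsv_path, csv_path):
    """Stream a TSV file into a CSV file one row at a time (hypothetical helper)."""
    # newline='' lets the csv module handle line endings itself
    with open(tsv_path, 'r', newline='') as src, \
         open(csv_path, 'w', newline='') as dst:
        # Assumption: the TSV has no quoting, so quote characters are read literally
        reader = csv.reader(src, delimiter='\t', quoting=csv.QUOTE_NONE)
        # The writer quotes any field that contains a comma or a quote character
        writer = csv.writer(dst)
        for row in reader:
            writer.writerow(row)
```

Calling tsv_to_csv('tsv.tsv', 'csv.csv') would then process the file row by row, so memory use stays roughly constant regardless of file size.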

Answer 1: (score: 0)

For large files, use pandas rather than pure Python:

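import pandas as pd
# read the file in chunks so it never has to fit in memory all at once
dfs = pd.read_csv('file.tsv', sep='\t', chunksize=50)
for i, df in enumerate(dfs):
    # overwrite on the first chunk, append afterwards; write the header only once
    df.to_csv('file.csv', sep=',', mode='w' if i == 0 else 'a', header=(i == 0), index=False)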