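Converting a large CSV file to Excel using Python 3

Time: 2017-11-16 13:16:54

Tags: python excel python-3.x csv xlsxwriter

Here is my code to convert a CSV file to an XLSX file. It works fine for small CSV files, but when I try a larger CSV file it fails with an error.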

import os
import glob
import csv
from xlsxwriter.workbook import Workbook

for csvfile in glob.glob(os.path.join('.', 'file.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'r', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()

The error is:

File "CsvToExcel.py", line 12, in <module>
for r, row in enumerate(reader):
_csv.Error: field larger than field limit (131072)
Exception ignored in: <bound method Workbook.__del__ of <xlsxwriter.workbook.Workbook object at 0x7fff4e731470>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/xlsxwriter/workbook.py", line 153, in __del__
Exception: Exception caught in workbook destructor. Explicit close() may be required for workbook.
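
For context, the `_csv.Error` above comes from Python's `csv` module, not from XlsxWriter: one field in the file is longer than the reader's default limit of 131072 characters. The following is a minimal sketch of how that limit behaves and how it can be raised with `csv.field_size_limit`. The sample data and the new limit of 10 MiB are assumptions chosen for illustration, not values from the question.

```python
import csv
import io

# With no argument, csv.field_size_limit() returns the current limit
# (131072 by default, the number shown in the error message).
default_limit = csv.field_size_limit()

# Hypothetical sample data: one row whose second field is one character
# longer than the default limit.
big_field = "x" * (default_limit + 1)
sample = io.StringIO("id,text\n1," + big_field + "\n")

try:
    rows = list(csv.reader(sample))
except csv.Error as exc:
    # Same error as in the traceback above.
    print("with the default limit:", exc)

# Raise the limit (10 MiB is an arbitrary choice), then read again.
csv.field_size_limit(10 * 1024 * 1024)
sample.seek(0)
rows = list(csv.reader(sample))
print(len(rows), len(rows[1][1]))
```

Keep in mind that Excel itself caps a cell at 32,767 characters, so a field this large will not fit into a single cell even once the CSV reader accepts it.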

2 answers:

Answer 0 (score: 0)

When working with large files, it is better to use the 'constant_memory' option to keep memory usage under control, for example:

workbook = Workbook(csvfile + '.xlsx', {'constant_memory': True})

Reference: xlsxwriter.readthedocs.org/en/latest/working_with_memory.html
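
As a rough sketch of how this option could fit into the loop from the question: the function name `csv_to_xlsx` and its arguments are hypothetical, and the `try`/`finally` is an addition so that the workbook is always closed explicitly. Note that `constant_memory` only controls XlsxWriter's memory use; the `_csv.Error` in the question is raised by the `csv` reader, so this option on its own is not expected to remove that error.

```python
import csv

from xlsxwriter.workbook import Workbook


def csv_to_xlsx(csv_path, xlsx_path):
    """Copy every CSV field into the first worksheet of a new XLSX file."""
    # In constant_memory mode XlsxWriter flushes a row to disk as soon as
    # a later row is started, so rows must be written in ascending order
    # (which the enumerate() loop below does).
    workbook = Workbook(xlsx_path, {'constant_memory': True})
    worksheet = workbook.add_worksheet()
    try:
        # newline='' is what the csv module documentation recommends
        # when opening files for csv.reader.
        with open(csv_path, 'r', encoding='utf8', newline='') as f:
            for r, row in enumerate(csv.reader(f)):
                for c, col in enumerate(row):
                    worksheet.write(r, c, col)
    finally:
        # Close explicitly even if reading fails, instead of leaving it
        # to the Workbook destructor.
        workbook.close()
```

It could then be called as `csv_to_xlsx('file.csv', 'file.xlsx')`.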

Answer 1 (score: 0)

I found new code that uses the pandas package, and it now works:

import pandas
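data = pandas.read_csv('Documents_2/AdvMedcsv.csv')
# Keep the first row for each research_id
data = data.groupby(lambda x: data['research_id'][x]).first()
writer = pandas.ExcelWriter('Documents_2/AdvMed.xlsx', engine='xlsxwriter')
data.to_excel(writer)
# close() writes the file to disk; writer.save() was removed in pandas 2.0
writer.close()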