Trying to read a spreadsheet line by line and write to Excel (downsampling)

Time: 2016-09-22 13:33:43

Tags: python excel

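I am trying to write some code to downsample very large Excel files. It needs to copy the first 4 lines exactly, then take every 40th line starting at line 5. This is what I have so far: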

import os
import string
import shutil
import datetime

folders = os.listdir('./')
names = [s for s in folders if "csv" in s]
zips = [s for s in folders if "zip" in s]
for folder in names:
    filename = folder
    shutil.move(folder, './Archive')
    with open(filename) as f:
        counter = 0
        for line in f:
            counter += 1
            f_out = open('./DownSampled/' + folder +  '.csv', 'w')
            if counter < 5:
                f_out.write(line)
            elif (counter+35) % 40 == 0:
                f_out.write(line)
            f_out.close()

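It moves the files into the Archive folder, but it does not create a downsampled version. Any idea what I am doing wrong here?

1 answer:

Answer 0: (score: 1)

You are overwriting the output file on every iteration of the loop, because it is reopened in 'w' mode for each line. Move the open(...) out of the for loop: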

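with open(filename) as f, open('./DownSampled/' + folder + '.csv', 'w') as f_out:
    # Start counting at 1 so the tests match the counter logic in the question.
    for i, line in enumerate(f, 1):
        # Keep lines 1-4, then line 5 and every 40th line after it (45, 85, ...).
        if i < 5 or (i + 35) % 40 == 0:
            f_out.write(line)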

What's more, enumerate can replace your counting logic.
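For completeness, here is a minimal sketch of the whole loop with the output file opened once per input file. The function names (keep_line, downsample, process_folder) and their parameters are made up for illustration and do not come from the question. It also assumes, which the question does not confirm, that each file should be read before it is archived and that the output should keep the original file name. The question's code moves the file first and then opens the old path, which would likely fail.

```python
import os
import shutil


def keep_line(n, head=4, step=40):
    """Return True if the 1-based line number n should be copied.

    Lines 1..head are always kept; after that, line head+1 and every
    step-th line following it are kept (5, 45, 85, ... by default).
    """
    return n <= head or (n - head - 1) % step == 0


def downsample(src, dst, head=4, step=40):
    """Copy the selected lines of src into dst, opening dst only once."""
    with open(src) as f_in, open(dst, 'w') as f_out:
        for n, line in enumerate(f_in, 1):
            if keep_line(n, head, step):
                f_out.write(line)


def process_folder(root='.', out_dir='DownSampled', archive_dir='Archive'):
    """Downsample every .csv file in root, then move the original to the archive."""
    os.makedirs(os.path.join(root, out_dir), exist_ok=True)
    os.makedirs(os.path.join(root, archive_dir), exist_ok=True)
    for name in os.listdir(root):
        src = os.path.join(root, name)
        # Skip directories and anything that is not a .csv file.
        if not (os.path.isfile(src) and name.endswith('.csv')):
            continue
        downsample(src, os.path.join(root, out_dir, name))
        # Archive only after the downsampled copy has been written.
        shutil.move(src, os.path.join(root, archive_dir, name))
```

Calling process_folder('.') from the directory that holds the CSV files would mirror the layout in the question. Note that this treats the inputs as plain text, as the question's code does, so it will not work on binary .xlsx workbooks.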