Python: need to split a huge Excel file into several smaller ones. Code doesn't work

Time: 2018-01-10 01:26:39

Tags: python excel

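I have a large Excel file and I need to split it into smaller ones (I am using Python). It should take every 300 rows of the big file and write them into the first 300 rows of a small file (each small file should have 300 rows, or fewer for the last one). The big file has data only in the first column (A:A) and about 8,000 rows (the cells contain emails).

The code is as follows: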

from xlrd import open_workbook
import xlsxwriter


wb = open_workbook('BBDD_POLAROID_TOTAL.xlsx')
excel_num = 0
print('ARCHIVO: ' + str(excel_num))
workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
worksheet = workbook.add_worksheet()
for s in wb.sheets():
    number_of_rows = s.nrows
    for row in range(number_of_rows):
        if row % 300 == 0:
            print('close: ' + str(excel_num))
            workbook.close()
            excel_num += 1
            print('ARCHIVO: ' + str(excel_num))
            workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
            worksheet = workbook.add_worksheet()
            print('all good: ' + str(excel_num))
        print(str(row) + s.cell(row, 0).value)
        worksheet.write(row, 0, s.cell(row, 0).value)

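I can't see why this code doesn't work. It does read the whole Excel file, but it only writes to the second output file (the first one is just opened and closed).

Thanks for your help!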

3 answers:

Answer 0: (score: 2)

Here is a version that works, for the reasons I mentioned in the comments:

from xlrd import open_workbook
import xlsxwriter


wb = open_workbook('BBDD_POLAROID_TOTAL.xlsx')
excel_num = 0
print('ARCHIVO: ' + str(excel_num))
workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
worksheet = workbook.add_worksheet()

for s in wb.sheets():
    number_of_rows = s.nrows
    for row in range(number_of_rows):
        # Start a new file every 300 rows, but not at row 0:
        # the first workbook is already open at that point.
        if row % 300 == 0 and row != 0:
            print('close: ' + str(excel_num))
            workbook.close()
            excel_num += 1
            print('ARCHIVO: ' + str(excel_num))
            workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
            worksheet = workbook.add_worksheet()
            print('all good: ' + str(excel_num))
        print(str(row) + ' ' + s.cell(row, 0).value)
        # row % 300 restarts the row index at 0 in each new file.
        worksheet.write(row % 300, 0, s.cell(row, 0).value)
# Close the last workbook; xlsxwriter only writes the file on close().
workbook.close()
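
Reading the question's code against this one, three things seem to differ: the question closes the first workbook at row 0 (so BBDD0 ends up empty), it writes at the source row index (so in later files the data starts at row 300, 600, and so on), and it never closes the last workbook. The row arithmetic can be sketched as a small pure function. The names `target_position` and `ROWS_PER_FILE` are hypothetical and only serve the illustration:

```python
ROWS_PER_FILE = 300  # assumed chunk size, taken from the question


def target_position(row, rows_per_file=ROWS_PER_FILE):
    """Map a source row index to (output file index, row index inside that file)."""
    # divmod returns (row // rows_per_file, row % rows_per_file).
    return divmod(row, rows_per_file)


print(target_position(0))    # (0, 0)
print(target_position(299))  # (0, 299)
print(target_position(300))  # (1, 0)
```

The second element of the tuple is what `row % 300` computes in the loop, which is why each output file starts again at its first row.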

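Answer 1: (score: 1)

Well, this code does what I need. But I still don't know why the other code was wrong. If anyone has an answer, I'd be glad to hear it.

PS: the prints are only there to see how the files are being written.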

from xlrd import open_workbook
import xlsxwriter


wb = open_workbook('BBDD_POLAROID_TOTAL.xlsx')
archivo = [[]]
excel_num = 0

for s in wb.sheets():
    number_of_rows = s.nrows
    for row in range(number_of_rows):
        # Open a new group before storing row 300, 600, ...
        # so that no group ends up with more than 300 rows.
        if row % 300 == 0 and row != 0:
            archivo.append([])
            excel_num += 1
        print(str(excel_num) + ' ' + str(row) + ' ' + s.cell(row, 0).value)
        archivo[excel_num].append(s.cell(row, 0).value)

for name in range(len(archivo)):
    workbook = xlsxwriter.Workbook('BBDD' + str(name) + '.xlsx')
    worksheet = workbook.add_worksheet()
    for mail_index in range(len(archivo[name])):
        print(str(name) + ' ' + str(mail_index) + ' ' + archivo[name][mail_index])
        worksheet.write(mail_index, 0, archivo[name][mail_index])
    workbook.close()
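
Since this approach first collects everything in memory, the grouping step could also be written with list slicing. This is only a sketch: `split_into_chunks` is a hypothetical helper, and the example addresses are made up for the demonstration.

```python
def split_into_chunks(values, chunk_size=300):
    """Return consecutive slices of `values`, each holding at most `chunk_size` items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


# Made-up data standing in for the first column of the big file.
emails = ['user{}@example.com'.format(i) for i in range(650)]
chunks = split_into_chunks(emails)
print([len(chunk) for chunk in chunks])  # [300, 300, 50]
```

Each element of `chunks` would then play the role of one entry of `archivo` in the writing loop.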

Answer 2: (score: 0)

**I have something better that also handles columns**



# -*- coding: utf-8 -*-
"""
Created on Mon Jul 27 17:01:57 2020

@author: Vishal Yadav
"""

from xlrd import open_workbook
import xlsxwriter

# Raw strings keep the backslashes literal ('\f' would otherwise be a form feed).
file_path = r'E:\project\file_to_be_splitted.xls'
output_path = r'E:\project\splitter_output'
row_split_count=990

wb = open_workbook(file_path)
excel_num = 0
workbook = xlsxwriter.Workbook(output_path+'\\'+str(excel_num) + '.xlsx')
worksheet = workbook.add_worksheet()

for s in wb.sheets():
    number_of_rows = s.nrows
    number_of_cols = s.ncols
    print(number_of_rows)
    print(number_of_cols)
    for row in range(number_of_rows):
        # Start a new output file before writing row N, 2N, ...
        # so that each file holds at most row_split_count rows.
        if row % row_split_count == 0 and row != 0:
            workbook.close()
            excel_num = excel_num + 1
            workbook = xlsxwriter.Workbook(output_path + '\\' + str(excel_num) + '.xlsx')
            worksheet = workbook.add_worksheet()
        # Copy every column of the current row to the same column of the output.
        for col in range(number_of_cols):
            worksheet.write(row % row_split_count, col, s.cell(row, col).value)
# Close the last workbook so it is actually written to disk.
workbook.close()
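
As a side note, xlrd 2.0 and later read only the legacy .xls format, so for .xlsx input a different reader is needed. The following is a rough sketch of the same multi-column split using openpyxl instead, which is a swapped-in library and not what this answer uses. `iter_chunks` and `split_xlsx` are hypothetical names, and the function assumes that only the first sheet matters and that `output_dir` already exists.

```python
from itertools import islice
import os


def iter_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` rows from any iterable of rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def split_xlsx(input_path, output_dir, chunk_size=990):
    """Write every `chunk_size` rows of the first sheet to its own .xlsx file."""
    # Imported here so iter_chunks stays usable without openpyxl installed.
    from openpyxl import Workbook, load_workbook

    source = load_workbook(input_path, read_only=True)
    sheet = source.worksheets[0]  # assumption: only the first sheet is split
    written = []
    rows = sheet.iter_rows(values_only=True)
    for index, chunk in enumerate(iter_chunks(rows, chunk_size)):
        out = Workbook()
        out_sheet = out.active
        for row in chunk:
            out_sheet.append(row)  # each row is a tuple of values, all columns kept
        path = os.path.join(output_dir, '{}.xlsx'.format(index))
        out.save(path)
        written.append(path)
    source.close()
    return written
```

A call such as `split_xlsx('big.xlsx', 'out')` (both paths are placeholders) would return the list of files it wrote. Because the rows are consumed lazily, the whole sheet does not need to be held in memory at once.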