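I have a large Excel file that I need to split into smaller ones (I am using Python). It should take every 300 rows of the large file and write them into the first 300 rows of a small file (each small file should have 300 rows or fewer, the latter for the last one). The large file only has data in the first column (A:A) and about 8,000 rows (the cells contain email addresses).
The code is as follows: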
from xlrd import open_workbook
import xlsxwriter
wb = open_workbook('BBDD_POLAROID_TOTAL.xlsx')
excel_num = 0
print('ARCHIVO: ' + str(excel_num))
workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
worksheet = workbook.add_worksheet()
for s in wb.sheets():
    number_of_rows = s.nrows
    for row in range(number_of_rows):
        if row % 300 == 0:
            print('close: ' + str(excel_num))
            workbook.close()
            excel_num += 1
            print('ARCHIVO: ' + str(excel_num))
            workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
            worksheet = workbook.add_worksheet()
            print('all good: ' + str(excel_num))
        print(str(row) + s.cell(row, 0).value)
        worksheet.write(row, 0, s.cell(row, 0).value)
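I can't see why this code doesn't work. It does read through the whole Excel file, but it only writes to the second output file (the first one is just opened and closed).
Thanks for your help!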
Answer 0 (score: 2)
Here is a version that addresses one of the reasons I mentioned in the comments:
from xlrd import open_workbook
import xlsxwriter
wb = open_workbook('BBDD_POLAROID_TOTAL.xlsx')
excel_num = 0
print('ARCHIVO: ' + str(excel_num))
workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
worksheet = workbook.add_worksheet()
for s in wb.sheets():
    number_of_rows = s.nrows
    for row in range(0, number_of_rows):
        if row % 300 == 0:
            if row == 0:
                print(str(row) + s.cell(row, 0).value)
                worksheet.write(row % 300, 0, s.cell(row, 0).value)
            else:
                print('close: ' + str(excel_num))
                workbook.close()
                excel_num += 1
                print('ARCHIVO: ' + str(excel_num))
                workbook = xlsxwriter.Workbook('BBDD' + str(excel_num) + '.xlsx')
                worksheet = workbook.add_worksheet()
                print('all good: ' + str(excel_num))
        print(str(row) + ' ' + s.cell(row, 0).value)
        worksheet.write(row % 300, 0, s.cell(row, 0).value)
workbook.close()
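As far as I can tell, two things differ from the code in the question: row 0 no longer triggers a close (in the question, `row % 300 == 0` is already true for row 0, so the first workbook is closed before anything is written to it), and the write uses `row % 300` instead of the absolute `row`, so each new file starts at its own row 0. The index arithmetic can be sketched on its own, without any Excel library. `locate` is a hypothetical helper name used only for illustration, and the chunk size of 300 is taken from the question.

```python
CHUNK_SIZE = 300  # rows per output file, as in the question

def locate(row, chunk_size=CHUNK_SIZE):
    """Map a 0-based source row to (output file index, row inside that file)."""
    return divmod(row, chunk_size)

# Writing at the absolute index `row` would put source row 300 on row 300
# of the next file; the local index below puts it on row 0 instead.
print(locate(0))     # (0, 0)
print(locate(299))   # (0, 299)
print(locate(300))   # (1, 0)
print(locate(7999))  # (26, 199)
```

With about 8,000 source rows this gives 27 output files, the last one holding 200 rows.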
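Answer 1 (score: 1)
Well, this code does what I need. But I still don't know why the other code was wrong. If anyone has an answer, I would be glad to hear it.
P.S.: the prints are only there to see how the files are being written.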
from xlrd import open_workbook
import xlsxwriter
wb = open_workbook('BBDD_POLAROID_TOTAL.xlsx')
archivo = [[]]
excel_num = 0
for s in wb.sheets():
    number_of_rows = s.nrows
    for row in range(number_of_rows):
        print(str(excel_num) + ' ' + str(row) + ' ' + s.cell(row, 0).value)
        archivo[excel_num].append(s.cell(row, 0).value)
        if row % 300 == 0 and row != 0:
            archivo.append([])
            excel_num += 1
for name in range(len(archivo)):
    workbook = xlsxwriter.Workbook('BBDD' + str(name) + '.xlsx')
    worksheet = workbook.add_worksheet()
    for mail_index in range(len(archivo[name])):
        print(str(name) + ' ' + str(mail_index) + ' ' + archivo[name][mail_index])
        worksheet.write(mail_index, 0, archivo[name][mail_index])
    workbook.close()
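One thing to watch in the loop above: the value is appended before the `row % 300 == 0 and row != 0` check runs, so by my reading the first list ends up with rows 0 to 300, that is 301 items, and the later lists with 300 each. If exactly 300 per file matters, the grouping step could be done with plain list slicing instead. This is only a sketch of that step; `split_into_chunks` is a hypothetical helper name and the email list is placeholder data, not the real file contents.

```python
def split_into_chunks(values, chunk_size=300):
    """Return consecutive slices of `values`, each with at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]

# Placeholder data standing in for the ~8,000 cells read from column A.
emails = ['user%d@example.com' % i for i in range(8000)]

chunks = split_into_chunks(emails)
print(len(chunks))      # 27
print(len(chunks[0]))   # 300
print(len(chunks[-1]))  # 200
```

The resulting `chunks` list can take the place of `archivo` in the writing loop, since both are lists of lists.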
Answer 2 (score: 0)
**I have something better that also handles multiple columns**
# -*- coding: utf-8 -*-
"""
Created on Mon Jul 27 17:01:57 2020
@author: Vishal Yadav
"""
from xlrd import open_workbook
import xlsxwriter
# Raw strings, so that backslashes such as \f are not read as escape sequences
file_path = r'E:\project\file_to_be_splitted.xls'
output_path = r'E:\project\splitter_output'
row_split_count = 990
wb = open_workbook(file_path)
excel_num = 0
workbook = xlsxwriter.Workbook(output_path + '\\' + str(excel_num) + '.xlsx')
worksheet = workbook.add_worksheet()
for s in wb.sheets():
    number_of_rows = s.nrows
    number_of_cols = s.ncols
    print(number_of_rows)
    print(number_of_cols)
    for row in range(0, number_of_rows):
        for col in range(0, number_of_cols):
            if row % row_split_count == 0 and row != 0 and row <= row_split_count:
                worksheet.write(row_split_count, col, s.cell(row, col).value)
                # After the last column of a boundary row, start a new file
                if col == number_of_cols - 1:
                    workbook.close()
                    excel_num = excel_num + 1
                    workbook = xlsxwriter.Workbook(output_path + '\\' + str(excel_num) + '.xlsx')
                    worksheet = workbook.add_worksheet()
            elif row % row_split_count == 0 and row != 0 and row > row_split_count:
                worksheet.write(row_split_count - 1, col, s.cell(row, col).value)
                if col == number_of_cols - 1:
                    workbook.close()
                    excel_num = excel_num + 1
                    workbook = xlsxwriter.Workbook(output_path + '\\' + str(excel_num) + '.xlsx')
                    worksheet = workbook.add_worksheet()
            elif row < row_split_count:
                worksheet.write(row % row_split_count, col, s.cell(row, col).value)
            else:
                worksheet.write((row % row_split_count) - 1, col, s.cell(row, col).value)
workbook.close()
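By my reading, the branches above put 991 rows in the first file and 990 in each later one. If uniform chunks are acceptable, the same multi-column split might be written more compactly by computing the row ranges up front and writing whole rows with xlsxwriter's `write_row`. This is a sketch under assumptions, not a drop-in replacement: `chunk_bounds` and `write_chunks` are hypothetical names, `rows` is assumed to be a list of lists already read from the source sheet, and `prefix` is an illustrative output path prefix.

```python
def chunk_bounds(n_rows, chunk_size):
    """Return (start, stop) row ranges covering n_rows in steps of chunk_size."""
    return [(start, min(start + chunk_size, n_rows))
            for start in range(0, n_rows, chunk_size)]

def write_chunks(rows, chunk_size, prefix):
    """Write each chunk of `rows` (a list of lists) to prefix + N + '.xlsx'."""
    import xlsxwriter  # imported here so chunk_bounds can be used without it
    for file_num, (start, stop) in enumerate(chunk_bounds(len(rows), chunk_size)):
        workbook = xlsxwriter.Workbook(prefix + str(file_num) + '.xlsx')
        worksheet = workbook.add_worksheet()
        for local_row, values in enumerate(rows[start:stop]):
            # One call writes every column of the row, starting at column 0.
            worksheet.write_row(local_row, 0, values)
        workbook.close()

print(chunk_bounds(2500, 990))  # [(0, 990), (990, 1980), (1980, 2500)]
```

With xlrd, `rows` could be built as `[s.row_values(r) for r in range(s.nrows)]` for a sheet `s`, and the call would then be something like `write_chunks(rows, 990, 'out_')`.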