Using Python: xlrd and xlsxwriter

Date: 2016-03-12 22:21:53

Tags: python excel xlrd xlsxwriter

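I am trying to copy a whole section of an Excel sheet to another file. The section is actually a header/description that mainly describes the file's properties, creation date, and so on. All of this occupies some cells in the first five rows and the first three columns, say A1:C3. Here is the code I wrote (for the sake of the example, for 3 rows only):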

import xlsxwriter
import xlrd


#### open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
# list of populated header rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
while c <= 2:
#### read the original Excel book, 3 rows by loop - counter is further below
         data = [sheet.cell_value(c, col) for col in range(sheet.ncols)]
         #print data
#### write rows to the new excel book

         workbook = xlsxwriter.Workbook('tty_header.xlsx')
         worksheet = workbook.add_worksheet()
         worksheet.write_row(row_header_list[i], data)
         print i,c,row_header_list[i], data
         i+=1
         c+=1
         print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
         workbook.close()

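The counters, the data, and the list values all look correct according to the print statements. However, when I run this code, only row 3 is populated in the newly created file; rows 1 and 2 are empty. I don't understand why. To test the issue I made another example, a very inelegant one with no loops, control lists, etc., just the blunt approach: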

import xlsxwriter
import xlrd

# open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
data1 = [sheet.cell_value(0, col) for col in range(sheet.ncols)]
data2 = [sheet.cell_value(1, col) for col in range(sheet.ncols)]
data3 = [sheet.cell_value(2, col) for col in range(sheet.ncols)]
data4 = [sheet.cell_value(3, col) for col in range(sheet.ncols)]

### new excelbook
workbook = xlsxwriter.Workbook('tty_header2.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row('A1', data1)
worksheet.write_row('A2', data2)
worksheet.write_row('A3', data3)
worksheet.write_row('A4', data4)

workbook.close()

In this case everything went fine and all the needed data was transferred. Can anyone explain what is wrong with the first one? Thanks.

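The other problem I have is that if I start populating columns after placing the header, the header values become NULL. This happens even though the column population starts from the cell below the "header" cells (in the code I provide below, it is column 1, starting from cell 6). Any ideas on how to solve it?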

workbook = xlrd.open_workbook('tty_header2.xlsx.xlsx')
sheet = workbook.sheet_by_index(0)

data = [sheet.cell_value(row, 2) for row in range(23, sheet.nrows)]
print  data

##### writing new file with xlsxwriter
workbook = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook.add_worksheet('A')
worksheet.write_column('A6', data)
workbook.close()

Update: after Mike's correction, the revised code is here:

import xlsxwriter
import xlrd


# open original excelbook and access first sheet
workbook = xlrd.open_workbook('hello_.xlsx')
sheet = workbook.sheet_by_index(0)

# define description rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0

#create second file, add first sheet
workbook2 = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook2.add_worksheet('A')

# read the original Excel book, 5 rows by loop - counter is further below
while c <= 5:

         data = [sheet.cell_value(c, col) for col in range(1,5)]
#print data


# write rows to the new excel book

         worksheet.write_row(row_header_list[i], data)
#   print "those are initial values",i,c,row_header_list[i], data
         i+=1
         c+=1
#  print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data



####### works !!! xlrd - copy some columns, skipping the first 23 rows and writing data to the new file


columnB_data = [sheet.cell_value(row, 2) for row in range(23, 72)]
print  columnB_data

##### writing new file with xlsxwriter - works, without (!!!) converting data to tuple
worksheet.write_column('A5', columnB_data)

columnG_data = [sheet.cell_value(row, 6) for row in range(23, 72)]
#worksheet = workbook.add_worksheet('B')
print columnG_data
worksheet.write_column('B5', columnG_data)

worksheet = workbook.add_worksheet('C')
columnC_dta = [sheet.cell_value(row, 7) for row in range(23, 72)]
print columnC_dta
worksheet.write_column('A5', columnC_dta)

#close workbook2
workbook2.close()

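After running this, I get the following error:

Traceback (most recent call last):
  File "C:/Users/Michael/PycharmProjects/untitled/cleaner.py", line 28, in <module>
    worksheet.write_row(row_header_list[i], data)
IndexError: list index out of range
Exception Exception: Exception('Exception caught in workbook destructor. Explicit close() may be required for workbook.',) > ignored

"Line 28" refers to: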

worksheet.write_row(row_header_list[i], data)

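Running the whole segment from the start until the end of the loop seems fine and gives the correct output, so the problem is further below. If I use the explicit close method, as advised, I won't be able to use the add_worksheet method again, since it would run over my current sheet. The documentation provides "sheet.activate" and "sheet.select" methods, but they seem to be there for cosmetic reasons. I tried putting the xlsxwriter work in a different variable (although if I put all the "copying" processing at the top, "workbook" does not get overwritten), but it didn't help.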
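A note on the IndexError itself: it is consistent with the loop bound. `while c <= 5` runs six times (c = 0 through 5), while `row_header_list` has only five entries, so the sixth pass asks for `row_header_list[5]`. Below is a minimal sketch of one way to avoid a hand-maintained list. The helper name `cell_ref` is made up for this example and is not part of xlrd or xlsxwriter.

```python
def cell_ref(row_index, column_letter="A"):
    """Build an A1-style reference from a zero-based row index (hypothetical helper)."""
    return "%s%d" % (column_letter, row_index + 1)


row_header_list = ['A1', 'A2', 'A3', 'A4', 'A5']

# Count the passes made by `while c <= 5` and compare with the list length.
passes = len(range(0, 5 + 1))
print("loop passes: %d, list entries: %d" % (passes, len(row_header_list)))

# Tying the range to the number of rows means the index cannot overrun the list.
refs = [cell_ref(row_index) for row_index in range(len(row_header_list))]
print(refs)
```

xlsxwriter's `write_row` also accepts zero-based `(row, col)` numbers in place of an 'A1' string, so something like `worksheet.write_row(c, 0, data)` would probably remove the need for the list altogether.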

1 answer:

Answer 0 (score: 1)

You create a new output file with the same name in every loop iteration:

while c <= 2:
     #...
     workbook = xlsxwriter.Workbook('tty_header.xlsx')
     worksheet = workbook.add_worksheet()

So you overwrite the file on every iteration, and only the last row is saved.
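The overwrite effect does not depend on Excel at all. As a rough analogy (a plain text file stands in for the workbook, and the name `demo.txt` and the row values are made up for this sketch), opening the same path for writing inside a loop leaves only the last iteration's data:

```python
import os
import tempfile

rows = ["row 1", "row 2", "row 3"]  # hypothetical stand-in for the header rows

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Mistake: the file is re-created on every pass, so earlier rows are lost.
for row in rows:
    with open(path, "w") as handle:
        handle.write(row + "\n")

with open(path) as handle:
    overwritten = handle.read().splitlines()

# Fix: create the file once, write every row, then close it once.
with open(path, "w") as handle:
    for row in rows:
        handle.write(row + "\n")

with open(path) as handle:
    kept = handle.read().splitlines()

print(overwritten)  # ['row 3']
print(kept)         # ['row 1', 'row 2', 'row 3']
```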

Move it out of the loop:

workbook = xlsxwriter.Workbook('tty_header.xlsx')
worksheet = workbook.add_worksheet()
while c <= 2:
     #...

workbook.close()
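For reference, here is a fuller sketch of the same idea in Python 3. It rests on some assumptions: the values in `header_rows` and `column_data` are invented stand-ins for what `sheet.cell_value()` would return from xlrd, and the output goes to a temporary directory. Since xlsxwriter only creates new files and saves them on `close()`, the header rows, the data columns, and any additional worksheets all have to be written to the same `Workbook` object before that single `close()` call.

```python
import os
import tempfile
import zipfile

import xlsxwriter

# Hypothetical stand-ins for values that would come from xlrd's sheet.cell_value().
header_rows = [["Title", "demo"], ["Created", "2016-03-12"], ["Source", "hello.xlsx"]]
column_data = [1.5, 2.5, 3.5]

out_path = os.path.join(tempfile.mkdtemp(), "tty_header.xlsx")

# Create the output workbook once, before any loop.
workbook = xlsxwriter.Workbook(out_path)
first_sheet = workbook.add_worksheet("A")

# write_row accepts zero-based (row, col) numbers as well as 'A1' strings.
for row_index, row_values in enumerate(header_rows):
    first_sheet.write_row(row_index, 0, row_values)

# Start the column below the header so it does not overwrite a header cell.
first_sheet.write_column(len(header_rows), 0, column_data)

# More sheets can be added to the same workbook before it is closed.
second_sheet = workbook.add_worksheet("C")
second_sheet.write_column("A1", column_data)

# Close once, after all writes; this is when the file is saved.
workbook.close()

# An .xlsx file is a zip archive; peek inside to confirm what was written.
with zipfile.ZipFile(out_path) as archive:
    member_names = archive.namelist()
    sheet_xml = archive.read("xl/worksheets/sheet1.xml").decode("utf-8")
```

With three header rows, the column data lands in A4 through A6, so the header cells in A1 through A3 are left untouched.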