pyexcel - get column names and join whole rows together

Date: 2016-07-21 04:00:36

Tags: python python-3.x pyexcel

I have several spreadsheets:

Sheet 1     Sheet 2     Sheet 3

A B C       D E F       D F G

1 2 3       4 5 6       7 9 8

I am using pyexcel to join spreadsheets 1 and 2, and also 1 and 3, so the combined row for 1 and 2 would be:

A B C D E F D F G
1 2 3 4 5 6

and for 1 and 3:

A B C D E F D F G
1 2 3       7 9 8

How can this be done in pyexcel?

Right now I have two for loops, containing this:

    if t_row['name'] != "":
        update_sheet[count, 'name'] = t_row['name']

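But sheet 2 does not have columns F and G, and sheet 3 does not have E and F. How can I list the columns in a sheet, or simply take the whole row, join it with the other row, and store the result?

1 Answer:

Answer 0: (score: 1)

It is not clear:

  1. How you read the sheets.
  2. How you want the join to behave when both sheets have a value. I assume you want to sum them.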

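    import numpy as np
    import pyexcel as pe
    
    # Read each workbook as a 2-D array (rows x columns).
    a = np.array(pe.get_array(file_name='Sheet1.xlsx'))
    b = np.array(pe.get_array(file_name='Sheet2.xlsx'))
    c = np.array(pe.get_array(file_name='Sheet3.xlsx'))
    
    arrays = [a, b, c]
    max_cols = max(arr.shape[1] for arr in arrays)
    
    for i, arr in enumerate(arrays):
        # Sheets with empty cells are read as strings: treat '' as 0, then cast.
        if not np.issubdtype(arr.dtype, np.integer):
            arr[arr == ''] = 0
            arr = arr.astype('int')
        # Right-pad narrower sheets with zero columns so all widths match.
        if arr.shape[1] != max_cols:
            padding = np.zeros((arr.shape[0], max_cols - arr.shape[1]), dtype=int)
            arr = np.hstack([arr, padding])
        arrays[i] = arr
    
    # Column-wise sum over the rows of all three sheets.
    column_totals = np.sum(np.vstack(arrays), axis=0)
    print(column_totals)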
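
The padding and summing steps above can also be sketched without numpy, which may make the logic easier to follow. This is only an illustration: `pad_rows` and `column_sums` are hypothetical helper names, the input rows are made-up data with the header rows already removed, and empty cells are assumed to count as 0, as in the code above.

```python
def pad_rows(rows, width, fill=0):
    """Return a copy of rows in which every row is extended to `width` cells."""
    return [list(row) + [fill] * (width - len(row)) for row in rows]


def column_sums(tables):
    """Sum each column over all rows of all tables; '' counts as 0."""
    width = max(len(row) for table in tables for row in table)
    totals = [0] * width
    for table in tables:
        for row in pad_rows(table, width):
            for col, cell in enumerate(row):
                totals[col] += 0 if cell == '' else int(cell)
    return totals


# Hypothetical data rows; sheet2 is wider and sheet3 has an empty cell.
sheet1 = [[1, 2, 3]]
sheet2 = [[4, 5, 6, 7]]
sheet3 = [[7, '', 8]]
print(column_sums([sheet1, sheet2, sheet3]))  # [12, 7, 17, 7]
```
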
    
Edit:

With this you will not need the for loops (except the one that iterates over the different sheets). This uses numpy in a pythonic way!

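    import numpy as np
    import pyexcel as pe
    
    def join_sheets(a, b):
        """Overlay sheet b onto sheet a in place; non-empty cells of b win."""
        both = [a, b]
        max_cols = max(s.number_of_columns() for s in both)
        min_rows = min(s.number_of_rows() for s in both)
        # dtype=object keeps each cell's original Python type (int, str, ...).
        both_arr = [np.array(s.array, dtype=object) for s in both]
        for i, arr in enumerate(both_arr):
            # Right-pad the narrower sheet with '' so both have max_cols columns.
            pad = np.full((arr.shape[0], max_cols - arr.shape[1]), '', dtype=object)
            both_arr[i] = np.hstack([arr, pad])
        # In the rows both sheets share, copy b's non-empty cells over a's.
        filled = both_arr[1][:min_rows] != ''
        both_arr[0][:min_rows][filled] = both_arr[1][:min_rows][filled]
        # Append any extra rows that only b has.
        if b.number_of_rows() > min_rows:
            both_arr[0] = np.vstack([both_arr[0], both_arr[1][min_rows:]])
        a.array = both_arr[0].tolist()
    
    sheets = pe.get_book(file_name='Sheet1.xlsx')
    # Merge every later sheet of the workbook into the first one.
    for i in range(1, sheets.number_of_sheets()):
        join_sheets(sheets[0], sheets[i])
    sheets.save_as(sheets.path + '/' + sheets.filename)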
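
The layout shown in the question places each sheet's columns in its own segment of one wide row, leaving the segments of the other sheets blank. A minimal sketch of that idea in plain Python follows. It assumes each sheet is a list of rows whose first row holds the column names (the list-of-lists shape that `pe.get_array` returns); `combined_header` and `join_pair` are hypothetical helpers, not part of pyexcel.

```python
def combined_header(sheets):
    """Concatenate the header rows of all sheets, in order."""
    return [name for sheet in sheets for name in sheet[0]]


def join_pair(sheets, left, right):
    """Join the data rows of sheets[left] and sheets[right] on the wide layout.

    Each sheet writes into its own column segment; segments that belong
    to the other sheets are left as ''.
    """
    # Starting column of each sheet's segment in the combined row.
    offsets, start = [], 0
    for sheet in sheets:
        offsets.append(start)
        start += len(sheet[0])
    width = start

    rows = [combined_header(sheets)]
    n_data_rows = max(len(sheets[left]), len(sheets[right])) - 1
    for r in range(1, n_data_rows + 1):
        out = [''] * width
        for idx in (left, right):
            sheet = sheets[idx]
            if r < len(sheet):
                # Copy at most as many cells as the sheet has header columns.
                for c, cell in enumerate(sheet[r][:len(sheet[0])]):
                    out[offsets[idx] + c] = cell
        rows.append(out)
    return rows


# The three sheets from the question, header row first.
sheet1 = [['A', 'B', 'C'], [1, 2, 3]]
sheet2 = [['D', 'E', 'F'], [4, 5, 6]]
sheet3 = [['D', 'F', 'G'], [7, 9, 8]]
sheets = [sheet1, sheet2, sheet3]

for row in join_pair(sheets, 0, 1):
    print(row)
for row in join_pair(sheets, 0, 2):
    print(row)
```

Keeping one fixed offset per sheet means duplicate column names (D and F appear in both sheet 2 and sheet 3) never collide, which is why the sketch joins by position inside the combined header rather than by name.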