What is the simplest, quickest way to compare two spreadsheet files and extract the matching data?

Asked: 2017-08-30 22:35:20

Tags: python linux excel spreadsheet

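I have two spreadsheets, Spreadsheet 1 and Spreadsheet 2. I need to extract the data (rows) from Spreadsheet 2 that match Spreadsheet 1. Ideally, for every site name that appears in both, I need to pull the ID from Spreadsheet 2 into Spreadsheet 1.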

Spreadsheet 1:

Site Name : 
10410_DL01_Patels_Foodmarket      
10700_DL01_CD_Toronta 
110190_DL13__Jonny_Mall 
110300_DL13_Ezy_Mart    
CONTINUED


Spreadsheet 2:

ID         Site Name                         Address     Upgrade
10747     10410_DL01_Patels_Foodmarket       *********   *********
32544     104658_D_Torano_fedf               ********    *********
84562     103894_Girngsdfj                   ********    ********   
10727     10700_DL01_CD_Toronta              ********    *********
42344     104658_D_Torano_fedf               ********    *********
65465     103894_Girngsdfj                   ********    ********   
32544     104658_D_Torano_fedf               ********    *********
84562     103894_Girngsdfj                   ********    ********   
10838     110190_DL13__Jonny_Mall            ********    *********
10487     110300_DL13_Ezy_Mart               ********    *********
CONTINUED

1 answer:

Answer 0: (score: 0)

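This might work, using the xlrd package. Note that xlrd 2.0 and later read only the legacy .xls format, so opening the .xlsx files below needs an older xlrd release (or a different reader such as openpyxl):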

import xlrd

# open the two spreadsheets
b1 = xlrd.open_workbook("Spreadsheet 1.xlsx")
b2 = xlrd.open_workbook("Spreadsheet 2.xlsx")

# get the first sheet from each spreadsheet
sh1 = b1.sheet_by_index(0) 
sh2 = b2.sheet_by_index(0)

# for each sheet, read each row 
for rx1 in range(sh1.nrows):
    for rx2 in range(sh2.nrows):
        # find cell values in spreadsheet 2's column 2 that 
        #   match the cell value in spreadsheet 1's column 1   
        if sh1.row(rx1)[0].value == sh2.row(rx2)[1].value:
            # print the `id` from spreadsheet 2 for this matching row
            # (if the ID cells are stored as numbers, xlrd returns floats, e.g. 10747.0)
            print(sh2.row(rx2)[0].value)

Output:

10747
10727
10838
10487

It turned out surprisingly short: 9 lines of code. Hope this helps.
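
If the sheets are large, the nested loop above compares every row of Spreadsheet 1 with every row of Spreadsheet 2. Here is a minimal sketch of an alternative, assuming the rows have already been read into plain Python lists. The function name `match_site_ids` and the sample lists are made up for illustration, using values from the question. It builds a dictionary from site name to ID once, then looks each name up. Unlike the loop above, it keeps only the first ID when a site name repeats in Spreadsheet 2.

```python
def match_site_ids(site_names, site_rows):
    """Return (site name, ID) pairs for names found in both tables.

    site_names: iterable of site names from spreadsheet 1 (hypothetical input).
    site_rows:  iterable of (id, site_name) pairs from spreadsheet 2.
    """
    # Build the lookup once; if a site name repeats, the first ID seen is kept.
    ids_by_name = {}
    for site_id, name in site_rows:
        ids_by_name.setdefault(name.strip(), site_id)

    # One dictionary lookup per name instead of a scan over every row.
    matches = []
    for name in site_names:
        key = name.strip()  # tolerate stray padding around cell values
        if key in ids_by_name:
            matches.append((key, ids_by_name[key]))
    return matches


# Sample data copied from the question (only the ID and Site Name columns).
sheet1_names = [
    "10410_DL01_Patels_Foodmarket      ",
    "10700_DL01_CD_Toronta ",
    "110190_DL13__Jonny_Mall ",
    "110300_DL13_Ezy_Mart    ",
]
sheet2_rows = [
    (10747, "10410_DL01_Patels_Foodmarket"),
    (32544, "104658_D_Torano_fedf"),
    (84562, "103894_Girngsdfj"),
    (10727, "10700_DL01_CD_Toronta"),
    (10838, "110190_DL13__Jonny_Mall"),
    (10487, "110300_DL13_Ezy_Mart"),
]

matches = match_site_ids(sheet1_names, sheet2_rows)
for name, site_id in matches:
    print(site_id, name)
```

With real files, `sheet1_names` and `sheet2_rows` would be filled from the sheet cells (for example from `sh1` and `sh2` in the answer's code) instead of being typed in by hand.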