python: pandas: merging multiple tables according to an index table

Time: 2017-07-05 20:14:51

Tags: python pandas merge

For example, I have three tables A, B, and C.

Table A:

id1   value1
1     23
2     34
3     2342
4     333

Table B:

id2   value2
1     apple
2     banana
3     berry

Table C:

id3   value3   value4
1     red      batman
2     green    superman
3     white    wonder woman
4     gray     aquaman
5     yellow   flash

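I want to merge these three tables according to an index table D, where each row lists the ids to pick from A, B, and C.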

Table D:

Table_A    Table_B    Table_C
1          3          2
3                     4
2          2          3
4          1          1
                      5

My result table should be:

id1   value1    id2   value2    id3   value3  value4
1     23        3     berry     2     green    superman
3     2342                      4     gray     aquaman
2     34        2     banana    3     white    wonder woman
4     333       1     apple     1     red      batman
                                5     yellow   flash

Can I do this with Python pandas, or do I need to do it in Spark?

1 answer:

Answer 0 (score: 0)

Try this:

table_d['value1'] = table_d['Table_A'].map(table_a.set_index('id1')['value1'])

table_d['value2'] = table_d['Table_B'].map(table_b.set_index('id2')['value2'])

table_d.merge(table_c, left_on='Table_C', right_on='id3')

Output:

   Table_A  Table_B  Table_C  value1  value2  id3  value3        value4
0      1.0      3.0        2    23.0   berry    2   green      superman
1      3.0      NaN        4  2342.0     NaN    4    gray       aquaman
2      2.0      2.0        3    34.0  banana    3   white  wonder woman
3      4.0      1.0        1   333.0   apple    1     red        batman
4      NaN      NaN        5     NaN     NaN    5  yellow         flash
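
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It assumes the sample tables from the question are rebuilt as DataFrames named table_a, table_b, table_c and table_d, with the blank cells of table D stored as NaN. It is a variation on the answer above rather than the same code: it chains three merges with how='left' (my choice, so that every row of table D is kept even when an id is missing or has no match) and then drops the Table_* lookup columns to get the column layout asked for in the question. The final cast to the nullable Int64 dtype is optional and only there so the ids are not shown as floats.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
table_a = pd.DataFrame({"id1": [1, 2, 3, 4], "value1": [23, 34, 2342, 333]})
table_b = pd.DataFrame({"id2": [1, 2, 3], "value2": ["apple", "banana", "berry"]})
table_c = pd.DataFrame({
    "id3": [1, 2, 3, 4, 5],
    "value3": ["red", "green", "white", "gray", "yellow"],
    "value4": ["batman", "superman", "wonder woman", "aquaman", "flash"],
})

# Index table D; blank cells are assumed to be NaN.
table_d = pd.DataFrame({
    "Table_A": [1, 3, 2, 4, np.nan],
    "Table_B": [3, np.nan, 2, 1, np.nan],
    "Table_C": [2, 4, 3, 1, 5],
})

# Left-merge each table onto D so the row order of D is preserved
# and rows with a missing id simply get NaN in that table's columns.
result = (
    table_d
    .merge(table_a, how="left", left_on="Table_A", right_on="id1")
    .merge(table_b, how="left", left_on="Table_B", right_on="id2")
    .merge(table_c, how="left", left_on="Table_C", right_on="id3")
    .drop(columns=["Table_A", "Table_B", "Table_C"])
)

# Optional: NaN forces these columns to float; the nullable Int64
# dtype keeps them as integers with <NA> for the missing entries.
int_cols = ["id1", "value1", "id2"]
result[int_cols] = result[int_cols].astype("Int64")

print(result)
```

If several ids in table D could match more than one row in A, B or C, the merges would duplicate rows, so this sketch relies on id1, id2 and id3 being unique, as they are in the sample data.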