I have an Excel sheet in this format:
         DATE       AREA  BEAU  EFFORT  SEASON         VESSEL  P/S
0  2016-04-01  SE LANTAU     1   16.24  SPRING  STANDARD31516    P
1  2016-04-01  SE LANTAU     2   10.23  SPRING  STANDARD31516    P
2  2016-04-01  SE LANTAU     1    4.82  SPRING  STANDARD31516    S
3  2016-04-01  SE LANTAU     2    2.98  SPRING  STANDARD31516    S
4  2016-04-01      LAMMA     1    2.92  SPRING  STANDARD31516    P
I have another Excel sheet in this format:
         DATE  STG #  TIME  HRD SZ       AREA  BEAU  PSD  EFFORT   TYPE  NORTHING  EASTING  SEASON  BOATASSOC.  P/S
0  2016-04-06      1  1025      12   W LANTAU     2   58      ON  HKCRP    813713   802792  SPRING        NONE    S
1  2016-04-06      2  1113       3   W LANTAU     4   27      ON  HKCRP    806173   802043  SPRING        NONE    S
2  2016-04-06      3  1345       2  SW LANTAU     2   ND     OFF  HKCRP    805606   803300  SPRING        NONE  NaN
3  2016-04-14      1  1613       4   W LANTAU     2   74      ON  HKCRP    808800   800864  SPRING        NONE    S
4  2016-04-20      1  1339       4   W LANTAU     3   ND     OFF  HKCRP    805930   801929  SPRING        NONE  NaN
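Where DATE, AREA, BEAU and P/S match between the two sheets, I want to add the number from the first sheet's EFFORT column to the second sheet.
Should I join, merge, or map these two tables?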
Answer 0: (score: 1)
Both merge and join will work here.
First, group by col1, col2 and the last column:
# Assumes each (col1, col2, col_last) combination is unique in df_2;
# if it is not, first() keeps only the first row of each group.
temp = df_2.groupby(['col1', 'col2', 'col_last']).first()
# df is the DataFrame that should receive the extra column(s).
# how='left' keeps every row of df; rows without a match get NaN.
df = df.merge(temp, left_on=['col1', 'col2', 'col_last'], right_index=True, how='left')
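Applied to the column names in the question, the same idea might look like the sketch below. The frame names `effort` and `sightings`, the numeric effort values, and the new column name `EFFORT_SURVEY` are all made up for illustration. The rename is there because the second sheet already has its own EFFORT column (ON/OFF), which would otherwise collide with the one being added. This assumes the key columns have the same dtype in both frames, for example DATE parsed as dates in both.

```python
import pandas as pd

# Small stand-ins for the two sheets (values are invented for this sketch).
effort = pd.DataFrame({
    "DATE": ["2016-04-06", "2016-04-06", "2016-04-14"],
    "AREA": ["W LANTAU", "W LANTAU", "W LANTAU"],
    "BEAU": [2, 4, 2],
    "EFFORT": [5.10, 3.20, 7.40],
    "P/S": ["S", "S", "P"],
})
sightings = pd.DataFrame({
    "DATE": ["2016-04-06", "2016-04-06", "2016-04-06", "2016-04-14"],
    "AREA": ["W LANTAU", "W LANTAU", "SW LANTAU", "W LANTAU"],
    "BEAU": [2, 4, 2, 2],
    "EFFORT": ["ON", "ON", "OFF", "ON"],
    "P/S": ["S", "S", None, "S"],
})

keys = ["DATE", "AREA", "BEAU", "P/S"]

# Keep only the key columns plus the value to bring over, rename it so it
# does not clash with the ON/OFF EFFORT column, and drop repeated keys so
# the left merge cannot multiply rows in `sightings`.
lookup = (
    effort[keys + ["EFFORT"]]
    .rename(columns={"EFFORT": "EFFORT_SURVEY"})  # hypothetical column name
    .drop_duplicates(subset=keys)
)

# Left merge: every sighting row is kept; unmatched rows get NaN.
merged = sightings.merge(lookup, on=keys, how="left")
print(merged)
```

In this sketch the last two sighting rows end up with NaN in `EFFORT_SURVEY`: one has no P/S value, and the other has no effort row with the same P/S side.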