Excel INDEX and MATCH in pandas

Time: 2017-12-12 08:44:58

Tags: python pandas

I have an Excel sheet in this format:

    DATE       AREA        BEAU EFFORT  SEASON  VESSEL         P/S
0   2016-04-01  SE LANTAU   1   16.24   SPRING  STANDARD31516   P
1   2016-04-01  SE LANTAU   2   10.23   SPRING  STANDARD31516   P
2   2016-04-01  SE LANTAU   1   4.82    SPRING  STANDARD31516   S
3   2016-04-01  SE LANTAU   2   2.98    SPRING  STANDARD31516   S
4   2016-04-01     LAMMA    1   2.92    SPRING  STANDARD31516   P

I have another Excel sheet in this format:

    DATE      STG # TIME    HRD SZ  AREA  BEAU PSD EFFORT TYPE NORTHING EASTING SEASON BOAT ASSOC. P/S
0   2016-04-06  1   1025    12  W LANTAU    2   58  ON  HKCRP   813713  802792  SPRING  NONE    S
1   2016-04-06  2   1113    3   W LANTAU    4   27  ON  HKCRP   806173  802043  SPRING  NONE    S
2   2016-04-06  3   1345    2   SW LANTAU   2   ND  OFF HKCRP   805606  803300  SPRING  NONE    NaN
3   2016-04-14  1   1613    4   W LANTAU    2   74  ON  HKCRP   808800  800864  SPRING  NONE    S
4   2016-04-20  1   1339    4   W LANTAU    3   ND  OFF HKCRP   805930  801929  SPRING  NONE    NaN

If DATE, AREA, BEAU and P/S match between the two tables, I want to add the number from the EFFORT column of the first table to table 2.

Should I join, merge, or map the two tables?

1 answer:

Answer 0: (score: 1)

You can use either merge or join.

First, group by col1, col2 and the last column, then merge the result back:

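# col1, col2 and col_last are placeholders for the key columns.
# Assumption: taken together, the key columns are not duplicated in df_2.
# df_2 is the table that holds the extra column; first() keeps one row per key combination.
temp = df_2.groupby(['col1', 'col2', 'col_last']).first()
# df is the dataframe in which you want the extra column
df = df.merge(temp, left_on=['col1', 'col2', 'col_last'], right_index=True, how='left')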
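Applied to the column names in the question, the same idea might look like the sketch below. The rows are made up for illustration (the sample rows in the question share no dates, so they would not match), the frame names `effort` and `sightings` are hypothetical, and `EFFORT_VALUE` is a hypothetical name chosen because table 2 already has its own EFFORT column holding ON/OFF.

```python
import pandas as pd

# Illustrative data only: column names come from the question, values are invented.
effort = pd.DataFrame({
    "DATE": ["2016-04-01", "2016-04-01", "2016-04-01", "2016-04-01"],
    "AREA": ["SE LANTAU", "SE LANTAU", "SE LANTAU", "SE LANTAU"],
    "BEAU": [1, 2, 1, 2],
    "EFFORT": [16.24, 10.23, 4.82, 2.98],
    "P/S": ["P", "P", "S", "S"],
})
sightings = pd.DataFrame({
    "DATE": ["2016-04-01", "2016-04-01", "2016-04-06"],
    "AREA": ["SE LANTAU", "SE LANTAU", "W LANTAU"],
    "BEAU": [2, 1, 2],
    "EFFORT": ["ON", "ON", "OFF"],
    "P/S": ["S", "P", None],
})

# The four columns that must agree between the two tables.
keys = ["DATE", "AREA", "BEAU", "P/S"]

# One EFFORT value per key combination, renamed so it does not
# collide with the ON/OFF EFFORT column already present in table 2.
lookup = (
    effort.groupby(keys)["EFFORT"].first()
    .rename("EFFORT_VALUE")  # hypothetical column name
    .reset_index()
)

# A left merge keeps every row of table 2; rows with no match get NaN.
result = sightings.merge(lookup, on=keys, how="left")
print(result)
```

With real data loaded from Excel, it is worth checking that the key columns have the same dtype in both frames (DATE in particular) before merging, since mismatched types will either raise an error or silently produce no matches.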