Match one table against another and map values to other values in pandas (Python)

Time: 2017-06-22 12:10:42

Tags: python pandas dictionary dataframe

I have two pandas DataFrames. DF1:

LT     route_1 c2
PM/2     120   44
PM/52    110   49
PM/522   103   51
PM/522   103   51
PM/24    105   48
PM/536   109   67
PM/536   109   67
PM/5356  112   144 

DF2:

LT       W_ID 
PM/2     120.0
PM/52    110.0
PM/522   103.0
PM/522   103.0
PM/24    105.0
PM/536   109.0
PM/536   109.0
PM/5356  112.0

I need to replace route_1 in df1 with W_ID from df2, with rows matched on LT in both tables. Desired output:

LT     route_1   c2
PM/2     120.0   44
PM/52    110.0   49
PM/522   103.0   51
PM/522   103.0   51
PM/24    105.0   48
PM/536   109.0   67
PM/536   109.0   67
PM/5356  112.0   144 

1 answer:

Answer 0 (score: 1)

I thought map should work:

df1['route_1'] = df1['LT'].map(df2.set_index('LT')['W_ID'])

Unfortunately it does not:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

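Edit:

The problem is the duplicates in the LT column. One solution is to add a helper column with cumcount so that each (LT, g) pair is unique, then merge on both columns with a left join: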

df1['g'] = df1.groupby('LT').cumcount()
df2['g'] = df2.groupby('LT').cumcount()
df = pd.merge(df1, df2, on=['LT','g'], how='left')
print (df)
        LT  route_1   c2  g   W_ID
0     PM/2      120   44  0  120.0
1    PM/52      110   49  0  110.0
2   PM/522      103   51  0  103.0
3   PM/522      103   51  1  103.0
4    PM/24      105   48  0  105.0
5   PM/536      109   67  0  109.0
6   PM/536      109   67  1  109.0
7  PM/5356      112  144  0  112.0

df1['route_1'] = df['W_ID']
df1.drop('g', axis=1, inplace=True)
print (df1)
        LT  route_1   c2
0     PM/2    120.0   44
1    PM/52    110.0   49
2   PM/522    103.0   51
3   PM/522    103.0   51
4    PM/24    105.0   48
5   PM/536    109.0   67
6   PM/536    109.0   67
7  PM/5356    112.0  144
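
To show why the helper column makes the join keys unique, here is a minimal sketch that rebuilds df1 from the sample data in the question and prints what cumcount returns. The DataFrame construction is an assumption added for illustration; it is not part of the original post.

```python
import pandas as pd

# Sample data copied from the question (construction code is illustrative)
df1 = pd.DataFrame({
    "LT": ["PM/2", "PM/52", "PM/522", "PM/522",
           "PM/24", "PM/536", "PM/536", "PM/5356"],
    "route_1": [120, 110, 103, 103, 105, 109, 109, 112],
    "c2": [44, 49, 51, 51, 48, 67, 67, 144],
})

# cumcount numbers the rows inside each LT group, starting at 0
g = df1.groupby("LT").cumcount()
print(g.tolist())  # [0, 0, 0, 1, 0, 0, 1, 0]

# LT alone has duplicates, but the (LT, g) pairs do not
has_dup_lt = df1["LT"].duplicated().any()
has_dup_pair = df1.assign(g=g).duplicated(["LT", "g"]).any()
print(has_dup_lt, has_dup_pair)
```

The second occurrence of PM/522 and PM/536 gets g = 1, so merging on ['LT', 'g'] pairs each row of df1 with exactly one row of df2, provided both frames list the duplicates in the same order.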

A similar solution:

df1['g'] = df1.groupby('LT').cumcount()
df2['g'] = df2.groupby('LT').cumcount()
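# Wrap the chain in parentheses so it can span several lines
df = (pd.merge(df1, df2, on=['LT','g'], how='left')
        .drop(['g', 'route_1'], axis=1)
        .rename(columns={'W_ID':'route_1'})
        # reindex(columns=...) replaces reindex_axis, which was removed from recent pandas
        .reindex(columns=['LT', 'route_1', 'c2']))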
print (df)
        LT  route_1   c2
0     PM/2    120.0   44
1    PM/52    110.0   49
2   PM/522    103.0   51
3   PM/522    103.0   51
4    PM/24    105.0   48
5   PM/536    109.0   67
6   PM/536    109.0   67
7  PM/5356    112.0  144
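
If the repeated LT rows in df2 always carry the same W_ID, as they do in the sample data, a different and simpler route may also work: drop the duplicate keys from df2 first, so that map gets a lookup Series with a unique index. This is a sketch under that assumption, not the approach used above; the DataFrame construction is added here only for illustration.

```python
import pandas as pd

# Sample data copied from the question (construction code is illustrative)
keys = ["PM/2", "PM/52", "PM/522", "PM/522",
        "PM/24", "PM/536", "PM/536", "PM/5356"]
df1 = pd.DataFrame({
    "LT": keys,
    "route_1": [120, 110, 103, 103, 105, 109, 109, 112],
    "c2": [44, 49, 51, 51, 48, 67, 67, 144],
})
df2 = pd.DataFrame({
    "LT": keys,
    "W_ID": [120.0, 110.0, 103.0, 103.0, 105.0, 109.0, 109.0, 112.0],
})

# Assumption: duplicated LT values in df2 map to identical W_ID values,
# so keeping only the first row per LT loses no information
lookup = df2.drop_duplicates("LT").set_index("LT")["W_ID"]

# The lookup index is now unique, so map no longer raises InvalidIndexError
df1["route_1"] = df1["LT"].map(lookup)
print(df1)
```

If the same LT can carry different W_ID values in df2, this shortcut would silently keep only the first one, and the cumcount plus merge approach is the safer choice.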