Update a DataFrame from matches in another DataFrame if a condition is met

Time: 2017-08-14 14:30:15

Tags: python pandas dataframe

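I have two DataFrames, and I want to update the first one. The first, df, contains different markets (M1, M2, etc.) and some codes for each market (either a number or the placeholder "Dummy"):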

import pandas as pd
labels = ["Market","Code"]
values = [["M1","1234"],["M1","Dummy"],["M1","1234"],["M2","Dummy"],["M1","1234"]]
df = pd.DataFrame.from_records(values,columns=labels)
print(df)
  Market   Code
0     M1   1234
1     M1  Dummy
2     M1   1234
3     M2  Dummy
4     M1   1234

If Code == Dummy, I want to replace the code in df with the value that df2 holds for that particular market, so each market receives its own new code.

labels = ["Market","Code(New)"]
values = [["M1","4567"],["M2","5678"]]
df2 = pd.DataFrame.from_records(values,columns=labels)
print(df2)
  Market Code(New)
0     M1      4567
1     M2      5678

In the end I should get:

labels = ["Market","Code"]
values = [["M1","1234"],["M1","4567"],["M1","1234"],["M2","5678"],["M1","1234"]]
df_clean = pd.DataFrame.from_records(values,columns=labels)
print(df_clean)
  Market  Code
0     M1  1234
1     M1  4567
2     M1  1234
3     M2  5678
4     M1  1234

2 answers:

Answer 0: (score: 3)

Use .loc to select the rows where df.Code == 'Dummy', and assign them the values from a .merge on Market:

In [288]: df.loc[df.Code=='Dummy', 'Code'] = df.merge(df2, on='Market', how='left')['Code(New)']

In [289]: df
Out[289]:
  Market  Code
0     M1  1234
1     M1  4567
2     M1  1234
3     M2  5678
4     M1  1234
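
The assignment above relies on index alignment: a left merge returns the rows in df's order with a fresh 0..n-1 index, which happens to match df's default index. As a minimal sketch of a variant that does not depend on that, one could look the new code up per row with Series.map. The names new_code and is_dummy are introduced here for illustration only, and the sketch assumes each Market appears exactly once in df2.

```python
import pandas as pd

df = pd.DataFrame(
    [["M1", "1234"], ["M1", "Dummy"], ["M1", "1234"], ["M2", "Dummy"], ["M1", "1234"]],
    columns=["Market", "Code"],
)
df2 = pd.DataFrame([["M1", "4567"], ["M2", "5678"]], columns=["Market", "Code(New)"])

# Market -> new code lookup (assumption: Market values are unique in df2).
new_code = df2.set_index("Market")["Code(New)"]

# Boolean mask of the rows that still carry the placeholder.
is_dummy = df["Code"] == "Dummy"

# map() keeps df's own row labels, so the result aligns with df
# even if df does not have a default 0..n-1 index.
df.loc[is_dummy, "Code"] = df.loc[is_dummy, "Market"].map(new_code)
print(df)
```

A market that is missing from df2 would end up as NaN with this approach, so it may be worth checking for that before overwriting the column.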

Answer 1: (score: 2)

Based on your example:

pd.concat([df1[df1.Code!='Dummy'],df2],axis=0)

The input in the question was edited after I posted this answer; below is a solution for the updated input.

df2.columns=["Market","Code"]
df2.index=df[df.Code=='Dummy'].index
pd.concat([df[df.Code!='Dummy'],df2],axis=0).sort_index()


Out[372]: 
  Market  Code
0     M1  1234
1     M1  4567
2     M1  1234
3     M2  5678
4     M1  1234
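
Note that assigning df2.index directly only works when df2 has exactly one row per Dummy row, in the same order. The following is a sketch of the same concat idea that builds the replacement rows from the Dummy subset instead, so a market may have several Dummy rows. The sample data and the names dummy, fixed and result are made up for illustration, and Market is assumed to be unique in df2.

```python
import pandas as pd

# Hypothetical input in which M1 has two Dummy rows.
df = pd.DataFrame(
    [["M1", "1234"], ["M1", "Dummy"], ["M2", "Dummy"], ["M1", "Dummy"]],
    columns=["Market", "Code"],
)
df2 = pd.DataFrame([["M1", "4567"], ["M2", "5678"]], columns=["Market", "Code(New)"])

dummy = df[df.Code == "Dummy"]

# Look up the new code for every Dummy row. The merge resets the index,
# so the original row labels are restored afterwards.
fixed = dummy[["Market"]].merge(df2, on="Market", how="left")
fixed.index = dummy.index
fixed = fixed.rename(columns={"Code(New)": "Code"})

# Recombine the untouched rows with the fixed ones in the original order.
result = pd.concat([df[df.Code != "Dummy"], fixed], axis=0).sort_index()
print(result)
```

Unlike the snippet above, this leaves df2 unchanged, since the column rename happens on the merged copy.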