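I have two dataframes and I want to update the first one. The first one (df in the code below) contains different markets (M1, M2, etc.) and some codes for each market (either a number or Dummy).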
import pandas as pd
labels = ["Market","Code"]
values = [["M1","1234"],["M1","Dummy"],["M1","1234"],["M2","Dummy"],["M1","1234"]]
df = pd.DataFrame.from_records(values,columns=labels)
print(df)
Market Code
0 M1 1234
1 M1 Dummy
2 M1 1234
3 M2 Dummy
4 M1 1234
If Code == Dummy, I want to update the code in df with the value from df2 for that particular market. So each market should receive a different new code.
labels = ["Market","Code(New)"]
values = [["M1","4567"],["M2","5678"]]
df2 = pd.DataFrame.from_records(values,columns=labels)
print(df2)
Market Code(New)
0 M1 4567
1 M2 5678
In the end I should have:
labels = ["Market","Code"]
values = [["M1","1234"],["M1","4567"],["M1","1234"],["M2","5678"],["M1","1234"]]
df_clean = pd.DataFrame.from_records(values,columns=labels)
print(df_clean)
Market Code
0 M1 1234
1 M1 4567
2 M1 1234
3 M2 5678
4 M1 1234
Answer 0 (score: 3)
Use .merge on Market, and .loc to assign to the subset of rows where df.Code == 'Dummy':
In [288]: df.loc[df.Code=='Dummy', 'Code'] = df.merge(df2, on='Market', how='left')['Code(New)']
In [289]: df
Out[289]:
Market Code
0 M1 1234
1 M1 4567
2 M1 1234
3 M2 5678
4 M1 1234
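Note that the one-liner above relies on index alignment: the left merge returns rows in df's order with a fresh 0..n-1 index, which lines up with df only when df has a default index and df2 has one row per Market. A minimal sketch of an alternative that does not depend on the index, assuming each market appears exactly once in df2 (the names new_code and is_dummy are illustrative, not from the original answer):

```python
import pandas as pd

df = pd.DataFrame(
    [["M1", "1234"], ["M1", "Dummy"], ["M1", "1234"], ["M2", "Dummy"], ["M1", "1234"]],
    columns=["Market", "Code"],
)
df2 = pd.DataFrame([["M1", "4567"], ["M2", "5678"]], columns=["Market", "Code(New)"])

# Lookup table: Market -> replacement code (assumes unique markets in df2).
new_code = df2.set_index("Market")["Code(New)"]

# Replace only the Dummy rows; all other rows keep their existing code.
is_dummy = df["Code"] == "Dummy"
df.loc[is_dummy, "Code"] = df.loc[is_dummy, "Market"].map(new_code)
print(df)
```

A market that is missing from df2 would end up with NaN in Code, so it may be worth checking for that before overwriting.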
Answer 1 (score: 2)
Based on your example:
pd.concat([df1[df1.Code!='Dummy'],df2],axis=0)
The input was edited after I posted this answer; below is a solution for the updated input.
df2.columns=["Market","Code"]
df2.index=df[df.Code=='Dummy'].index
pd.concat([df[df.Code!='Dummy'],df2],axis=0).sort_index()
Out[372]:
Market Code
0 M1 1234
1 M1 4567
2 M1 1234
3 M2 5678
4 M1 1234
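The concat approach above assigns df2's index positionally, so it appears to assume exactly one Dummy row per market, listed in the same order as df2. A rough sketch of a variant that matches on Market instead, under the assumption that df has a default unnamed index (the sample data and the names dummy, filled, and result are made up for illustration):

```python
import pandas as pd

# Hypothetical input with two Dummy rows for M1 and a different row order.
df = pd.DataFrame(
    [["M1", "1234"], ["M2", "Dummy"], ["M1", "Dummy"], ["M1", "Dummy"]],
    columns=["Market", "Code"],
)
df2 = pd.DataFrame([["M1", "4567"], ["M2", "5678"]], columns=["Market", "Code(New)"])

dummy = df[df.Code == "Dummy"]

# Merge the Dummy rows with df2 on Market; reset_index/set_index carries
# df's original row labels through the merge.
filled = (
    dummy.reset_index()
    .merge(df2, on="Market", how="left")
    .set_index("index")
    .rename_axis(None)
)
filled = filled[["Market", "Code(New)"]].rename(columns={"Code(New)": "Code"})

# Recombine with the untouched rows and restore the original order.
result = pd.concat([df[df.Code != "Dummy"], filled], axis=0).sort_index()
print(result)
```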