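I have two data frames of different sizes.
I need to update msg_count in df1 from df2, but only for rows where the [UserID, Month] column values match between df2 and df1.
My data looks like this: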
df1:
UserID Month A B C D E F msg_count
knaas 1/1/2017 0 0 0 0 0 0 0
knaas 2/1/2017 0 0 0 0 0 0 0
knaas 3/1/2017 0 0 0 0 0 0 0
knaas 4/1/2017 0 0 0 2 0 0 0
knaas 5/1/2017 0 0 0 0 0 0 0
knaas 6/1/2017 0 0 0 0 0 0 0
knaas 7/1/2017 0 0 0 0 0 0 0
knaas 8/1/2017 0 0 0 0 0 0 0
knaas 9/1/2017 0 0 0 0 0 0 0
knaas 10/1/2017 0 0 0 0 0 0 0
knaas 11/1/2017 0 0 0 0 0 0 0
knaas 12/1/2017 0 0 0 0 0 0 0
ArtCort0324 1/1/2017 0 0 0 0 0 0 0
ArtCort0324 2/1/2017 0 2 0 2 0 0 0
ArtCort0324 3/1/2017 0 0 0 0 0 0 0
ArtCort0324 4/1/2017 0 1 1 0 0 0 0
ArtCort0324 5/1/2017 0 0 0 3 0 0 0
ArtCort0324 6/1/2017 0 0 0 0 0 0 9
df2:
UserID Month msg_count
ArtCort0324 1/1/2017 0
ArtCort0324 2/1/2017 0
ArtCort0324 3/1/2017 0
ArtCort0324 4/1/2017 0
ArtCort0324 5/1/2017 0
ArtCort0324 6/1/2017 9
ArtCort0324 7/1/2017 0
ArtCort0324 8/1/2017 0
ArtCort0324 9/1/2017 0
ArtCort0324 10/1/2017 0
ArtCort0324 11/1/2017 0
ArtCort0324 12/1/2017 0
I tried the following two snippets, but neither works as expected:
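import numpy as np
import pandas as pd

# Attempt 1: combine_first on a shared (UserID, Month) index
res = df2.set_index(['UserID', 'Month'])\
    .combine_first(df1.set_index(['UserID', 'Month']))\
    .reset_index()

# Attempt 2: left merge, then prefer the merged-in msg_count where present
updated_new = df1.merge(df2, how='left', on=['UserID', 'Month'],
                        suffixes=('', '_new'))
updated_new['msg_count'] = np.where(pd.notnull(updated_new['msg_count_new']),
                                    updated_new['msg_count_new'],
                                    updated_new['msg_count'])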
I need output like the following:
UserID Month A B C D E F msg_count
knaas 1/1/2017 0 0 0 0 0 0 0
knaas 2/1/2017 0 0 0 0 0 0 0
knaas 3/1/2017 0 0 0 0 0 0 0
knaas 4/1/2017 0 0 0 2 0 0 0
knaas 5/1/2017 0 0 0 0 0 0 0
knaas 6/1/2017 0 0 0 0 0 0 0
knaas 7/1/2017 0 0 0 0 0 0 0
knaas 8/1/2017 0 0 0 0 0 0 0
knaas 9/1/2017 0 0 0 0 0 0 0
knaas 10/1/2017 0 0 0 0 0 0 0
knaas 11/1/2017 0 0 0 0 0 0 0
knaas 12/1/2017 0 0 0 0 0 0 0
ArtCort0324 1/1/2017 0 0 0 0 0 0 0
ArtCort0324 2/1/2017 1 0 0 0 0 0 0
ArtCort0324 3/1/2017 0 0 0 0 0 0 50
ArtCort0324 4/1/2017 0 0 0 0 0 0 0
I have added a default msg_count column to df1, with a default value of 0.
I need to update msg_count in df1 with the value from df2 only when UserID and Month are equal in both data frames.
Answer 0 (score: 0)
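It sounds like you want merge:
df_merge = pd.merge(left=df1, right=df2, on=['UserID', 'Month'], how='left')
You may want to set how to 'inner', 'outer', and so on, depending on which rows you need to keep. Because both frames have a msg_count column, the merged result will contain msg_count_x and msg_count_y unless you pass suffixes.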
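To show how the merge can be turned into an in-place style update of msg_count, here is a minimal sketch. It uses a small subset of the shape of the data in the question; the sample values, the helper column name msg_count_new, and the variable names keys, merged and result are illustrative assumptions, not part of the original post. It also assumes Month has the same dtype in both frames and that each (UserID, Month) pair appears at most once in df2.

```python
import pandas as pd

# Illustrative frames only: same columns as in the question, fewer rows.
df1 = pd.DataFrame({
    "UserID": ["knaas", "knaas", "ArtCort0324", "ArtCort0324"],
    "Month": ["5/1/2017", "6/1/2017", "5/1/2017", "6/1/2017"],
    "D": [0, 0, 3, 0],
    "msg_count": [0, 0, 0, 0],   # default column, all zeros
})
df2 = pd.DataFrame({
    "UserID": ["ArtCort0324", "ArtCort0324", "ArtCort0324"],
    "Month": ["5/1/2017", "6/1/2017", "7/1/2017"],
    "msg_count": [0, 9, 0],
})

keys = ["UserID", "Month"]

# Left merge keeps every row of df1; df2's msg_count arrives as msg_count_new.
# validate="many_to_one" raises if df2 has duplicate (UserID, Month) pairs.
merged = df1.merge(df2, on=keys, how="left",
                   suffixes=("", "_new"), validate="many_to_one")

# Where df2 had a matching row, take its value; otherwise keep df1's value.
# The merged-in column is float because unmatched rows are NaN, so cast back.
merged["msg_count"] = (
    merged["msg_count_new"].fillna(merged["msg_count"]).astype(int)
)

result = merged.drop(columns="msg_count_new")
print(result)
```

Rows of df1 with no counterpart in df2 (the knaas rows here) keep their original msg_count, and rows that exist only in df2 (7/1/2017) are not added, which is what how='left' gives you. If the keys look identical but nothing matches, it is worth checking that Month is stored the same way in both frames, for example both as strings or both as datetimes.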