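How do I merge two DataFrames when one column has to match against two columns?
The data is messy: in some rows the campaignid column holds the id name instead of the id.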
I know how to merge one column against one column, or two columns against two columns, but not one column against two columns.
Reference table
g_spend =
campaignid id_name cost
154 campaign1 15
155 campaign2 12
1566 campaign33 12
158 campaign4 33
Data
cw =
campaignid
154
154
155
campaign1
campaign33
1566
158
campaign1
campaign1
campaign33
campaign4
Desired output
g_spend =
campaignid id_name cost leads
154 campaign1 15 5
155 campaign2 12 0
1566 campaign33 12 3
158 campaign4 33 2
What I tried:
import pandas as pd
# This only works when matching on a single column
cw.head()
grouped_cw = cw.groupby(["campaignid"]).count()
grouped_cw.rename(columns={'reach':'leads'}, inplace=True)
grouped_cw = pd.DataFrame(grouped_cw)
# now merging
g_spend.campaignid = g_spend.campaignid.astype(str)
g_spend = g_spend.merge(grouped_cw, left_on='campaignid', right_index=True)
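To illustrate why the attempt above falls short, here is a minimal sketch. The `cw` frame is rebuilt by hand from the sample data, and storing every value as a string is an assumption. Counting the raw column keeps ids and names as separate keys, so a merge on `campaignid` alone only picks up the rows that already hold an id.

```python
import pandas as pd

# Sample data retyped from the question; string dtype is an assumption.
cw = pd.DataFrame({
    "campaignid": [
        "154", "154", "155", "campaign1", "campaign33", "1566",
        "158", "campaign1", "campaign1", "campaign33", "campaign4",
    ]
})

# Ids and names are counted under different keys.
raw_counts = cw["campaignid"].value_counts()
print(raw_counts)
```

Here `"154"` and `"campaign1"` refer to the same campaign but are counted separately, so the names need to be mapped to ids before counting.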
Answer 0 (score: 1)
I would first set id_name as the index of g_spend, then use that to replace the names in cw with their ids, and then take value_counts:
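# Map id names to ids, then count how often each id occurs
s = (cw.campaignid
       .replace(g_spend.set_index('id_name').campaignid)
       .value_counts()
       .to_frame('leads')
    )
# Attach the counts to the reference table
g_spend = g_spend.merge(s, left_on='campaignid', right_index=True)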
Output:
campaignid id_name cost leads
0 154 campaign1 15 5
1 155 campaign2 12 1
2 1566 campaign33 12 3
3 158 campaign4 33 2
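For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The two frames are retyped from the question, and keeping `campaignid` as strings in both frames is an assumption (the ids and names have to share one dtype for the counts to line up). Two details differ from the snippet above and are my own choices, not part of the original answer: the mapping is passed to `replace` as a plain dict, and the merge uses `how='left'` with `fillna(0)` so that a campaign with no rows in `cw` would show 0 leads instead of being dropped.

```python
import pandas as pd

# Reference table from the question; ids kept as strings (assumption).
g_spend = pd.DataFrame({
    "campaignid": ["154", "155", "1566", "158"],
    "id_name": ["campaign1", "campaign2", "campaign33", "campaign4"],
    "cost": [15, 12, 12, 33],
})

# Messy data: the column holds either an id or an id name.
cw = pd.DataFrame({
    "campaignid": [
        "154", "154", "155", "campaign1", "campaign33", "1566",
        "158", "campaign1", "campaign1", "campaign33", "campaign4",
    ]
})

# Lookup from id name to id, e.g. "campaign1" -> "154".
name_to_id = g_spend.set_index("id_name")["campaignid"].to_dict()

# Normalise names to ids, then count occurrences per id.
leads = (
    cw["campaignid"]
    .replace(name_to_id)
    .value_counts()
    .to_frame("leads")
)

# Left merge keeps every campaign; missing counts become 0.
result = g_spend.merge(leads, left_on="campaignid", right_index=True, how="left")
result["leads"] = result["leads"].fillna(0).astype(int)
print(result)
```

With the sample data, the leads come out as 5, 1, 3 and 2. Campaign 155 gets 1 rather than the 0 shown in the desired output, because `cw` contains 155 once.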