I have two DataFrames that I would like to combine in the following way. DF1:
I A B C
0 0.719391 0.091693 one
1 0.951499 0.83716 one
2 0.975212 0.224855 one
3 0.80762 0.031284 three
4 0.63319 0.342889 one
5 0.075102 0.899291 two
6 0.502843 0.773424 two
7 0.032285 0.242476 one
8 0.794938 0.607745 one
DF2:
I Y C
0 1 one
1 2 two
2 3 three
The result should be df_comb:
I A B C Y
0 0.719391 0.091693 one 1
1 0.951499 0.83716 one 1
2 0.975212 0.224855 one 1
3 0.80762 0.031284 three 3
4 0.63319 0.342889 one 1
5 0.075102 0.899291 two 2
6 0.502843 0.773424 two 2
7 0.032285 0.242476 one 1
8 0.794938 0.607745 one 1
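In other words, each row of df_comb should get, in its column Y, the Y value from the row of df2 whose column C matches that row's column C.
I tried a few joins and merges without success. Does anyone know how to do this without using a for loop?
Thanks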
Answer 0 (score: 3)
Option 1
Series.map
df['Y']=df.C.map(df2.set_index('C')['Y'])
df
Out[164]:
I A B C Y
0 0 0.719391 0.091693 one 1
1 1 0.951499 0.837160 one 1
2 2 0.975212 0.224855 one 1
3 3 0.807620 0.031284 three 3
4 4 0.633190 0.342889 one 1
5 5 0.075102 0.899291 two 2
6 6 0.502843 0.773424 two 2
7 7 0.032285 0.242476 one 1
8 8 0.794938 0.607745 one 1
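A minimal, self-contained sketch of the map approach, in case it helps to run it in isolation. The frames are a shortened copy of the question's data, and the names `lookup` and `unmatched` are only illustrative. It assumes the values in df2.C are unique; as far as I know, mapping through a Series with a duplicated index raises an error.

```python
import pandas as pd

# Shortened version of the question's data (assumption: C is unique in df2).
df = pd.DataFrame({
    "A": [0.719391, 0.951499, 0.975212, 0.807620, 0.075102],
    "C": ["one", "one", "one", "three", "two"],
})
df2 = pd.DataFrame({"Y": [1, 2, 3], "C": ["one", "two", "three"]})

# Turn df2 into a lookup Series: index is C, values are Y.
lookup = df2.set_index("C")["Y"]

# Look up every value of df.C in that Series and store the result as Y.
df["Y"] = df["C"].map(lookup)
print(df)

# A key that is absent from the lookup maps to NaN rather than raising.
unmatched = pd.Series(["one", "four"]).map(lookup)
print(unmatched)
```

The NaN for unmatched keys is worth keeping in mind: if any value of C is missing from df2, the new Y column will be float rather than int.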
Option 2
df.merge
df.merge(df2, on='C', how='left')
A B C Y
0 0.719391 0.091693 one 1
1 0.951499 0.837160 one 1
2 0.975212 0.224855 one 1
3 0.633190 0.342889 one 1
4 0.032285 0.242476 one 1
5 0.794938 0.607745 one 1
6 0.807620 0.031284 three 3
7 0.075102 0.899291 two 2
8 0.502843 0.773424 two 2
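Here is a small runnable sketch of the merge approach on a shortened copy of the question's data. The `validate="many_to_one"` argument is my addition, not part of the call above; it asks pandas to raise if C is duplicated in df2, which would otherwise silently multiply rows. With `how="left"`, the result keeps the row order of the left frame.

```python
import pandas as pd

# Shortened version of the question's data.
df = pd.DataFrame({
    "A": [0.719391, 0.951499, 0.975212, 0.807620, 0.075102],
    "C": ["one", "one", "one", "three", "two"],
})
df2 = pd.DataFrame({"Y": [1, 2, 3], "C": ["one", "two", "three"]})

# Left join on C: every row of df is kept, and Y is pulled from the
# matching row of df2. validate guards against duplicate keys in df2.
df_comb = df.merge(df2, on="C", how="left", validate="many_to_one")
print(df_comb)
```

Unlike the map version, this returns a new DataFrame instead of modifying df in place, and it brings along any other columns df2 may have.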
Option 3
Series.replace
df.C.replace(df2.set_index('C').Y)
I
0 1
1 1
2 1
3 3
4 1
5 2
6 2
7 1
8 1
Name: C, dtype: int64
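A sketch of the replace approach that also stores the result as a new column, on a shortened copy of the question's data. I pass a plain dict built from df2 rather than the Series itself; that is a swap on my part to keep the example simple, and the names `mapping`, `replaced` and `mixed` are only illustrative. Depending on the pandas version, replacing strings with integers may emit a warning about dtype downcasting.

```python
import pandas as pd

# Shortened version of the question's data.
df = pd.DataFrame({
    "A": [0.719391, 0.951499, 0.975212, 0.807620, 0.075102],
    "C": ["one", "one", "one", "three", "two"],
})
df2 = pd.DataFrame({"Y": [1, 2, 3], "C": ["one", "two", "three"]})

# Build a {C value: Y value} dict from df2.
mapping = df2.set_index("C")["Y"].to_dict()

# replace returns a new Series; df.C itself is left untouched.
replaced = df["C"].replace(mapping)
df["Y"] = replaced
print(df)

# Values without an entry in the mapping are left as they are,
# whereas map would turn them into NaN.
mixed = pd.Series(["one", "four"]).replace(mapping)
print(mixed)
```

This difference in how unmatched keys are handled is the main reason to prefer map or merge when the goal is a clean numeric Y column.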