Suppose I have two data frames, df1 and df2.
import numpy as np
import pandas as pd

# df1: 16 rows keyed by c1 and c2 (plus c3, c4), with a random value column
c1 = np.repeat(['a', 'b'], [8, 8], axis=0)
c2 = list('xxxxyyyyxxxxyyyy')
c3 = ['G1','G1','G2','G2','G1','G1','G2','G2','G1','G1','G2','G2','G1','G1','G2','G2']
c4 = [1, 2] * 8
val1 = np.random.rand(16)
df1 = pd.DataFrame({'c1': c1, 'c2': c2, 'c3': c3, 'c4': c4, 'val': val1})
# df2: one val2 entry per (c1, c2) pair
df2 = pd.DataFrame({'c1': ['a','b','a','b'], 'c2': ['x','x','y','y'], 'val2': [100, 90, 221, 92]})
How can I use df2 to add a column to df1 that holds the matching val2 value for each row's (c1, c2) pair?
The output should look like this:
   c1 c2  c3  c4  val1  val2
0   a  x  G1   1  0.67   100
1   a  x  G1   2  0.36   100
2   a  x  G2   1  0.12   100
3   a  x  G2   2  0.31   100
4   a  y  G1   1  0.56   221
5   a  y  G1   2  0.92   221
6   a  y  G2   1  0.62   221
7   a  y  G2   2  0.99   221
8   b  x  G1   1  0.73    90
9   b  x  G1   2  0.56    90
10  b  x  G2   1  0.22    90
11  b  x  G2   2  0.91    90
12  b  y  G1   1  0.34    92
13  b  y  G1   2  0.39    92
14  b  y  G2   1  0.78    92
15  b  y  G2   2  0.63    92
Answer 0 (score: 1)
I think you can use merge, joining on the two key columns c1 and c2:
print(pd.merge(df1, df2, on=['c1', 'c2']))
   c1 c2  c3  c4       val  val2
0   a  x  G1   1  0.600033   100
1   a  x  G1   2  0.929101   100
2   a  x  G2   1  0.311034   100
3   a  x  G2   2  0.341437   100
4   a  y  G1   1  0.512890   221
5   a  y  G1   2  0.124317   221
6   a  y  G2   1  0.428409   221
7   a  y  G2   2  0.047169   221
8   b  x  G1   1  0.485116    90
9   b  x  G1   2  0.960812    90
10  b  x  G2   1  0.347445    90
11  b  x  G2   2  0.490705    90
12  b  y  G1   1  0.273342    92
13  b  y  G1   2  0.784263    92
14  b  y  G2   1  0.805600    92
15  b  y  G2   2  0.057058    92
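
One thing to be aware of: merge defaults to an inner join, so any df1 row whose (c1, c2) pair is missing from df2 would be dropped. That does not happen with the sample data, but if you want to be safe, a left join keeps every df1 row in its original order. The sketch below is only one way to do it, not part of the original answer: the seeded generator and the name `merged` are my own choices for illustration, and `validate='many_to_one'` assumes a pandas version that supports that argument.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frames from the question (seeded here only so the
# example is reproducible; the question uses np.random.rand).
rng = np.random.default_rng(0)
df1 = pd.DataFrame({
    'c1': np.repeat(['a', 'b'], [8, 8]),
    'c2': list('xxxxyyyyxxxxyyyy'),
    'c3': ['G1', 'G1', 'G2', 'G2'] * 4,
    'c4': [1, 2] * 8,
    'val': rng.random(16),
})
df2 = pd.DataFrame({
    'c1': ['a', 'b', 'a', 'b'],
    'c2': ['x', 'x', 'y', 'y'],
    'val2': [100, 90, 221, 92],
})

# how='left' keeps every row of df1, filling val2 with NaN where df2 has
# no matching (c1, c2) pair. validate='many_to_one' raises an error if
# df2 contains duplicate (c1, c2) pairs, which would otherwise silently
# duplicate rows of df1.
merged = df1.merge(df2, on=['c1', 'c2'], how='left', validate='many_to_one')
print(merged)
```

If df2 might not cover every pair, checking `merged['val2'].isna().any()` afterwards tells you whether any df1 row went unmatched.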