Pandas: merging two DataFrames

Date: 2018-05-25 13:28:54

Tags: python python-3.x pandas merge

I have two DataFrames. The first one is:

seq1_id seq2_id dN  dS  Dist1 Dist_brute  kingdom
seq1    seq2    45  56  23    455         eucaryota
seq6    seq9    34  43  34    453         procaryota
seq3    seq98   32  34  21    90          Virus
seq21   seq87   32  12  35    211         Virus

and the second one looks like this:

seq1_id seq2_id dN  dS  Dist1 Dist_brute
seq1    seq2    45  56  23    455
seq4    seq12   78  45  32    789
seq3    seq98   32  34  21    90          
seq21   seq87   32  12  35    211 
seq45   seq90   21  23  12    123
seq6    seq9    34  43  34    453  

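What I would like to get is a new DataFrame that keeps every row of the second one and adds the kingdom column where a match exists: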

seq1_id seq2_id dN  dS  Dist1 Dist_brute   kingdom
seq1    seq2    45  56  23    455          eucaryota
seq4    seq12   78  45  32    789          NaN
seq3    seq98   32  34  21    90           Virus
seq21   seq87   32  12  35    211          Virus
seq45   seq90   21  23  12    123          NaN
seq6    seq9    34  43  34    453          procaryota

Does anyone have an idea? Thanks :))

1 answer:

Answer 0: (score: 1)

For me it works to omit the on parameter, so that merge does a left join on all columns the two DataFrames have in common:

df = df2.merge(df1, how='left')
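
To see why omitting on works here, the following is a minimal sketch that rebuilds the two frames from the question by hand (the construction is mine, not the asker's code) and checks which columns end up as join keys. The variable common is only an illustrative name.

```python
import pandas as pd

# Frames rebuilt by hand from the tables in the question.
df1 = pd.DataFrame({
    "seq1_id": ["seq1", "seq6", "seq3", "seq21"],
    "seq2_id": ["seq2", "seq9", "seq98", "seq87"],
    "dN": [45, 34, 32, 32],
    "dS": [56, 43, 34, 12],
    "Dist1": [23, 34, 21, 35],
    "Dist_brute": [455, 453, 90, 211],
    "kingdom": ["eucaryota", "procaryota", "Virus", "Virus"],
})
df2 = pd.DataFrame({
    "seq1_id": ["seq1", "seq4", "seq3", "seq21", "seq45", "seq6"],
    "seq2_id": ["seq2", "seq12", "seq98", "seq87", "seq90", "seq9"],
    "dN": [45, 78, 32, 32, 21, 34],
    "dS": [56, 45, 34, 12, 23, 43],
    "Dist1": [23, 32, 21, 35, 12, 34],
    "Dist_brute": [455, 789, 90, 211, 123, 453],
})

# With on omitted, merge uses the columns present in both frames as keys.
common = df2.columns.intersection(df1.columns).tolist()

# The left join keeps every row of df2 in df2's order;
# rows without a match in df1 get NaN in kingdom.
df = df2.merge(df1, how="left")
print(common)
print(df)
```

Because kingdom is the only column that exists in df1 alone, it is the only column added to the result.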

If you need to specify the columns for the merge explicitly:

df = df2.merge(df1, on=['seq1_id','seq2_id','dN','dS','Dist1','Dist_brute'], how='left')
print(df)
  seq1_id seq2_id  dN  dS  Dist1  Dist_brute     kingdom
0    seq1    seq2  45  56     23         455   eucaryota
1    seq4   seq12  78  45     32         789         NaN
2    seq3   seq98  32  34     21          90       Virus
3   seq21   seq87  32  12     35         211       Virus
4   seq45   seq90  21  23     12         123         NaN
5    seq6    seq9  34  43     34         453  procaryota
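
As a variation that is not part of the original answer: if the pair (seq1_id, seq2_id) alone identifies a row, which is an assumption about the data, it may be enough to join on those two columns and bring over only kingdom. The sketch below uses a few rows from the question's tables and adds indicator=True, which appends a _merge column showing whether each row found a match. The name keys is illustrative.

```python
import pandas as pd

# A few rows taken from the question's tables, to keep the sketch short.
df1 = pd.DataFrame({
    "seq1_id": ["seq1", "seq3"],
    "seq2_id": ["seq2", "seq98"],
    "dN": [45, 32],
    "dS": [56, 34],
    "Dist1": [23, 21],
    "Dist_brute": [455, 90],
    "kingdom": ["eucaryota", "Virus"],
})
df2 = pd.DataFrame({
    "seq1_id": ["seq1", "seq4", "seq3"],
    "seq2_id": ["seq2", "seq12", "seq98"],
    "dN": [45, 78, 32],
    "dS": [56, 45, 34],
    "Dist1": [23, 32, 21],
    "Dist_brute": [455, 789, 90],
})

# Assumption: (seq1_id, seq2_id) identifies a row uniquely.
keys = ["seq1_id", "seq2_id"]

# Select only the keys plus kingdom from df1, so the shared measurement
# columns are not duplicated with _x/_y suffixes in the result.
df = df2.merge(df1[keys + ["kingdom"]], on=keys, how="left", indicator=True)
print(df)
```

Rows marked left_only in _merge are the ones that end up with NaN in kingdom. If the measurement columns can differ between the two frames for the same pair of ids, joining on all six columns as in the answer is the safer choice.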