Merging 2 DataFrames on their common columns

Time: 2016-09-29 08:03:04

Tags: python pandas dataframe intersection concat

I have 2 DataFrames, shown below:

DF

 Type       Breed     Common Color  Other Color  Behaviour
 Golden      Big           Gold          White        Fun      
 Corgi      Small          Brown         White       Crazy
 Bulldog    Medium         Black         Grey        Strong

DF2

 Type              Breed    Behaviour   Bark Sound
 Pug               Small      Sleepy          Ak
 German Shepard    Big        Cool            Woof
 Puddle            Small      Aggressive      Ek

I want to combine the 2 DataFrames on the columns Type, Breed, and Behaviour.

So my desired output is:

Type           Breed      Behaviour
Golden          Big         Fun
Corgi           Small       Crazy  
Bulldog         Medium      Strong
Pug             Small       Sleepy
German Shepard  Big         Cool
Puddle          Small       Aggressive

2 answers:

Answer 0: (score: 4)

You need concat, applied to the three shared columns of each DataFrame:

print (pd.concat([df1[['Type','Breed','Behaviour']], 
                  df2[['Type','Breed','Behaviour']]], ignore_index=True))

             Type   Breed   Behaviour
0          Golden     Big         Fun
1           Corgi   Small       Crazy
2         Bulldog  Medium      Strong
3             Pug   Small      Sleepy
4  German Shepard     Big        Cool
5          Puddle   Small  Aggressive
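
For anyone who wants to run this, here is a minimal reproducible sketch. The frames are rebuilt by hand from the tables in the question; the names df1, df2 and shared are only illustrative.

```python
import pandas as pd

# Rebuild the two frames from the question (values copied from its tables).
df1 = pd.DataFrame({
    "Type": ["Golden", "Corgi", "Bulldog"],
    "Breed": ["Big", "Small", "Medium"],
    "Common Color": ["Gold", "Brown", "Black"],
    "Other Color": ["White", "White", "Grey"],
    "Behaviour": ["Fun", "Crazy", "Strong"],
})
df2 = pd.DataFrame({
    "Type": ["Pug", "German Shepard", "Puddle"],
    "Breed": ["Small", "Big", "Small"],
    "Behaviour": ["Sleepy", "Cool", "Aggressive"],
    "Bark Sound": ["Ak", "Woof", "Ek"],
})

# Select the same three columns from each frame, then stack the rows.
# ignore_index=True renumbers the rows 0..5 instead of repeating 0..2.
shared = ["Type", "Breed", "Behaviour"]
result = pd.concat([df1[shared], df2[shared]], ignore_index=True)
print(result)
```

Without ignore_index=True the result would keep each frame's original row labels, so 0, 1, 2 would appear twice.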

A more general approach is to take the intersection of the two DataFrames' columns:

cols = df1.columns.intersection(df2.columns)
print (cols)
Index(['Type', 'Breed', 'Behaviour'], dtype='object')

print (pd.concat([df1[cols], df2[cols]], ignore_index=True))
             Type   Breed   Behaviour
0          Golden     Big         Fun
1           Corgi   Small       Crazy
2         Bulldog  Medium      Strong
3             Pug   Small      Sleepy
4  German Shepard     Big        Cool
5          Puddle   Small  Aggressive
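
This is not from the original answer, but as far as I know pd.concat can also compute the intersection itself through its join parameter: join='inner' keeps only the columns present in every frame. A minimal sketch, using small stand-in frames rather than the asker's full data:

```python
import pandas as pd

# Small stand-in frames (assumed for illustration, one row each).
df1 = pd.DataFrame({"Type": ["Golden"], "Breed": ["Big"],
                    "Common Color": ["Gold"], "Behaviour": ["Fun"]})
df2 = pd.DataFrame({"Type": ["Pug"], "Breed": ["Small"],
                    "Behaviour": ["Sleepy"], "Bark Sound": ["Ak"]})

# join="inner" drops every column that is not shared by all inputs.
inner = pd.concat([df1, df2], join="inner", ignore_index=True)

# Same set of columns as the explicit intersection approach.
cols = df1.columns.intersection(df2.columns)
explicit = pd.concat([df1[cols], df2[cols]], ignore_index=True)

print(inner)
print(explicit)
```

The explicit intersection is still handy when you want to inspect or reorder the shared columns before concatenating.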

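Even more general, provided df1 and df2 contain no NaN values of their own, is to concatenate everything and use dropna to remove the columns that contain NaN: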

print (pd.concat([df1 ,df2], ignore_index=True))
     Bark Sound   Behaviour   Breed Common Color Other Color            Type
0        NaN         Fun     Big         Gold       White          Golden
1        NaN       Crazy   Small        Brown       White           Corgi
2        NaN      Strong  Medium        Black        Grey         Bulldog
3         Ak      Sleepy   Small          NaN         NaN             Pug
4       Woof        Cool     Big          NaN         NaN  German Shepard
5         Ek  Aggressive   Small          NaN         NaN          Puddle               


print (pd.concat([df1 ,df2], ignore_index=True).dropna(axis=1))
    Behaviour   Breed            Type
0         Fun     Big          Golden
1       Crazy   Small           Corgi
2      Strong  Medium         Bulldog
3      Sleepy   Small             Pug
4        Cool     Big  German Shepard
5  Aggressive   Small          Puddle
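
The "no NaN values" condition matters. The sketch below (made-up frames, with one missing Breed added on purpose) shows the dropna shortcut discarding a shared column, while the column intersection keeps it:

```python
import numpy as np
import pandas as pd

# Illustrative frames; the NaN in df2["Breed"] is an assumption added
# to show the failure mode, it is not in the asker's data.
df1 = pd.DataFrame({"Type": ["Golden", "Corgi"],
                    "Breed": ["Big", "Small"],
                    "Other Color": ["White", "White"]})
df2 = pd.DataFrame({"Type": ["Pug", "Puddle"],
                    "Breed": ["Small", np.nan],
                    "Bark Sound": ["Ak", "Ek"]})

combined = pd.concat([df1, df2], ignore_index=True)

# dropna(axis=1) removes ANY column holding a NaN, so the shared
# column "Breed" is lost together with the non-shared ones.
via_dropna = combined.dropna(axis=1)

# The intersection approach only looks at column names, so "Breed" survives.
cols = df1.columns.intersection(df2.columns)
via_intersection = pd.concat([df1[cols], df2[cols]], ignore_index=True)

print(list(via_dropna.columns))
print(list(via_intersection.columns))
```

So if the data may contain genuine missing values, selecting by column names is the safer choice.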

Answer 1: (score: 3)

Use df1.T.join(df2.T, lsuffix='_').dropna().T.reset_index(drop=True) to drop the columns that do not overlap.
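
The one-liner is dense, so here is a step-by-step sketch of what I understand it to do. The frames are rebuilt from the question, and the intermediate names (left, right, joined, overlap) are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Type": ["Golden", "Corgi", "Bulldog"],
    "Breed": ["Big", "Small", "Medium"],
    "Common Color": ["Gold", "Brown", "Black"],
    "Other Color": ["White", "White", "Grey"],
    "Behaviour": ["Fun", "Crazy", "Strong"],
})
df2 = pd.DataFrame({
    "Type": ["Pug", "German Shepard", "Puddle"],
    "Breed": ["Small", "Big", "Small"],
    "Behaviour": ["Sleepy", "Cool", "Aggressive"],
    "Bark Sound": ["Ak", "Woof", "Ek"],
})

# 1. Transpose: the column names become the row index of each frame.
left = df1.T
right = df2.T

# 2. Left join on that index. Both sides have columns 0, 1, 2, so
#    lsuffix="_" renames the left ones to avoid a name clash.
#    Rows that exist only in df1 get NaN on the right-hand side;
#    rows that exist only in df2 ("Bark Sound") are not kept at all.
joined = left.join(right, lsuffix="_")

# 3. Drop the rows with NaN, i.e. the column names df2 does not have.
overlap = joined.dropna()

# 4. Transpose back and renumber the rows.
result = overlap.T.reset_index(drop=True)
print(result)
```

Two things to keep in mind: transposing frames with mixed dtypes produces object columns, and, as with any dropna-based approach, a genuine NaN in a shared column would remove that column too.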

