Merging two DataFrames in pandas

Time: 2017-10-30 09:40:27

Tags: python pandas dataframe

I have a logic problem that I am trying to solve. I have two DataFrames.

Dataframe_one contains the following columns:

[Id, workflowprofile_A, workflow_profile_B, important_string_info ]

Dataframe_two contains the following columns:

[workflowprofile, option, workflow]

My problem is that workflowprofile from Dataframe_two can match workflowprofile_A, workflow_profile_B, or both, from Dataframe_one. How can I get a merged DataFrame whose columns look like this?

dataframe_three:

[Id, workflowprofile_A, workflowprofile_fromA, option_fromA, workflow_fromA, important_string_info_fromA, workflow_profile_B, workflowprofile_fromB, option_fromB, workflow_fromB, important_string_info_fromB]

1 Answer:

Answer 0: (score: 2)

You can create a new column with fillna or combine_first, since one of the two values is always NaN, and then merge on that column:

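import pandas as pd

# take workflowprofile_A, falling back to workflow_profile_B where A is NaN
df1['workflowprofile'] = df1['workflowprofile_A'].fillna(df1['workflow_profile_B'])
# alternative
#df1['workflowprofile'] = df1['workflowprofile_A'].combine_first(df1['workflow_profile_B'])

# inner join on the new key column
df3 = pd.merge(df1, df2, on='workflowprofile')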

Sample:

print (df1)
   Id  workflowprofile_A  workflow_profile_B  important_string_info
0   1                7.0                 NaN                      8
1   2                NaN                 5.0                      1    

print (df2)
   workflowprofile  option  workflow
0                7       0         0
1                5       9         0
2                7       0         0
3                4       1         2

df1['workflowprofile'] = df1['workflowprofile_A'].fillna(df1['workflow_profile_B'])
df3 = pd.merge(df1, df2, on='workflowprofile')
print (df3)
   Id  workflowprofile_A  workflow_profile_B  important_string_info  \
0   1                7.0                 NaN                      8   
1   1                7.0                 NaN                      8   
2   2                NaN                 5.0                      1   

  workflowprofile  option  workflow  
0               7       0         0  
1               7       0         0  
2               5       9         0
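The approach above relies on exactly one of workflowprofile_A and workflow_profile_B being filled in per row. If both can be populated at once (the "AND" case in the question), one possible sketch is to left-merge Dataframe_two twice, once per lookup column, using suffixed copies so the two sets of columns stay apart. The sample values below are made up for illustration, and the `_fromA` / `_fromB` names follow the layout requested in the question. Because important_string_info lives in Dataframe_one, this sketch keeps it as a single column rather than duplicating it.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the row with Id 3 has BOTH profile columns filled in.
df1 = pd.DataFrame({
    "Id": [1, 2, 3],
    "workflowprofile_A": [7.0, np.nan, 4.0],
    "workflow_profile_B": [np.nan, 5.0, 5.0],
    "important_string_info": ["x", "y", "z"],
})
# Keys are unique here and stored as floats to match the NaN-bearing columns in df1.
df2 = pd.DataFrame({
    "workflowprofile": [7.0, 5.0, 4.0],
    "option": [0, 9, 1],
    "workflow": [0, 0, 2],
})

# One renamed copy of df2 per lookup column.
from_a = df2.add_suffix("_fromA")
from_b = df2.add_suffix("_fromB")

# Left merges keep every row of df1; unmatched lookups become NaN.
df3 = (
    df1
    .merge(from_a, how="left",
           left_on="workflowprofile_A", right_on="workflowprofile_fromA")
    .merge(from_b, how="left",
           left_on="workflow_profile_B", right_on="workflowprofile_fromB")
)
print(df3)
```

Note that if Dataframe_two contains repeated workflowprofile values, as in the sample above where 7 appears twice, each merge yields one output row per matching pair, so rows of Dataframe_one will be duplicated.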