What is the best way to perform pd.concat horizontally in pandas while matching on a column?

Time: 2018-08-16 19:02:53

Tags: python pandas concatenation

Example:

Df1: ID, Variable_1, Variable_2
Df2: ID, Variable_1a, Variable_2a

I want the resulting Df to be matched on the ID values (which are the same in both) and to have the following format:

Df3: ID, Variable_1, Variable_2, Variable_1a, Variable_2a

I tried Df3 = pd.concat([Df1, Df2], axis=1, join='outer'), but it does not produce the expected result.

1 answer:

Answer 0 (score: 0)

How about Df1.merge(Df2, on="ID", how="outer")? Here is a toy example:

import pandas as pd

# df1 has IDs 1, 2 and 3; df2 only has IDs 1 and 2
df1 = pd.DataFrame([[1, "varA", "varB"], [2, "varX", "varY"], [3, "varC", "varD"]], columns=["ID", "Variable_1", "Variable_2"])
df2 = pd.DataFrame([[1, "varA_a", "varB_a"], [2, "varX_a", "varY_a"]], columns=["ID", "Variable_1a", "Variable_2a"])

print(df1)
print()
print(df2)
print()
# The outer merge keeps every ID from both frames; columns missing for an ID become NaN
merged = df1.merge(df2, on="ID", how="outer")
print(merged)

It outputs:

   ID Variable_1 Variable_2
0   1       varA       varB
1   2       varX       varY
2   3       varC       varD

   ID Variable_1a Variable_2a
0   1      varA_a      varB_a
1   2      varX_a      varY_a

   ID Variable_1 Variable_2 Variable_1a Variable_2a
0   1       varA       varB      varA_a      varB_a
1   2       varX       varY      varX_a      varY_a
2   3       varC       varD         NaN         NaN

That is what I was looking for.
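For anyone who would rather stay with pd.concat: a likely reason the original attempt misbehaves is that concat with axis=1 aligns rows on the index, not on the ID column. The sketch below assumes the IDs are unique within each frame, and it deliberately shuffles the rows of df2 (not part of the toy example above) to make the difference visible. The names by_position and by_id are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, "varA", "varB"], [2, "varX", "varY"], [3, "varC", "varD"]],
    columns=["ID", "Variable_1", "Variable_2"],
)
# Assumption for illustration: df2 rows are in a different order than df1
df2 = pd.DataFrame(
    [[2, "varX_a", "varY_a"], [1, "varA_a", "varB_a"]],
    columns=["ID", "Variable_1a", "Variable_2a"],
)

# Plain concat lines rows up by index label (0, 1, 2), so ID 1 from df1
# ends up next to ID 2 from df2, and the ID column appears twice
by_position = pd.concat([df1, df2], axis=1, join="outer")

# Moving ID into the index makes concat align on it instead
by_id = (
    pd.concat([df1.set_index("ID"), df2.set_index("ID")], axis=1, join="outer")
    .rename_axis("ID")  # keep the index name explicit before resetting
    .reset_index()
)

print(by_position)
print()
print(by_id)
```

Under these assumptions by_id should contain the same rows as the outer merge. merge is usually the simpler choice, and it also copes with repeated IDs, which the set_index approach does not.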