Combining data with overlap

Asked: 2014-02-17 18:27:34

Tags: python join pandas concat

I have two DataFrames:

import pandas as pd

data = {'First': ['Tom', 'Peter', 'Phil'], 'Last': ['Dwan', 'Laak', 'Ivey'],
        'Score': [101.5, 99, 105]}
df = pd.DataFrame(data, index=list('abc'))
print(df)

   First  Last  Score
a    Tom  Dwan  101.5
b  Peter  Laak   99.0
c   Phil  Ivey  105.0


data2 = {'First': ['Tom', 'Phil'], 'Last': ['Dwan', 'Ivey'], 'Score': [103.5, 101]}
df2 = pd.DataFrame(data2, index=list('fg'))
print(df2)

  First  Last  Score
f   Tom  Dwan  103.5
g  Phil  Ivey  101.0

I would like to overlay them to get this final result:

   First  Last  Score  Score_new
a    Tom  Dwan  101.5      103.5
b  Peter  Laak   99.0        NaN
c   Phil  Ivey  105.0      101.0

Since the indexes don't match, the join has to be done on the First and Last columns. Any suggestions?

1 answer:

Answer 0 (score: 3)

If you don't care about preserving the index, you can do something like:

>>> df.merge(df2, on=["First", "Last"], how='outer', suffixes=('', '_new'))
   First  Last  Score  Score_new
0    Tom  Dwan  101.5      103.5
1  Peter  Laak   99.0        NaN
2   Phil  Ivey  105.0      101.0

[3 rows x 4 columns]

If you do care about keeping the index, maybe you could play with left_index/right_index, for example:

>>> df.merge(df2, on=["First", "Last"], how='outer', suffixes=('', '_new'), right_index=True)
   First  Last  Score  Score_new
a    Tom  Dwan  101.5      103.5
b  Peter  Laak   99.0        NaN
c   Phil  Ivey  105.0      101.0

[3 rows x 4 columns]

But I'm not sure why the index letters would matter.
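
If the index labels from df do matter, here is a minimal sketch of another way to keep them. It is not part of the original answer. As far as I know, recent pandas releases raise a MergeError when `on` is combined with `left_index`/`right_index`, so this version moves the index into a regular column before merging and restores it afterwards. It also uses how='left' instead of how='outer', on the assumption that only rows present in df are wanted; with the sample data both give the same rows. The name `merged` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'First': ['Tom', 'Peter', 'Phil'],
     'Last': ['Dwan', 'Laak', 'Ivey'],
     'Score': [101.5, 99, 105]},
    index=list('abc'))
df2 = pd.DataFrame(
    {'First': ['Tom', 'Phil'],
     'Last': ['Dwan', 'Ivey'],
     'Score': [103.5, 101]},
    index=list('fg'))

# Move df's index into a regular column so it survives the merge.
# For an unnamed index, reset_index() names that column 'index'.
merged = (
    df.reset_index()
      .merge(df2, on=['First', 'Last'], how='left', suffixes=('', '_new'))
      .set_index('index')
)
merged.index.name = None  # drop the leftover 'index' label

print(merged)
```

A left join keeps every row of df in its original order, so rows with no match in df2 (Peter Laak here) get NaN in Score_new, and any people who appear only in df2 are left out.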