I have two DataFrames:
import pandas as pd
data = {'First': ['Tom', 'Peter', 'Phil'], 'Last': ['Dwan', 'Laak', 'Ivey'],
        'Score': [101.5, 99, 105]}
df = pd.DataFrame(data, index=list('abc'))
print(df)
   First  Last  Score
a    Tom  Dwan  101.5
b  Peter  Laak   99.0
c   Phil  Ivey  105.0
data2 = {'First': ['Tom', 'Phil'], 'Last': ['Dwan', 'Ivey'], 'Score': [103.5, 101]}
df2 = pd.DataFrame(data2, index=list('fg'))
print(df2)
  First  Last  Score
f   Tom  Dwan  103.5
g  Phil  Ivey  101.0
I would like to combine them to get this final result:
   First  Last  Score  Score_new
a    Tom  Dwan  101.5      103.5
b  Peter  Laak   99.0        NaN
c   Phil  Ivey  105.0      101.0
Since the indexes don't match, the join has to be on the First and Last columns. Any suggestions?
Answer 0 (score: 3)
If you don't care about preserving the index, you can do something like
>>> df.merge(df2, on=["First", "Last"], how='outer', suffixes=('', '_new'))
   First  Last  Score  Score_new
0    Tom  Dwan  101.5      103.5
1  Peter  Laak   99.0        NaN
2   Phil  Ivey  105.0      101.0

[3 rows x 4 columns]
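
As a self-contained sketch of the same idea, the snippet below rebuilds both frames and merges them on First and Last. It makes one substitution that is an assumption rather than part of the answer: it uses how='left' instead of how='outer', because df2 contains no names that are missing from df, and a left join keeps the rows of df in their original order. The name merged is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"First": ["Tom", "Peter", "Phil"],
     "Last": ["Dwan", "Laak", "Ivey"],
     "Score": [101.5, 99, 105]},
    index=list("abc"),
)
df2 = pd.DataFrame(
    {"First": ["Tom", "Phil"], "Last": ["Dwan", "Ivey"], "Score": [103.5, 101]},
    index=list("fg"),
)

# Join on the two name columns. The empty suffix leaves df's "Score" untouched,
# while df2's overlapping "Score" column becomes "Score_new".
# how="left" is an assumption made here: it keeps every row of df, in df's order.
merged = df.merge(df2, on=["First", "Last"], how="left", suffixes=("", "_new"))
print(merged)
```

Rows of df with no partner in df2 (Peter Laak here) get NaN in Score_new, and the result carries a fresh 0, 1, 2 index rather than the original labels.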
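If you do care about the index, maybe you can play with left_index/right_index, like this (newer pandas releases may raise a MergeError when on is combined with right_index=True):
>>> df.merge(df2, on=["First", "Last"], how='outer', suffixes=('', '_new'), right_index=True)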
   First  Last  Score  Score_new
a    Tom  Dwan  101.5      103.5
b  Peter  Laak   99.0        NaN
c   Phil  Ivey  105.0      101.0

[3 rows x 4 columns]
But I don't see why the index letters would matter.
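
If the labels a, b, c do need to survive, one alternative that avoids the index flags of merge altogether is to move the index into an ordinary column before merging and restore it afterwards. This is a sketch under assumptions, not the approach from the answer: it assumes the index of df is unnamed (so reset_index produces a column called "index"), it uses how='left' so that df's rows and order are kept, and the name kept is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"First": ["Tom", "Peter", "Phil"],
     "Last": ["Dwan", "Laak", "Ivey"],
     "Score": [101.5, 99, 105]},
    index=list("abc"),
)
df2 = pd.DataFrame(
    {"First": ["Tom", "Phil"], "Last": ["Dwan", "Ivey"], "Score": [103.5, 101]},
    index=list("fg"),
)

kept = (
    df.reset_index()          # labels a, b, c move into a column named "index"
      .merge(df2, on=["First", "Last"], how="left", suffixes=("", "_new"))
      .set_index("index")     # put the original labels back as the index
      .rename_axis(None)      # drop the leftover index name "index"
)
print(kept)
```

The labels f and g from df2 are discarded in this version, which matches the desired output in the question, where only the labels of df appear.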