我有以下4个数据帧
df = pd.DataFrame({_id:[1,2,3,4], name:[Charan, Kumar, Nikhil, Kumar], })
df1 = pd.DataFrame({_id:[1,3,4], count_of_apple:[5,3,1]})
df2 = pd.DataFrame({_id:[1,2,3], count_of_organge:[8,4,6]})
df3 = pd.DataFrame({_id:[2,3,4], count_of_lime:[7,9,2]})
我想将所有数据帧合并到一个称为最终
的单个数据帧中我尝试过使用PD.merge,但是它的问题是我必须在3次不同的时间里做,有没有更简单的方法呢?
我使用以下代码获取结果
final = pd.merge(df, df1, on='_id', how='left')
final = pd.merge(final, df2, on='_id', how='left')
final = pd.merge(final, df3, on='_id', how='left')
我希望最终结果是这样
final.head()
_id |名称|橙色数|苹果数|石灰的数量
1 |查兰| 5 | 8 |钠
2 |库玛|娜| 4 | 7
3 | Nikhil | 3 | 6 | 9
4 |库玛| 1 |娜| 2
答案 0 :(得分:1)
您可以使用concat
,但首先需要将_id
转换为DataFrame.set_index
为每个DataFrame
编制索引:
dfs = [df, df1, df2, df3]
df = pd.concat([x.set_index('_id') for x in dfs], axis=1).reset_index()
是什么意思?
df = df.set_index('_id')
df1 = df1.set_index('_id')
df2 = df2.set_index('_id')
df3 = df3.set_index('_id')
df = pd.concat([df, df1, df2, df3], axis=1).reset_index()
print (df)
_id name count_of_apple count_of_organge count_of_lime
0 1 Charan 5.0 8.0 NaN
1 2 Kumar NaN 4.0 7.0
2 3 Nikhil 3.0 6.0 9.0
3 4 Kumar 1.0 NaN 2.0
答案 1 :(得分:0)
来自文档https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
In [1]: df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
...: 'B': ['B0', 'B1', 'B2', 'B3'],
...: 'C': ['C0', 'C1', 'C2', 'C3'],
...: 'D': ['D0', 'D1', 'D2', 'D3']},
...: index=[0, 1, 2, 3])
...:
In [8]: df4 = pd.DataFrame({'B': ['B2', 'B3', 'B6', 'B7'],
...: 'D': ['D2', 'D3', 'D6', 'D7'],
...: 'F': ['F2', 'F3', 'F6', 'F7']},
...: index=[2, 3, 6, 7])
...:
In [9]: result = pd.concat([df1, df4], axis=1, sort=False)