I am trying to merge 3 pandas data frames on the key 'id', but somehow I can't get the right result.
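In the end, I want a data frame with 2 rows: one row with id 'abc' holding object ('something', 1) and object3 ('something1', 1), and one row with id 'def' holding object2 ('something2', 1) and object3 ('something3', 1). Is there a way to do this with pandas?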
import pandas as pd
df1 = pd.DataFrame([[]])
df1['id'] ='abc'
df1['object'] = -1
df1['object'] = df1['object'].astype('object')
df1.at[0,'object'] = ('something', 1)
df1['object3'] = -1
df1['object3'] = df1['object3'].astype('object')
df1.at[0,'object3'] = ('something1', 1)
df2 = pd.DataFrame([[]])
df2['id'] ='def'
df2['object2'] = -1
df2['object2'] = df2['object2'].astype('object')
df2.at[0,'object2'] = ('something2', 1)
df3 = pd.DataFrame([[]])
df3['id'] ='def'
df3['object3'] = -1
df3['object3'] = df3['object3'].astype('object')
df3.at[0,'object3'] = ('something3', 1)
Edit:
Sorry, my original question was unclear. I want the final data frame to look like this:
| id | object | object2 | object3 |
|-----|-----------------|------------------|------------------|
| abc | ('something',1) | None | ('something1',1) |
| def | None | ('something2',1) | ('something3',1) |
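Answer 0 (score: 1)
Use concat and groupby, with first to resolve any non-unique ids: within each id group, first keeps the first non-null value of every column. This is fairly robust.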
pd.concat([df1, df2, df3]).groupby('id', as_index=False).first()
id object object3 object2
0 abc (something, 1) (something1, 1) NaN
1 def NaN (something3, 1) (something2, 1)
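For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. It assumes the frames can be built directly from dicts instead of the cell-by-cell construction in the question, and the names stacked and merged are only illustrative. The final column selection is optional and just matches the column order of the table in the question.

```python
import pandas as pd

# Same data as in the question, built from dicts (an assumption for brevity).
# Each tuple is stored as a single cell value in an object column.
df1 = pd.DataFrame({"id": ["abc"],
                    "object": [("something", 1)],
                    "object3": [("something1", 1)]})
df2 = pd.DataFrame({"id": ["def"], "object2": [("something2", 1)]})
df3 = pd.DataFrame({"id": ["def"], "object3": [("something3", 1)]})

# Stack the frames vertically; a column missing from a frame becomes null there.
stacked = pd.concat([df1, df2, df3], ignore_index=True, sort=False)

# Collapse to one row per id, keeping the first non-null value in each column.
merged = stacked.groupby("id", as_index=False).first()

# Optional: reorder the columns to match the table in the question.
merged = merged[["id", "object", "object2", "object3"]]
print(merged)
```

Note that if two frames carry different non-null values for the same id and column, first silently keeps whichever comes earlier in the concat order, so the order of the list passed to concat matters in that case.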