I have two DataFrames, df and subs:
import numpy as np
import pandas as pd
df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": [ "London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})
   scode sname    sub1    sub2
0     11    aa  London     NaN
1     22    bb     NaN     NaN
2     33    cc   Delhi  Sydney
3     44    dd     NaN     NaN
subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
    0              1     2
0  22  Milford Sound  Oslo
1  44     Queenstown   NaN
How can I merge the two DataFrames so that I end up with this result:
   scode sname           sub1    sub2
0     11    aa         London     NaN
1     22    bb  Milford Sound    Oslo
2     33    cc          Delhi  Sydney
3     44    dd     Queenstown     NaN
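Answer 0 (score: 1)
Pandas aligns on index and columns automatically, so you only need to make sure the right index is set. Assuming scode is the key you want to merge on: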
In [5]: df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": [ "London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})
In [6]: df.set_index('scode',inplace=True)
In [7]: subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
In [8]: subs.set_index(0, inplace=True)
In [9]: subs.columns=['sub1','sub2']
This gives you something like:
In [10]: df
Out[10]:
      sname    sub1    sub2
scode
11       aa  London     NaN
22       bb     NaN     NaN
33       cc   Delhi  Sydney
44       dd     NaN     NaN
In [11]: subs
Out[11]:
             sub1  sub2
0
22  Milford Sound  Oslo
44     Queenstown   NaN
Now just do an ordinary assignment, selecting the appropriate rows and columns:
In [12]: df.loc[subs.index.values,['sub1', 'sub2']] = subs
In [13]: df
Out[13]:
      sname           sub1    sub2
scode
11       aa         London     NaN
22       bb  Milford Sound    Oslo
33       cc          Delhi  Sydney
44       dd     Queenstown     NaN
You can always reset the index you set earlier:
In [14]: df.reset_index(inplace=True)
In [15]: df
Out[15]:
   scode sname           sub1    sub2
0     11    aa         London     NaN
1     22    bb  Milford Sound    Oslo
2     33    cc          Delhi  Sydney
3     44    dd     Queenstown     NaN
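The steps in this answer can also be collected into one script that leaves the original df untouched. This is only a sketch: the helper name assign_by_key and its parameters are made up for illustration and are not part of pandas. It assumes that the first column of the source frame holds the key and that every key in the source also exists in the target.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
})
subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
})


def assign_by_key(target, source, key, value_cols):
    """Hypothetical helper: overwrite value_cols in target for the rows
    whose key appears in source, using index-aligned assignment."""
    value_cols = list(value_cols)
    # set_index returns a new frame, so the caller's target is not modified.
    indexed = target.set_index(key)
    # Assumption: source columns are ordered as (key, *value_cols).
    src = source.copy()
    src.columns = [key] + value_cols
    src = src.set_index(key)
    # pandas aligns src to the selected rows/columns by label on assignment.
    indexed.loc[src.index, value_cols] = src
    return indexed.reset_index()


result = assign_by_key(df, subs, "scode", ["sub1", "sub2"])
print(result)
```

Note that a plain .loc assignment copies everything in the selected block, including NaN values from subs. That is harmless here because the rows being replaced were empty to begin with.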
Answer 1 (score: 0)
First, let's make your column names match:
newSub = subs.rename(columns={0:'scode', 1:'sub1', 2:'sub2'})
Next, the DataFrame update method does what you want: it matches rows in the source and target by their shared index labels. So let's set the index to scode:
indexedDF = df.set_index('scode')
indexedNewSub = newSub.set_index('scode')
Finally, call the update method on indexedDF:
indexedDF.update(indexedNewSub)
indexedDF
indexedDF should now have subs merged in as requested.
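For reference, here is the whole update-based approach as one self-contained script, using the data from the question. It is a minimal sketch: the snake_case variable names are mine, and the final reset_index step is an addition that restores scode as an ordinary column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
})
subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
})

# Give subs the same column names as df so update() can match them.
new_sub = subs.rename(columns={0: "scode", 1: "sub1", 2: "sub2"})

# Index both frames by the shared key.
indexed_df = df.set_index("scode")
indexed_new_sub = new_sub.set_index("scode")

# update() modifies indexed_df in place and returns None.
# With the default overwrite=True it only copies non-NA values from the
# other frame, so a NaN in subs never erases an existing value.
indexed_df.update(indexed_new_sub)

# Turn scode back into a regular column.
result = indexed_df.reset_index()
print(result)
```

Because update works in place and returns None, writing indexedDF = indexedDF.update(...) would discard the frame. Call it as a statement, as shown above.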