Merge another DataFrame into existing rows

Time: 2017-12-18 23:35:02

Tags: python pandas dataframe

I have 2 DataFrames, df and subs:

df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": [ "London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})

   scode sname  sub1    sub2
0   11   aa     London  NaN
1   22   bb     NaN     NaN
2   33   cc     Delhi   Sydney
3   44   dd     NaN     NaN

subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})

    0   1               2
0   22  Milford Sound   Oslo
1   44  Queenstown      NaN

How can I merge the 2 DataFrames so that I end up with this result:

    scode   sname   sub1            sub2
0   11      aa      London          NaN
1   22      bb      Milford Sound   Oslo
2   33      cc      Delhi           Sydney
3   44      dd      Queenstown      NaN

2 answers:

Answer 0 (score: 1)

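Pandas aligns on index and columns automatically, so you only need to make sure the right index is set. Assuming scode is the key you want to merge on: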

In [5]: df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": ["London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})

In [6]: df.set_index('scode',inplace=True)

In [7]: subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
    ...:

In [8]: subs.set_index(0, inplace=True)

In [9]: subs.columns=['sub1','sub2']

That gives you something like this:

In [10]: df
Out[10]:
      sname    sub1    sub2
scode
11       aa  London     NaN
22       bb     NaN     NaN
33       cc   Delhi  Sydney
44       dd     NaN     NaN

In [11]: subs
Out[11]:
             sub1  sub2
0
22  Milford Sound  Oslo
44     Queenstown   NaN

Now just do an ordinary assignment, selecting the appropriate rows and columns:

In [12]: df.loc[subs.index.values,['sub1', 'sub2']] = subs

In [13]: df
Out[13]:
      sname           sub1    sub2
scode
11       aa         London     NaN
22       bb  Milford Sound    Oslo
33       cc          Delhi  Sydney
44       dd     Queenstown     NaN

You can always reset the index afterwards to get scode back as a column:

In [14]: df.reset_index(inplace=True)

In [15]: df
Out[15]:
   scode sname           sub1    sub2
0     11    aa         London     NaN
1     22    bb  Milford Sound    Oslo
2     33    cc          Delhi  Sydney
3     44    dd     Queenstown     NaN
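
The label alignment this answer relies on can be checked in isolation. The following is a minimal sketch, not part of the original answer: the names target and patch are made up for illustration, and the replacement rows are deliberately listed in a different order from the rows being assigned to, to show that values are matched by index label rather than by position.

```python
import numpy as np
import pandas as pd

# Hypothetical, reduced version of df: only the sub1 column, indexed by scode.
target = pd.DataFrame(
    {"sub1": ["London", np.nan, "Delhi", np.nan]},
    index=[11, 22, 33, 44],
)

# Replacement values, listed in the opposite order (44 first, then 22).
patch = pd.DataFrame(
    {"sub1": ["Queenstown", "Milford Sound"]},
    index=[44, 22],
)

# Assigning a DataFrame through .loc aligns it on index and column labels,
# so the row labelled 22 receives the value labelled 22 in patch.
target.loc[[22, 44], ["sub1"]] = patch

print(target)
```

Rows 11 and 33 are not selected by the .loc indexer, so they are left untouched.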

Answer 1 (score: 0)

First, let's make your column names match:

newSub = subs.rename(columns={0: 'scode', 1: 'sub1', 2: 'sub2'})

Next, the DataFrame update method does what you want: it matches source and target rows on their shared index. So let's set the index to scode on both:

indexedDF     = df.set_index('scode')
indexedNewSub = newSub.set_index('scode')

Finally, call the update method on indexedDF:

indexedDF.update(indexedNewSub)

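indexedDF should now contain the values from subs, as requested. Note that update modifies indexedDF in place and returns None; call indexedDF.reset_index() if you want scode back as a column.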
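
Put together, the steps in this answer might look like the sketch below. It uses the data from the question; the snake_case names new_sub, indexed_df and result are illustrative choices rather than names from the answer, and it assumes subs is built as a DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
})
subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
})

# Give subs the same column names as df, then index both frames by scode.
new_sub = subs.rename(columns={0: "scode", 1: "sub1", 2: "sub2"}).set_index("scode")
indexed_df = df.set_index("scode")

# update() works in place and returns None, so do not assign its result.
# Only non-NA values from new_sub are copied over.
indexed_df.update(new_sub)

# Restore scode as an ordinary column.
result = indexed_df.reset_index()
print(result)
```

Because update only copies non-NA values, the NaN in the sub2 column of subs for scode 44 does not overwrite anything in df.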