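Adding DataFrames that share the same columns, expanding the result by one index level

Time: 2017-09-18 12:10:27

Tags: pandas dataframe

I want to add two DataFrames that share the same columns but have different indexes: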

df1=pd.DataFrame(np.random.randn(3,3),index=list("ABC"),columns=list("XYZ"))
df2=pd.DataFrame(np.random.randn(3,3),index=list("abc"),columns=list("XYZ"))

The result I want has one row for every (df1 row, df2 row) pair:

    X Y Z
A a 
A b
A c 
....
C c

How can I do this?

I tried the following, but it did not give me what I want.

df1.add(df2,axis="columns")

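2 answers:

Answer 0: (score: 1)

IIUC, here is one way: merge on a temporary key k, which produces every combination of the two indexes, then groupby on the column names and sum.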

In [192]: (df1.reset_index().assign(k='k').merge(df2.assign(k='k').reset_index(), on=['k'])
              .set_index(['index_x', 'index_y'])
              .groupby(lambda x:x.split('_')[0], axis=1)
              .sum()
              .drop('k', 1))
Out[192]:
                        X         Y         Z
index_x index_y
A       a       -2.281005 -1.606760 -0.853813
        b       -2.683788 -2.487876  2.471459
        c       -0.333471 -2.155734  1.688883
B       a       -0.790146  0.074629 -2.368680
        b       -1.192928 -0.806487  0.956592
        c        1.157388 -0.474345  0.174017
C       a       -2.114412  0.100412 -2.352661
        b       -2.517195 -0.780704  0.972611
        c       -0.166878 -0.448562  0.190036

Details

In [193]: df1
Out[193]:
          X         Y         Z
A -1.087129 -1.264522  1.147618
B  0.403731  0.416867 -0.367249
C -0.920536  0.442650 -0.351229

In [194]: df2
Out[194]:
          X         Y         Z
a -1.193876 -0.342237 -2.001431
b -1.596659 -1.223354  1.323841
c  0.753658 -0.891211  0.541265

In [196]: (df1.reset_index().assign(k='k').merge(df2.assign(k='k').reset_index(), on=['k'])
              .set_index(['index_x', 'index_y']))
Out[196]:
                      X_x       Y_x       Z_x  k       X_y       Y_y       Z_y
index_x index_y
A       a       -1.087129 -1.264522  1.147618  k -1.193876 -0.342237 -2.001431
        b       -1.087129 -1.264522  1.147618  k -1.596659 -1.223354  1.323841
        c       -1.087129 -1.264522  1.147618  k  0.753658 -0.891211  0.541265
B       a        0.403731  0.416867 -0.367249  k -1.193876 -0.342237 -2.001431
        b        0.403731  0.416867 -0.367249  k -1.596659 -1.223354  1.323841
        c        0.403731  0.416867 -0.367249  k  0.753658 -0.891211  0.541265
C       a       -0.920536  0.442650 -0.351229  k -1.193876 -0.342237 -2.001431
        b       -0.920536  0.442650 -0.351229  k -1.596659 -1.223354  1.323841
        c       -0.920536  0.442650 -0.351229  k  0.753658 -0.891211  0.541265
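On recent pandas versions the temporary key can probably be avoided: `merge` accepts `how="cross"` (added in pandas 1.2), and the column-wise `groupby(..., axis=1)` and positional `drop('k', 1)` used above may warn or fail on newer releases. The following is a minimal sketch of the same idea under those assumptions, not the answerer's original code; the seed and the names `pairs` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only to make the example repeatable
df1 = pd.DataFrame(np.random.randn(3, 3), index=list("ABC"), columns=list("XYZ"))
df2 = pd.DataFrame(np.random.randn(3, 3), index=list("abc"), columns=list("XYZ"))

# Cross join: every row of df1 is paired with every row of df2.
# Overlapping column names receive the _x / _y suffixes.
pairs = df1.reset_index().merge(df2.reset_index(), how="cross", suffixes=("_x", "_y"))
pairs = pairs.set_index(["index_x", "index_y"])

# Add each df1 column to the df2 column of the same name.
result = pd.DataFrame(
    {col: pairs[col + "_x"] + pairs[col + "_y"] for col in df1.columns},
    index=pairs.index,
)
print(result)
```

Each row `(A, b)` of `result` should then hold `df1.loc["A"] + df2.loc["b"]`, which is the layout asked for in the question.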

Answer 1: (score: 1)

You can first create a MultiIndex in both DataFrames with reindex, then add the DataFrames:

np.random.seed(45)
df1=pd.DataFrame(np.random.randn(3,3),index=list("ABC"),columns=list("XYZ"))
df2=pd.DataFrame(np.random.randn(3,3),index=list("abc"),columns=list("XYZ"))

mux = pd.MultiIndex.from_product([df1.index, df2.index])
df1 = df1.reindex(mux, level=0)
df2 = df2.reindex(mux, level=1)

print (df1)
            X         Y         Z
A a  0.026375  0.260322 -0.395146
  b  0.026375  0.260322 -0.395146
  c  0.026375  0.260322 -0.395146
B a -0.204301 -1.271633 -2.596879
  b -0.204301 -1.271633 -2.596879
  c -0.204301 -1.271633 -2.596879
C a  0.289681 -0.873305  0.394073
  b  0.289681 -0.873305  0.394073
  c  0.289681 -0.873305  0.394073

print (df2)
            X         Y         Z
A a  0.935106 -0.015685  0.259596
  b -1.473314  0.801927 -1.750752
  c -0.495052 -1.008601  0.025244
B a  0.935106 -0.015685  0.259596
  b -1.473314  0.801927 -1.750752
  c -0.495052 -1.008601  0.025244
C a  0.935106 -0.015685  0.259596
  b -1.473314  0.801927 -1.750752
  c -0.495052 -1.008601  0.025244

df3 = df1.add(df2,axis="columns")
print (df3)
            X         Y         Z
A a  0.961480  0.244637 -0.135550
  b -1.446939  1.062248 -2.145898
  c -0.468677 -0.748279 -0.369901
B a  0.730805 -1.287317 -2.337283
  b -1.677615 -0.469706 -4.347631
  c -0.699353 -2.280233 -2.571634
C a  1.224786 -0.888989  0.653669
  b -1.183633 -0.071378 -1.356680
  c -0.205371 -1.881905  0.419317
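As an alternative to reindexing both frames, the same pairwise sums can be computed with NumPy broadcasting and then labelled with the product MultiIndex. This is a different technique from the answer above, shown only as a sketch; it assumes both frames have the same columns in the same order, and the name `sums` is illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(45)
df1 = pd.DataFrame(np.random.randn(3, 3), index=list("ABC"), columns=list("XYZ"))
df2 = pd.DataFrame(np.random.randn(3, 3), index=list("abc"), columns=list("XYZ"))

# (3, 1, 3) + (1, 3, 3) broadcasts to (3, 3, 3):
# axis 0 is the df1 row, axis 1 is the df2 row, axis 2 is the column.
sums = df1.to_numpy()[:, None, :] + df2.to_numpy()[None, :, :]

# from_product iterates df1.index in the outer loop and df2.index in the inner loop,
# which matches the row order of the C-ordered reshape below.
mux = pd.MultiIndex.from_product([df1.index, df2.index])
df3 = pd.DataFrame(sums.reshape(len(mux), -1), index=mux, columns=df1.columns)
print(df3)
```

This avoids building the two intermediate 9-row frames, at the cost of relying on positional column order rather than label alignment.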