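I want to add two DataFrames that share the same columns, pairing every row of the first with every row of the second: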
df1=pd.DataFrame(np.random.randn(3,3),index=list("ABC"),columns=list("XYZ"))
df2=pd.DataFrame(np.random.randn(3,3),index=list("abc"),columns=list("XYZ"))
The result I want has one row per index pair:
X Y Z
A a
A b
A c
....
C c
How can I do this?
I tried the following, but it did not give me what I wanted.
df1.add(df2,axis="columns")
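For context, here is a small sketch of why that attempt falls short. DataFrame.add aligns rows by index label, and the labels "ABC" and "abc" have nothing in common, so the result is the union of both indexes filled with NaN rather than the nine pairwise sums. The variable name attempt is only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randn(3, 3), index=list("ABC"), columns=list("XYZ"))
df2 = pd.DataFrame(np.random.randn(3, 3), index=list("abc"), columns=list("XYZ"))

# add() aligns on row labels first; "A".."C" and "a".."c" never match,
# so every cell in the aligned result is missing.
attempt = df1.add(df2, axis="columns")

print(attempt.shape)               # six rows: the union of both indexes
print(attempt.isna().all().all())  # True when no cell was actually summed
```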
Answer 0 (score: 1)
IIUIC, here is one way: merge on a temporary key k, which produces every index combination, then groupby on the columns.
In [192]: (df1.reset_index().assign(k='k').merge(df2.assign(k='k').reset_index(), on=['k'])
.set_index(['index_x', 'index_y'])
.groupby(lambda x:x.split('_')[0], axis=1)
.sum()
.drop('k', 1))
Out[192]:
X Y Z
index_x index_y
A a -2.281005 -1.606760 -0.853813
b -2.683788 -2.487876 2.471459
c -0.333471 -2.155734 1.688883
B a -0.790146 0.074629 -2.368680
b -1.192928 -0.806487 0.956592
c 1.157388 -0.474345 0.174017
C a -2.114412 0.100412 -2.352661
b -2.517195 -0.780704 0.972611
c -0.166878 -0.448562 0.190036
Details
In [193]: df1
Out[193]:
X Y Z
A -1.087129 -1.264522 1.147618
B 0.403731 0.416867 -0.367249
C -0.920536 0.442650 -0.351229
In [194]: df2
Out[194]:
X Y Z
a -1.193876 -0.342237 -2.001431
b -1.596659 -1.223354 1.323841
c 0.753658 -0.891211 0.541265
In [196]: (df1.reset_index().assign(k='k').merge(df2.assign(k='k').reset_index(), on=['k'])
.set_index(['index_x', 'index_y']))
Out[196]:
X_x Y_x Z_x k X_y Y_y Z_y
index_x index_y
A a -1.087129 -1.264522 1.147618 k -1.193876 -0.342237 -2.001431
b -1.087129 -1.264522 1.147618 k -1.596659 -1.223354 1.323841
c -1.087129 -1.264522 1.147618 k 0.753658 -0.891211 0.541265
B a 0.403731 0.416867 -0.367249 k -1.193876 -0.342237 -2.001431
b 0.403731 0.416867 -0.367249 k -1.596659 -1.223354 1.323841
c 0.403731 0.416867 -0.367249 k 0.753658 -0.891211 0.541265
C a -0.920536 0.442650 -0.351229 k -1.193876 -0.342237 -2.001431
b -0.920536 0.442650 -0.351229 k -1.596659 -1.223354 1.323841
c -0.920536 0.442650 -0.351229 k 0.753658 -0.891211 0.541265
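The transcript above relies on a dummy key column, a positional axis argument to drop, and groupby(..., axis=1), which recent pandas releases deprecate or reject. As a hedged sketch of the same idea, assuming pandas 1.2 or later (where merge accepts how="cross"), the cross join can be written without the temporary key. The suffixes "_1" and "_2" and the names pairs and result are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df1 = pd.DataFrame(np.random.randn(3, 3), index=list("ABC"), columns=list("XYZ"))
df2 = pd.DataFrame(np.random.randn(3, 3), index=list("abc"), columns=list("XYZ"))

# Cross join: every row of df1 is paired with every row of df2.
# All column names overlap, so each one receives a suffix.
pairs = df1.reset_index().merge(
    df2.reset_index(), how="cross", suffixes=("_1", "_2")
)
pairs = pairs.set_index(["index_1", "index_2"])

# Sum the two copies of each original column.
result = pd.DataFrame(
    {col: pairs[f"{col}_1"] + pairs[f"{col}_2"] for col in df1.columns}
)
print(result)
```

Each row of result is labelled by a pair such as (A, b) and holds the sum of row A of df1 and row b of df2.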
Answer 1 (score: 1)
You can first create a MultiIndex in both DataFrames with reindex, and then add the DataFrames:
np.random.seed(45)
df1=pd.DataFrame(np.random.randn(3,3),index=list("ABC"),columns=list("XYZ"))
df2=pd.DataFrame(np.random.randn(3,3),index=list("abc"),columns=list("XYZ"))
mux = pd.MultiIndex.from_product([df1.index, df2.index])
df1 = df1.reindex(mux, level=0)
df2 = df2.reindex(mux, level=1)
print (df1)
X Y Z
A a 0.026375 0.260322 -0.395146
b 0.026375 0.260322 -0.395146
c 0.026375 0.260322 -0.395146
B a -0.204301 -1.271633 -2.596879
b -0.204301 -1.271633 -2.596879
c -0.204301 -1.271633 -2.596879
C a 0.289681 -0.873305 0.394073
b 0.289681 -0.873305 0.394073
c 0.289681 -0.873305 0.394073
print (df2)
X Y Z
A a 0.935106 -0.015685 0.259596
b -1.473314 0.801927 -1.750752
c -0.495052 -1.008601 0.025244
B a 0.935106 -0.015685 0.259596
b -1.473314 0.801927 -1.750752
c -0.495052 -1.008601 0.025244
C a 0.935106 -0.015685 0.259596
b -1.473314 0.801927 -1.750752
c -0.495052 -1.008601 0.025244
df3 = df1.add(df2,axis="columns")
print (df3)
X Y Z
A a 0.961480 0.244637 -0.135550
b -1.446939 1.062248 -2.145898
c -0.468677 -0.748279 -0.369901
B a 0.730805 -1.287317 -2.337283
b -1.677615 -0.469706 -4.347631
c -0.699353 -2.280233 -2.571634
C a 1.224786 -0.888989 0.653669
b -1.183633 -0.071378 -1.356680
c -0.205371 -1.881905 0.419317
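As a cross-check on the result above, the same pairwise sums can be computed with NumPy broadcasting. This is an alternative technique rather than the answer's method, offered as a minimal sketch; the names values and result are illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(45)
df1 = pd.DataFrame(np.random.randn(3, 3), index=list("ABC"), columns=list("XYZ"))
df2 = pd.DataFrame(np.random.randn(3, 3), index=list("abc"), columns=list("XYZ"))

# Shape (3, 1, 3) + shape (1, 3, 3) broadcasts to (3, 3, 3):
# axis 0 walks df1 rows, axis 1 walks df2 rows, axis 2 walks columns.
values = df1.to_numpy()[:, None, :] + df2.to_numpy()[None, :, :]

# Flatten the first two axes; row-major order matches from_product.
mux = pd.MultiIndex.from_product([df1.index, df2.index])
result = pd.DataFrame(
    values.reshape(-1, df1.shape[1]), index=mux, columns=df1.columns
)
print(result)
```

This avoids building the two repeated intermediate frames, at the cost of dropping to raw arrays and rebuilding the labels by hand.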
Create ArtistsControllerTests.Setup():
public ArtistsControllerTests() {
_mockArtistsService.Reset();
_mockPermissionsService
.Setup(service => service.GetPermissionsAsync(It.IsAny<HttpContext>()))
.Returns(Task.FromResult(new Permissions { UserId = "112233", IsAdministrator = false }));
_mockArtistsService.Setup(service => service.GetAllArtists(It.IsAny<string>(), false)).Returns(new ArtistCardDtoCollection());
}