I'm trying to use the multiply operation with a MultiIndex.
import pandas as pd
import numpy as np
d = {'Alpha': [1,2,3,4,5,6,7,8,9]
,'Beta':tuple('ABCDEFGHI')
,'C': np.random.randint(1,10,9)
,'D': np.random.randint(100,200,9)
}
df = pd.DataFrame(d)
df.set_index(['Alpha','Beta'],inplace=True)
df = df.stack() #it's now a series
df.index.names = df.index.names[:-1] + ['Gamma']
ser = pd.Series(data = np.random.rand(9))
ser.index = pd.MultiIndex.from_tuples(list(zip(range(1,10),np.repeat('C',9))))
ser.index.names = ['Alpha','Gamma']
print(df)
print(ser)
foo = df.mul(ser,axis=0,level = ['Alpha','Gamma']) # this is the call that raises the error
So my DataFrame, once stacked into a Series, looks like this:
Alpha  Beta  Gamma
1      A     C          7
             D        188
2      B     C          7
             D        110
3      C     C          2
             D        124
4      D     C          4
             D        153
5      E     C          9
             D        178
6      F     C          6
             D        196
7      G     C          1
             D        156
8      H     C          1
             D        184
9      I     C          3
             D        169
And my Series looks like this:
Alpha  Gamma
1      C        0.8731
2      C        0.6347
3      C        0.4688
4      C        0.5623
5      C        0.4944
6      C        0.5234
7      C        0.9946
8      C        0.7815
9      C        0.1219
In my multiply operation I want to broadcast on the index levels 'Alpha' and 'Gamma', but I get this error message:
TypeError: Join on level between two MultiIndex objects is ambiguous
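Answer 0: (score: 2)
How about this? Perhaps the problem is caused by the extra 'Beta' level, which is in df but not in ser?
(Note: this uses df as updated in @Dickster's answer, not as in the original question.)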
df2 = df.reset_index().set_index(['Alpha','Gamma'])
df2[0].mul(ser)
Alpha  Gamma
1      C          2.503829
       D               NaN
2      C          5.028208
       D               NaN
3      C          0.842322
       D               NaN
4      C          0.198101
       D               NaN
5      C          0.800745
       D               NaN
6      C          1.936523
       D               NaN
7      C          2.507393
       D               NaN
8      C          4.846258
       D               NaN
9      C               NaN
       D        147.233378
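The result above no longer carries the 'Beta' level. As a minimal sketch of the same idea that puts 'Beta' back afterwards, the following uses small hand-picked values (hypothetical, not the random data from the question) and assumes each (Alpha, Gamma) pair occurs only once in df, as it does in the question:

```python
import pandas as pd

# Hypothetical, hand-picked values standing in for the random data above.
idx = pd.MultiIndex.from_tuples(
    [(1, 'A', 'C'), (1, 'A', 'D'), (2, 'B', 'C'), (2, 'B', 'D')],
    names=['Alpha', 'Beta', 'Gamma'],
)
df = pd.Series([7, 188, 7, 110], index=idx, name='value')

ser = pd.Series(
    [0.5, 0.25],
    index=pd.MultiIndex.from_tuples([(1, 'C'), (2, 'C')],
                                    names=['Alpha', 'Gamma']),
)

# Move only 'Beta' out of the index so the remaining levels match ser's.
flat = df.reset_index('Beta')

# Both indexes are now (Alpha, Gamma), so mul() aligns label by label;
# pairs that are missing from ser come out as NaN.
flat['value'] = flat['value'].mul(ser)

# Put 'Beta' back and restore the original level order.
result = (
    flat.set_index('Beta', append=True)['value']
        .reorder_levels(['Alpha', 'Beta', 'Gamma'])
)
print(result)
```

If (Alpha, Gamma) pairs can repeat across different 'Beta' values, the assignment back into `flat` would need more care, so treat this as a sketch for data shaped like the question's.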
Answer 1: (score: 1)
Suppose I have the following, where the Series "ser" now has a 'D' in Gamma:
import pandas as pd
import numpy as np
np.random.seed(1)
d = {'Alpha': [1,2,3,4,5,6,7,8,9]
,'Beta':tuple('ABCDEFGHI')
,'C': np.random.randint(1,10,9)
,'D': np.random.randint(100,200,9)
}
df = pd.DataFrame(d)
df.set_index(['Alpha','Beta'],inplace=True)
df = df.stack() #it's now a series
df.index.names = df.index.names[:-1] + ['Gamma']
ser = pd.Series(data = np.random.rand(9))
idx = list(np.repeat('C',8))
idx.append('D')
ser.index = pd.MultiIndex.from_tuples(list(zip(range(1,10),idx)))
ser.index.names = ['Alpha','Gamma']
print(df)
print(ser)
df_A = df.unstack('Alpha').mul(ser).stack('Alpha').reorder_levels(df.index.names)
print(df_A)
df_dickster77 = df.unstack('Alpha').mul(ser.unstack('Alpha')).stack('Alpha').reorder_levels(df.index.names)
print(df_dickster77)
The output looks like this:
Alpha  Beta  Gamma
1      A     C          6
             D        120
2      B     C          9
             D        118
3      C     C          6
             D        184
4      D     C          1
             D        111
5      E     C          1
             D        128
6      F     C          2
             D        129
7      G     C          8
             D        114
8      H     C          7
             D        150
9      I     C          3
             D        168
dtype: int32
Alpha  Gamma
1      C        0.417305
2      C        0.558690
3      C        0.140387
4      C        0.198101
5      C        0.800745
6      C        0.968262
7      C        0.313424
8      C        0.692323
9      D        0.876389
dtype: float64
Output A: unintended multiplication
Gamma                       C           D
Alpha Beta Gamma
1     A    C         2.503829         NaN
           D        50.076576         NaN
2     B    C         5.028208         NaN
           D        65.925400         NaN
3     C    C         0.842322         NaN
           D        25.831197         NaN
4     D    C         0.198101         NaN
           D        21.989265         NaN
5     E    C         0.800745         NaN
           D       102.495305         NaN
6     F    C         1.936523         NaN
           D       124.905743         NaN
7     G    C         2.507393         NaN
           D        35.730356         NaN
8     H    C         4.846258         NaN
           D       103.848392         NaN
9     I    C              NaN    2.629167
           D              NaN  147.233378
Output df_dickster77: the multiplication lines up correctly, C with C and D with D. However, the 8 D NaN rows and the 1 C NaN row are lost.
Alpha  Beta  Gamma
1      A     C          2.503829
2      B     C          5.028208
3      C     C          0.842322
4      D     C          0.198101
5      E     C          0.800745
6      F     C          1.936523
7      G     C          2.507393
8      H     C          4.846258
9      I     D        147.233378
dtype: float64
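If the NaN rows need to survive, one possible alternative (not the method used in this answer) is to look up a factor for every row of df and multiply by position. The sketch below uses hypothetical hand-picked values and assumes ser's (Alpha, Gamma) index is unique:

```python
import pandas as pd

# Hypothetical, hand-picked values; ser has one 'C' entry and one 'D' entry.
idx = pd.MultiIndex.from_tuples(
    [(1, 'A', 'C'), (1, 'A', 'D'), (2, 'B', 'C'), (2, 'B', 'D')],
    names=['Alpha', 'Beta', 'Gamma'],
)
df = pd.Series([6, 120, 9, 118], index=idx)

ser = pd.Series(
    [0.5, 0.25],
    index=pd.MultiIndex.from_tuples([(1, 'C'), (2, 'D')],
                                    names=['Alpha', 'Gamma']),
)

# One (Alpha, Gamma) key per row of df, in df's row order.
keys = df.index.droplevel('Beta')

# Look up the factor for every row; keys missing from ser become NaN.
factors = ser.reindex(keys)

# Multiply by position so df's full three-level index is preserved.
result = pd.Series(df.to_numpy() * factors.to_numpy(), index=df.index)
print(result)
```

Every row of df is kept here: rows with a matching (Alpha, Gamma) entry in ser are multiplied, and the rest are NaN rather than dropped.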
Answer 2: (score: 0)
This is the way to do it at the moment. At some point a more concise approach may be implemented.
In [21]: df.unstack('Alpha').mul(ser).stack('Alpha').reorder_levels(df.index.names)
Out[21]:
Gamma                       C
Alpha Beta Gamma
1     A    C         6.761867
           D       171.944612
2     B    C         0.154139
           D         6.371062
3     C    C         2.311870
           D        42.898041
4     D    C         0.390920
           D         9.479801
5     E    C         3.484439
           D        72.011743
6     F    C         0.740913
           D        50.382061
7     G    C         3.459497
           D        60.541203
8     H    C         0.467012
           D        19.030741
9     I    C         0.071290
           D        11.620286