Question

Consider the series:

import numpy as np
import pandas as pd

np.random.seed([3,1415])
s = pd.Series(np.random.rand(100),
              pd.MultiIndex.from_product([list('ABDCE'),
                                          list('abcde'),
                                          ['One', 'Two', 'Three', 'Four']]))

I can groupby a combination of index levels and get the idxmax:

s.groupby(level=[0, 2]).idxmax()

A  Four      (A, c, Four)
   One        (A, d, One)
   Three    (A, c, Three)
   Two        (A, d, Two)
B  Four      (B, d, Four)
   One        (B, d, One)
   Three    (B, c, Three)
   Two        (B, b, Two)
C  Four      (C, b, Four)
   One        (C, a, One)
   Three    (C, a, Three)
   Two        (C, e, Two)
D  Four      (D, b, Four)
   One        (D, e, One)
   Three    (D, b, Three)
   Two        (D, c, Two)
E  Four      (E, c, Four)
   One        (E, a, One)
   Three    (E, c, Three)
   Two        (E, a, Two)
dtype: object

I want the numeric position of each max within its group.

I can get the numeric positions via the awesome answers to this question:
s.groupby(level=[0, 2]).idxmax().apply(lambda x: s.index.get_loc(x))

A  Four     11
   One      12
   Three    10
   Two      13
B  Four     35
   One      32
   Three    30
   Two      25
C  Four     67
   One      60
   Three    62
   Two      77
D  Four     47
   One      56
   Three    46
   Two      49
E  Four     91
   One      80
   Three    90
   Two      81
dtype: int64

But I want this:

A  Four     2
   One      3
   Three    2
   Two      3
B  Four     3
   One      3
   Three    2
   Two      1
C  Four     1
   One      0
   Three    0
   Two      4
D  Four     1
   One      4
   Three    1
   Two      2
E  Four     2
   One      0
   Three    2
   Two      0
dtype: int64

Answer 1

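Well, I finally have a solution. It uses NumPy's reshape and then takes argmax along one of the axes. I am not sure how elegant it is, but I expect it to perform well. I am also assuming that the MultiIndex of the pandas Series has a regular format, i.e. every combination of level values is present, so each group has the same number of elements.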

Here's the implementation -

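# Assumes a regular MultiIndex: every combination of level values is present.
L0, L1, L2 = s.index.levels[:3]
# sort_index() replaces Series.sortlevel(), which newer pandas versions no longer provide.
IDs = s.sort_index().values.reshape(-1, len(L0), len(L1), len(L2)).argmax(2)
sOut = pd.Series(IDs.ravel(), pd.MultiIndex.from_product([L0, L2]))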
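To make the reshape idea easier to follow, here is a minimal sketch on a smaller series. The names `s_small`, `cube`, `reshaped` and `grouped` and the seed are mine, chosen only for illustration. It assumes, as above, that every combination of level values is present.

```python
import numpy as np
import pandas as pd

# Small, regular MultiIndex: every (level 0, level 1, level 2) combination exists.
idx = pd.MultiIndex.from_product([list('AB'), list('abc'), ['One', 'Two']])
rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
s_small = pd.Series(rng.random(len(idx)), index=idx)

L0, L1, L2 = s_small.index.levels

# Sort so the values are laid out in (L0, L1, L2) order, then view them as a 3-D cube.
cube = s_small.sort_index().to_numpy().reshape(len(L0), len(L1), len(L2))

# argmax along axis 1 (the middle level) is the position within each (L0, L2) group.
reshaped = pd.Series(cube.argmax(axis=1).ravel(),
                     index=pd.MultiIndex.from_product([L0, L2]))

# Cross-check against a plain groupby over the same two levels.
grouped = s_small.groupby(level=[0, 2]).apply(lambda x: int(np.argmax(x.to_numpy())))

print(reshaped)
print((reshaped.to_numpy() == grouped.to_numpy()).all())
```

Note that the positions from the reshape are relative to the sorted order of the middle level. They agree with the groupby result here because that level is already sorted within each group.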

Timing (added by pir)

Answer 2

Here is how I did it:

s.groupby(level=[0, 2]).apply(lambda x: x.index.get_loc(x.idxmax()))

A  Four     2
   One      3
   Three    2
   Two      3
B  Four     3
   One      3
   Three    2
   Two      1
C  Four     1
   One      0
   Three    0
   Two      4
D  Four     1
   One      4
   Three    1
   Two      2
E  Four     2
   One      0
   Three    2
   Two      0
dtype: int64
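As I understand it, this works because `get_loc` is called on each group's own index rather than on the index of the whole series. Here is a small sketch of the difference, using a hand-made two-level series (`s_demo` and its values are invented for illustration):

```python
import pandas as pd

idx = pd.MultiIndex.from_product([list('AB'), list('abc')])
s_demo = pd.Series([0.1, 0.9, 0.3, 0.7, 0.2, 0.4], index=idx)

# Label of the maximum in each level-0 group: (A, b) and (B, a).
labels = s_demo.groupby(level=0).idxmax()

# Looking the label up in the full index gives the position in the whole series.
global_pos = labels.apply(s_demo.index.get_loc)

# Looking it up in the group's own index gives the position inside the group.
local_pos = s_demo.groupby(level=0).apply(lambda x: x.index.get_loc(x.idxmax()))

print(global_pos.tolist())  # [1, 3]
print(local_pos.tolist())   # [1, 0]
```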

Return a Series of numeric positions of the index of the maximum value within each group
