Question

Let me start by saying I'm a Python beginner. I have this DataFrame:

import pandas as pd

df = pd.DataFrame({'countingVariable': ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a'], 'color': ['red', 'red', 'orange', 'yellow', 'yellow', 'orange', 'red', 'yellow', 'orange'], 'foods': ['apple', 'pepper', 'apple', 'apple', 'apple', 'pepper', 'pepper', 'apple', 'apple']})
b = df.groupby(['color', 'foods']).count().sort_values(['countingVariable', 'foods', 'color'], ascending=[False, False, False])

where b looks like this:

               countingVariable
color  foods                   
yellow apple                  3
red    pepper                 2
orange apple                  2
       pepper                 1
red    apple                  1

But I would like the output to look like this:

               countingVariable
color  foods                   
yellow apple                  3
red    pepper                 2
       apple                  1
orange apple                  2
       pepper                 1

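In other words, the program should find the highest count and put it at the top, together with the rest of the group it belongs to.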

Answer 1

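You need to .reindex on level 0 after your sort, so that groups are ordered by their highest count and rows are in descending order within each group. This works because unique values are returned in order of appearance (pd.unique preserves order), so after the sort the level-0 labels come out already ordered by each group's highest count.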

import pandas as pd

b = b.reindex(b.index.unique(level=0), level=0)

Output:

               countingVariable
color  foods                   
yellow apple                  3
red    pepper                 2
       apple                  1
orange apple                  2
       pepper                 1
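
For comparison, here is a minimal sketch of a different technique that makes the ordering explicit instead of relying on reindex: it stores each color's highest count in a helper column, sorts on that column first, and then sorts within each color. The names `counts`, `flat`, `group_max` and `ordered` are introduced here for illustration only. Breaking ties between colors in descending color order is an assumption, chosen to mirror the `ascending=False` in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'countingVariable': ['a'] * 9,
    'color': ['red', 'red', 'orange', 'yellow', 'yellow', 'orange', 'red', 'yellow', 'orange'],
    'foods': ['apple', 'pepper', 'apple', 'apple', 'apple', 'pepper', 'pepper', 'apple', 'apple'],
})

# Count rows per (color, foods) pair, then flatten the MultiIndex into columns.
counts = df.groupby(['color', 'foods']).count()
flat = counts.reset_index()

# Helper column (hypothetical name): the highest count within each color.
flat['group_max'] = flat.groupby('color')['countingVariable'].transform('max')

# Sort by the group's highest count, then by color (assumed tie-break, descending),
# then by count and food within each color.
ordered = (
    flat.sort_values(
        ['group_max', 'color', 'countingVariable', 'foods'],
        ascending=[False, False, False, False],
    )
    .drop(columns='group_max')
    .set_index(['color', 'foods'])
)

print(ordered)
# Row order: yellow/apple 3, red/pepper 2, red/apple 1, orange/apple 2, orange/pepper 1
```

Because every sort key is spelled out, the result does not depend on how ties are resolved internally, at the cost of a few more lines than the reindex one-liner.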

Answer 2

This is strange. You show the initial output as

print(b)
               countingVariable
color  foods                   
yellow apple                  3
red    pepper                 2
orange apple                  2
       pepper                 1
red    apple                  1

However, when I run your exact code, I get a different output:

df = pd.DataFrame({
  'countingVariable': ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a'],
  'color': ['red', 'red', 'orange', 'yellow', 'yellow', 'orange', 'orange', 'yellow', 'orange'],
  'foods': ['apple', 'pepper', 'apple', 'apple', 'apple', 'pepper', 'pepper', 'apple', 'apple']
    })
b = df.groupby(['color', 'foods']).count().sort_values(['countingVariable', 'foods', 'color'],
               ascending = [False, False, False])

print(b)
               countingVariable
color  foods                   
yellow apple                  3
orange pepper                 2
       apple                  2
red    pepper                 1
       apple                  1

This appears to be the output you actually want.

Edit

Perhaps the data you posted differs from the data you are actually using?
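
One way to check this is to compare the two `color` lists exactly as they appear in this thread. The sketch below uses plain Python. The variable names `question_colors`, `answer_colors` and `diff` are illustrative, and the lists are copied from the question's code and from the code in this answer.

```python
# 'color' list as posted in the question.
question_colors = ['red', 'red', 'orange', 'yellow', 'yellow', 'orange', 'red', 'yellow', 'orange']
# 'color' list as it appears in the code in this answer.
answer_colors = ['red', 'red', 'orange', 'yellow', 'yellow', 'orange', 'orange', 'yellow', 'orange']

# Collect (position, question value, answer value) wherever the lists disagree.
diff = [
    (i, q, a)
    for i, (q, a) in enumerate(zip(question_colors, answer_colors))
    if q != a
]
print(diff)  # [(6, 'red', 'orange')]
```

The lists differ only at position 6 ('red' in the question, 'orange' here), which would account for the different counts in the two printed outputs.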

Answer 3

This should do the trick:

df.groupby(['color', 'foods']).count().sort_values('countingVariable', ascending=False)

Output:

               countingVariable
color  foods                   
yellow apple                  3
orange apple                  2
       pepper                 2
red    apple                  1
       pepper                 1

How to do a "multi-index" groupby
