Question

I applied a group by on 4 columns of a DataFrame using the following code:

df = df_week.groupby(['Week/Year', 'host', 'table_type', 'table_name']).size()

I get data in this form:

Week/Year        host        table_type       table_name
31/2017          rb          abc              qrst              10
31/2017          gb          def              abcd              17
31/2017          rb          abc              lmno              8
32/2017          rb          abc              qrst              7
32/2017          gb          def              abcd              1
32/2017          rb          def              lmno              5
32/2017          rb          abc              tuvw              20
33/2017          gb          abc              qrst              19
33/2017          rb          def              lmno              21

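Now I want to sort the computed counts in the last column within each Week/Year group, i.e. within the 31/2017 group, all values in the count column should be sorted (largest first), and likewise for every other week.

For example, the expected output is: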

Week/Year        host        table_type       table_name
31/2017          gb          def              abcd              17
31/2017          rb          abc              qrst              10
31/2017          rb          abc              lmno              8
32/2017          rb          abc              tuvw              20
32/2017          rb          abc              qrst              7
32/2017          rb          def              lmno              5
32/2017          gb          def              abcd              1
33/2017          rb          def              lmno              21
33/2017          gb          abc              qrst              19

Answer 1

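The code below sorts by the Week/Year column in ascending order and by the size column in descending order. Note that sort_values returns a new DataFrame, so the result has to be assigned.

import pandas as pd

# Count rows per group, then turn the resulting Series into a DataFrame with a 'size' column
df = df_week.groupby(['Week/Year', 'host', 'table_type', 'table_name']).size().to_frame('size').reset_index()
# Weeks ascending, counts descending within each week
df = df.sort_values(['Week/Year', 'size'], ascending=[True, False])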
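As a quick check, the approach can be run against a small frame whose group sizes match the counts in the question. The construction of `df_week` here is an assumption made purely for illustration (the original raw data is not shown), and the names `counts`, `cols`, `sizes`, and `result` are hypothetical; only the `groupby`, `size`, `to_frame`, `reset_index`, and `sort_values` calls come from the answer.

```python
import pandas as pd

# Hypothetical reconstruction of the raw data: each key combination is
# repeated `n` times so that groupby(...).size() yields the question's counts.
counts = [
    ("31/2017", "rb", "abc", "qrst", 10),
    ("31/2017", "gb", "def", "abcd", 17),
    ("31/2017", "rb", "abc", "lmno", 8),
    ("32/2017", "rb", "abc", "qrst", 7),
    ("32/2017", "gb", "def", "abcd", 1),
    ("32/2017", "rb", "def", "lmno", 5),
    ("32/2017", "rb", "abc", "tuvw", 20),
    ("33/2017", "gb", "abc", "qrst", 19),
    ("33/2017", "rb", "def", "lmno", 21),
]
cols = ["Week/Year", "host", "table_type", "table_name"]

rows = []
for week, host, table_type, table_name, n in counts:
    rows.extend([(week, host, table_type, table_name)] * n)
df_week = pd.DataFrame(rows, columns=cols)

# Group sizes as a DataFrame with an explicit 'size' column.
sizes = df_week.groupby(cols).size().to_frame("size").reset_index()

# Weeks ascending, counts descending within each week.
result = sizes.sort_values(
    ["Week/Year", "size"], ascending=[True, False]
).reset_index(drop=True)

print(result)
```

One caveat worth keeping in mind: `Week/Year` is compared as a string here, so weeks are ordered lexicographically. That matches the sample, where every week has two digits and falls in 2017, but data spanning several years would probably need a different sort key.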

Answer 2

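If you want to keep the data as a Series object, sort the values first, then sort the index on the Week/Year level only. Pass sort_remaining=False; by default sort_index also sorts the remaining index levels, which would undo the ordering by count:

df.sort_values(ascending=False).sort_index(level='Week/Year', sort_remaining=False)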
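A minimal sketch of the Series variant, assuming a hand-built Series that stands in for the output of `groupby(...).size()` (the names `index`, `sizes`, `by_count`, and `result` are hypothetical). It relies on the level sort keeping the existing order of rows within each week, which appears to hold in current pandas versions but is worth verifying on your own installation.

```python
import pandas as pd

# Hypothetical stand-in for the grouped sizes, using the counts from the question.
index = pd.MultiIndex.from_tuples(
    [
        ("31/2017", "rb", "abc", "qrst"),
        ("31/2017", "gb", "def", "abcd"),
        ("31/2017", "rb", "abc", "lmno"),
        ("32/2017", "rb", "abc", "qrst"),
        ("32/2017", "gb", "def", "abcd"),
        ("32/2017", "rb", "def", "lmno"),
        ("32/2017", "rb", "abc", "tuvw"),
        ("33/2017", "gb", "abc", "qrst"),
        ("33/2017", "rb", "def", "lmno"),
    ],
    names=["Week/Year", "host", "table_type", "table_name"],
)
sizes = pd.Series([10, 17, 8, 7, 1, 5, 20, 19, 21], index=index)

# Step 1: order every row by count, largest first.
by_count = sizes.sort_values(ascending=False)

# Step 2: regroup by week without touching the other index levels,
# so the count ordering from step 1 is kept inside each week.
result = by_count.sort_index(level="Week/Year", sort_remaining=False)

print(result)
```

If the within-week order matters and you would rather not depend on that behavior, converting to a DataFrame and sorting on both columns explicitly, as in Answer 1, is the more defensive choice.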
