Question

>>> df
   A   B   C      D
0  foo one small  1
1  foo one large  2
2  foo one large  2
3  foo two small  3
4  foo two small  3
5  bar one large  4
6  bar one small  5
7  bar two small  6
8  bar two large  7
>>> table = pivot_table(df, values='D', index=['A', 'B'],
...                     columns=['C'], aggfunc=np.sum)
>>> table
          small  large
foo  one  1      4
     two  6      NaN
bar  one  5      4
     two  6      7

I want the output to look like the above, but I get a sorted output instead: bar comes above foo, and so on.

Answer 1

I don't think pivot_table has a sort option (pandas 1.3 and later do add a sort parameter to pivot_table), but groupby has one:

df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().unstack('C')
Out: 
C        small  large
A   B                
foo one    1.0    4.0
    two    6.0    NaN
bar one    5.0    4.0
    two    6.0    7.0

You pass the grouping columns to groupby, and call unstack on the ones you want displayed as column values.

If you don't want the index names, rename them to None:

df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().rename_axis([None, None, None]).unstack(level=2)
Out: 
         small  large
foo one    1.0    4.0
    two    6.0    NaN
bar one    5.0    4.0
    two    6.0    7.0
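
For reference, the groupby approach above can be written as a self-contained sketch. The DataFrame is rebuilt from the question's data, and the helper name ordered_pivot is hypothetical, used only for illustration. It assumes a reasonably recent pandas, where groupby(sort=False) keeps the groups in order of first appearance.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})


def ordered_pivot(frame):
    """Hypothetical helper: sum D per (A, B, C) without sorting the group keys."""
    grouped = frame.groupby(["A", "B", "C"], sort=False)["D"].sum()
    # Move the C level from the row index into the columns.
    return grouped.unstack("C")


result = ordered_pivot(df)
print(result)
```

Because no sorting is requested, the row and column order follows the order in which the values first appear in df, so the result depends on how the input happens to be ordered.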

Answer 2

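When you create a pivot_table, the index is automatically sorted alphabetically. Besides foo and bar, you may notice that small and large are sorted as well. If you want foo on top, you need to sort the result again, for example with sort_index (the sortlevel method originally used in this answer was deprecated and later removed from pandas). If you expect the output as in the example here, you may need to sort on both A and C:

# A descending (foo before bar), B ascending (one before two)
table.sort_index(level=["A", "B"], ascending=[False, True], sort_remaining=False, inplace=True)
# C descending (small before large)
table.sort_index(axis=1, ascending=False, inplace=True)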
print(table)

Output:

C        small  large
A   B                
foo one  1.0    4.0  
    two  6.0    NaN   
bar one  5.0    4.0  
    two  6.0    7.0

Update

To remove the index names A, B and C:

table.columns.name = None
table.index.names = (None, None)
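
The sort-after-pivot approach can also be sketched end to end. This is a minimal sketch, assuming a pandas version in which sort_index accepts level names and a list for ascending. The DataFrame is rebuilt from the question's data. Note that the descending sort on A and C only happens to reproduce the question's order for this data; it does not preserve an arbitrary original order.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# pivot_table sorts the row and column labels alphabetically by default.
table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")

# Rows: A descending (foo before bar), B ascending (one before two).
table = table.sort_index(level=["A", "B"], ascending=[False, True],
                         sort_remaining=False)
# Columns: C descending (small before large).
table = table.sort_index(axis=1, ascending=False)

# Drop the axis names A, B and C.
table.columns.name = None
table.index.names = [None, None]
print(table)
```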

Pandas pivot_table: preserving order
