Question

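See the image of the table setup (original table). In a Jupyter notebook, I want to summarize each store's total order quantity by style number, color, and size, as shown below.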

NO.  STYLE     STORE  COLOR  UNITS  M  L  XL  2XL  TOTAL
1    JIL25011  16     NAVY   2      2              2
     JIL25012  16     NUDE   3      3              3
     JIL25013  16     WHITE  3      3              3
     JIL25012  16     BLACK  6         2  2   2    6
     JIL25012  16     NUDE   4            2   2    4
2    JIL25013  17     NUDE   3      3              3
3    JIL25011  18     WHITE  4      2  2           4
     JIL90008  18     WHITE  3      3              3
4    JIL25011  52     BLACK  2      2              2

I used the code below.

df1 = pd.pivot_table(df, values=['Store16','Store17','Store18','Store52','Store53','Store59','Store60','Store61','Store62','Store63','Store64','Store65','Store68','Store70','Store72','Store74','Store75'],index=['StyleNo','Color','Size'],aggfunc=np.sum)

It gave this result:

[image: post pivot table]

How do I pivot this correctly?

Answer 1

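Next time, please copy and paste your data instead of posting an image. Without access to the actual data or a sample, I had to create my own, but this should point you in the right direction: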

import pandas as pd
import numpy as np

# Sample data: one row per style/color/size, one column per store
df = pd.DataFrame({'style no.':['foo1','foo1','foo2'],'color':['black','black','blue'],
                   'units':['S','M','L'],'store_1':[5,10,15],'store_2':[0,2,3],'store_3':[1,10,0]},
                    columns=['style no.','color','units','store_1','store_2','store_3'])

# Unpivot the store columns into rows (wide to long)
df1 = df.melt(id_vars=['style no.', 'color','units'],
                       value_vars=['store_1', 'store_2','store_3'], 
                       var_name='store', value_name='total')

df2 = df1.sort_values(by=['style no.','color'])

# Pivot the sizes into columns, keeping the store in the index
df3 = df2.pivot_table(values='total', index=['style no.', 'color','store'],
                      columns='units', aggfunc='first')

# Row total across all size columns (NaN values are skipped)
df3['total'] = df3.sum(axis=1)

# Show missing size/store combinations as 0.0
df3.replace(np.nan,0.0)

Out:

                    units   L      M    S     total
style no.   color   store               
foo1        black   store_1 0.0   10.0  5.0   15.0
                    store_2 0.0   2.0   0.0   2.0
                    store_3 0.0   10.0  1.0   11.0
foo2        blue    store_1 15.0  0.0   0.0   15.0
                    store_2 3.0   0.0   0.0   3.0
                    store_3 0.0   0.0   0.0   0.0
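
For the layout in the question, the same melt-then-pivot idea might look roughly like the sketch below. The column names (StyleNo, Color, Size, Store16, ...) are taken from the question's code, but the values are made up, and long_df, summary, and store_cols are just illustrative names. It also assumes every store column starts with "Store", and uses aggfunc='sum' with fill_value=0 rather than 'first', which is a swap on my part so duplicate rows get added up and gaps show as 0.

```python
import pandas as pd

# Hypothetical sample in the asker's wide layout: one column per store.
df = pd.DataFrame({
    'StyleNo': ['JIL25011', 'JIL25011', 'JIL25012'],
    'Color':   ['NAVY', 'NAVY', 'BLACK'],
    'Size':    ['M', 'L', 'L'],
    'Store16': [2, 0, 2],
    'Store17': [0, 1, 0],
})

# Assumption: every store column name starts with 'Store'.
store_cols = [c for c in df.columns if c.startswith('Store')]

# Wide to long: one row per style/color/size/store.
long_df = df.melt(id_vars=['StyleNo', 'Color', 'Size'],
                  value_vars=store_cols,
                  var_name='Store', value_name='Units')

# Sizes become columns; store, style and color stay in the index.
summary = long_df.pivot_table(values='Units',
                              index=['Store', 'StyleNo', 'Color'],
                              columns='Size',
                              aggfunc='sum',
                              fill_value=0)

# Total units per store/style/color across all sizes.
summary['Total'] = summary.sum(axis=1)
print(summary)
```

Putting 'Store' first in the index groups the rows by store, which is closer to the table in the question.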

If needed, you can always reset the index:

Replace

df3.replace(np.nan,0.0)

with

df3.reset_index().replace(np.nan,0.0)
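
To see what that does, here is a small self-contained sketch. The two-row frame is a hand-built stand-in for the pivoted result (hypothetical values), and flat is just an illustrative name. reset_index() moves the index levels back into ordinary columns.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for a pivoted frame with a three-level index.
idx = pd.MultiIndex.from_tuples(
    [('foo1', 'black', 'store_1'), ('foo2', 'blue', 'store_1')],
    names=['style no.', 'color', 'store'])
df3 = pd.DataFrame({'L': [np.nan, 15.0], 'M': [10.0, np.nan]}, index=idx)

# Index levels become regular columns; NaN is shown as 0.0.
flat = df3.reset_index().replace(np.nan, 0.0)
print(flat)
```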
