pd.Grouper on a datetime column cannot be used in pandas pivot_table when margins=True
Here is my code, which works as expected:
p = df.pivot_table(values='Qty', index=['ItemCode', 'LineItem'], columns=pd.Grouper(key='Date', freq='W'), aggfunc=np.sum, fill_value=0)
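To make the working call concrete, here is a minimal self-contained sketch on a tiny made-up dataset (the four rows are illustrative, not the asker's data). It uses aggfunc='sum' rather than np.sum, which should behave the same here and avoids a deprecation warning in recent pandas versions. With freq='W', each column is labelled by the Sunday that ends the week.

```python
import pandas as pd

# Tiny made-up dataset (illustrative values only)
df = pd.DataFrame({
    'ItemCode': ['A', 'A', 'B', 'B'],
    'LineItem': ['a', 'a', 'a', 'b'],
    'Date': pd.to_datetime(['2020-05-04', '2020-05-06', '2020-05-12', '2020-05-13']),
    'Qty': [1, 2, 3, 4],
})

# Weekly columns via pd.Grouper, without margins
p = df.pivot_table(values='Qty', index=['ItemCode', 'LineItem'],
                   columns=pd.Grouper(key='Date', freq='W'),
                   aggfunc='sum', fill_value=0)
print(p)
```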
But if I add margins=True to get subtotals, I get this error:
KeyError: "[TimeGrouper(key='in time', freq=, axis=0, sort=True, closed='left', label='left', how='mean', convention='e', base=0)] not in index"
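Answer 0 (score: 1)
This looks strange! I don't know what causes pivot_table to use the TimeGrouper object itself as an index key. It looks like a bug, but I'm not sure. In any case, I don't think pivot_table can compute margins for sub-indexes, so here is a solution using groupby:
Sample data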
import pandas as pd
from random import randint, choice
from string import ascii_letters
# Say we have a dataframe with 500 rows and 20 different items
df_len = range(500)
item_codes = [''.join([choice(ascii_letters) for _ in range(10)]) for __ in range(20)]
df = pd.DataFrame({
'ItemCode': [choice(item_codes) for __ in df_len],
'Date': [pd.Timestamp.today() - pd.Timedelta(randint(0, 28), 'D') for _ in df_len],  # pd.datetime was removed in pandas 2.0
'Qty': [randint(1,10) for _ in df_len],
'LineItem': [choice(('a', 'b', 'c')) for _ in df_len],
})
df.head()
ItemCode Date Qty LineItem
0 IFaEmWGHTJ 2020-05-21 13:29:56.687412 8 a
1 jvLqoLfBcd 2020-05-23 13:29:56.687509 6 a
2 GOPFJEoSUm 2020-05-13 13:29:56.687550 1 a
3 qJqzzgDTaa 2020-05-03 13:29:56.687575 5 a
4 BCvRrgcpFD 2020-05-24 13:29:56.690114 8 b
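Solution
# Group by item, line item and calendar week, then pivot the weeks into columns.
# Note: .count() counts rows per group; use .sum() instead to total Qty,
# which is what aggfunc=np.sum does in the question.
res = (df.groupby(['ItemCode', 'LineItem', pd.Grouper(key='Date', freq='W')])['Qty']
         .count()
         .unstack()
         .fillna(0))
# Append the totals row first so that row_total also covers it
res.loc[('column_total', ''), :] = res.sum(axis=0)
res.loc[:, 'row_total'] = res.sum(axis=1)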
Result
| | 2020-05-03 | 2020-05-10 | 2020-05-17 | 2020-05-24 | 2020-05-31 | row_total |
|:---------------------|-------------:|-------------:|-------------:|-------------:|-------------:|------------:|
| ('CtdClujjRF', 'a') | 1 | 2 | 2 | 0 | 0 | 5 |
| ('CtdClujjRF', 'b') | 0 | 3 | 1 | 1 | 1 | 6 |
| ('CtdClujjRF', 'c') | 1 | 1 | 2 | 2 | 1 | 7 |
| ('DnQcEbHoVL', 'a') | 0 | 2 | 1 | 1 | 1 | 5 |
| ('DnQcEbHoVL', 'b') | 1 | 1 | 1 | 2 | 2 | 7 |
| ... | ... | ... | ... | ... | ... | ... |
| ('sxFnkCcSJu', 'c') | 0 | 2 | 2 | 3 | 0 | 7 |
| ('vOaWNHgOgm', 'a') | 0 | 5 | 1 | 7 | 1 | 14 |
| ('vOaWNHgOgm', 'b') | 1 | 0 | 1 | 3 | 4 | 9 |
| ('vOaWNHgOgm', 'c') | 1 | 2 | 2 | 5 | 1 | 11 |
| ('column_total', '') | 64 | 128 | 115 | 127 | 66 | 500 |
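As an alternative to the groupby solution above (not part of the original answer), one can sidestep pd.Grouper entirely by precomputing a week label as an ordinary column, so that pivot_table only sees plain column names and margins=True can be used. This is a minimal sketch on made-up data. The 'Week' column name and the 'Total' margin label are assumptions chosen for illustration. The labels are the Sunday ending each week, formatted as strings, which should correspond to the bins produced by freq='W'.

```python
import pandas as pd

# Tiny made-up dataset (illustrative values only)
df = pd.DataFrame({
    'ItemCode': ['A', 'A', 'B', 'B'],
    'LineItem': ['a', 'a', 'a', 'b'],
    'Date': pd.to_datetime(['2020-05-04', '2020-05-06', '2020-05-12', '2020-05-13']),
    'Qty': [1, 2, 3, 4],
})

# Hypothetical helper column: the Sunday that ends each row's week, as a string
df['Week'] = df['Date'].dt.to_period('W').dt.end_time.dt.strftime('%Y-%m-%d')

# With a plain column instead of a Grouper, margins can be requested directly
p = df.pivot_table(values='Qty', index=['ItemCode', 'LineItem'], columns='Week',
                   aggfunc='sum', fill_value=0,
                   margins=True, margins_name='Total')
print(p)
```

The totals row is keyed as ('Total', '') because the row index has two levels. String labels are used for the weeks so that the margin column name sits alongside values of the same type.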