How do I change the bin size when grouping data by range?

Asked: 2017-01-31 21:57:37

Tags: python pandas

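My question relates to the solution from the other question. I would like to know how to change the bin size from 3 to 5, 10, or any other value. Changing step alone is not enough: I would also have to change the expression (str(int(cat[1:3])) + "-" + str(int(cat[5:7])-1), and that is the part I cannot get right. I get the error ValueError: invalid literal for int() with base 10: '18, '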

step=3
kwargs = dict(include_lowest=True, right=False)
bins = pd.cut(df.AVG_PERCENT_EVAL_1, bins=np.arange(18,40+step,step), **kwargs)
labels = [(str(int(cat[1:3])) + "-" + str(int(cat[5:7])-1)) for cat in bins.cat.categories]
bins.cat.categories = labels

df = df.assign(AVG_PERCENT_RANGE=bins).drop("AVG_PERCENT_EVAL_1", axis=1)
df.groupby(['GROUP', 'AVG_PERCENT_RANGE'], as_index=False).agg('mean')
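
For reference, slicing fixed character positions out of the category text is fragile. On newer pandas versions (0.20 and later, as far as I know) the categories returned by pd.cut are Interval objects rather than strings, so the edges can be read from .left and .right instead. A minimal sketch, assuming integer bin edges and using made-up sample values (values, binned and interval_labels are illustrative names, not from the original code):

```python
import numpy as np
import pandas as pd

# Made-up sample values, only to make the sketch runnable
values = pd.Series([18.5, 24.0, 31.2, 39.9])

step = 5
binned = pd.cut(values, bins=np.arange(18, 40 + step, step), right=False)

# Each category is a left-closed pandas Interval such as [18, 23),
# so the label "18-22" comes from its edges, not from string positions
interval_labels = ['{}-{}'.format(int(iv.left), int(iv.right) - 1)
                   for iv in binned.cat.categories]

# rename_categories returns a new Series with the readable labels
binned = binned.cat.rename_categories(interval_labels)
```

Because the labels are computed from the interval edges, the same lines keep working when step is changed to 10 or any other integer.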

1 answer:

Answer 0 (score: 1)

Is this what you want?

In [166]: %paste
step=5
kwargs = dict(include_lowest=True, right=False)
bins=np.arange(18,40+step,step)
labels = ['{}-{}'.format(i, i+step-1) for i in bins][:-1]

df['AVG_PERCENT_RANGE'] = pd.cut(df.pop('AVG_PERCENT_EVAL_1'),
                                 bins=bins, labels=labels, **kwargs)
df.groupby(['GROUP', 'AVG_PERCENT_RANGE'], as_index=False).agg('mean')
## -- End pasted text --
Out[166]:
   GROUP AVG_PERCENT_RANGE  AVG_PERCENT_NEGATIVE  AVG_TOTAL_WAIT_TIME  AVG_TOTAL_SERVICE_TIME
0  AAAAA             18-22              6.500000            85.682099              247.880659
1  AAAAA             23-27              0.833333           103.445112              314.336474
2  AAAAA             28-32                   NaN                  NaN                     NaN
3  AAAAA             33-37                   NaN                  NaN                     NaN
4  AAAAA             38-42                   NaN                  NaN                     NaN
5  BBBBB             18-22              0.777778            63.500619              242.510146
6  BBBBB             23-27              2.000000           103.796290              313.685358
7  BBBBB             28-32                   NaN                  NaN                     NaN
8  BBBBB             33-37                   NaN                  NaN                     NaN
9  BBBBB             38-42                   NaN                  NaN                     NaN
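
The same idea can be written as a self-contained script for any step. This is only a sketch under stated assumptions: the sample rows are made up, make_range_labels is a hypothetical helper name, and observed=True is added to drop the empty bins that show up as NaN rows in the output above, which may or may not be wanted.

```python
import numpy as np
import pandas as pd

def make_range_labels(edges):
    """Build 'lo-hi' labels for left-closed integer bins, e.g. [18, 28) -> '18-27'."""
    return ['{}-{}'.format(lo, hi - 1) for lo, hi in zip(edges[:-1], edges[1:])]

step = 10
edges = np.arange(18, 40 + step, step)   # edges 18, 28, 38, 48
labels = make_range_labels(edges)        # one label per bin

# Made-up sample rows, only to make the sketch runnable
df = pd.DataFrame({
    'GROUP': ['AAAAA', 'AAAAA', 'BBBBB'],
    'AVG_PERCENT_EVAL_1': [19.0, 30.5, 40.0],
    'AVG_TOTAL_WAIT_TIME': [80.0, 100.0, 60.0],
})

# right=False makes every bin left-closed: [18, 28), [28, 38), [38, 48)
df['AVG_PERCENT_RANGE'] = pd.cut(df.pop('AVG_PERCENT_EVAL_1'),
                                 bins=edges, labels=labels, right=False)

# observed=True keeps only the GROUP/range combinations present in the data
result = (df.groupby(['GROUP', 'AVG_PERCENT_RANGE'], observed=True)
            ['AVG_TOTAL_WAIT_TIME'].mean()
            .reset_index())
```

Pairing consecutive edges with zip avoids the trailing [:-1] slice and keeps the labels correct if the edges are ever unevenly spaced.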