How do I count the values in each bucket with Pandas qcut?

Time: 2016-07-21 11:58:12

Tags: python pandas buckets discretization quartile

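I am using Pandas' qcut to prepare my data for a machine learning algorithm. I have products with prices, and I use this code to discretize my data into equal-sized buckets: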

df['PriceBucket'] = pd.qcut(df['sell_prix'].sort_values(), 10, labels=False)

This code gives more detail about my labels (the interval covered by each bucket):

df['PriceBucketTitle'] = pd.qcut(df['sell_prix'].sort_values(), 10)

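As shown below, I get PriceBucket and PriceBucketTitle, and that works perfectly! Now I want to know how many elements fall into each bucket. This code returns NaN values (as shown below):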

df['products_by_number'] = pd.qcut(df['sell_prix'], 10, labels=False).value_counts()

I know that a groupby on PriceBucket would probably work, but I want to keep my data in its current format. Here is the result:

      sell_prix PriceBucket PriceBucketTitle    products_by_number
4668    8.0          2         (6.5, 8.5]            NaN
4669    8.0          2         (6.5, 8.5]            NaN
4670    8.0          2         (6.5, 8.5]            NaN
4671    8.0          2         (6.5, 8.5]            NaN
4672    8.0          2         (6.5, 8.5]            NaN
4673    8.0          2         (6.5, 8.5]            NaN
4674    8.0          2         (6.5, 8.5]            NaN
4675    8.0          2         (6.5, 8.5]            NaN
4676    8.0          2         (6.5, 8.5]            NaN
4677    8.0          2         (6.5, 8.5]            NaN
11902   15.0         5         (12.9, 15]            NaN
11903   15.0         5         (12.9, 15]            NaN
11904   15.0         5         (12.9, 15]            NaN
11905   15.0         5         (12.9, 15]            NaN
11906   15.0         5         (12.9, 15]            NaN
11907   15.0         5         (12.9, 15]            NaN
11908   15.0         5         (12.9, 15]            NaN
11909   15.0         5         (12.9, 15]            NaN
11910   15.0         5         (12.9, 15]            NaN
11911   15.0         5         (12.9, 15]            NaN
12065   11.0         4         (10, 12.9]            NaN
12066   11.0         4         (10, 12.9]            NaN
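
A likely explanation for the NaN column, sketched on made-up data: value_counts() returns a Series indexed by bucket number, while column assignment aligns on the DataFrame's row labels. The toy prices and the 3-bucket split below are assumptions for illustration only; just the column names come from the post.

```python
import pandas as pd

# Hypothetical toy data; the row labels mimic the large index values in the post.
df = pd.DataFrame(
    {"sell_prix": [8.0, 8.0, 15.0, 15.0, 11.0, 11.0]},
    index=[4668, 4669, 11902, 11903, 12065, 12066],
)

# Bucket number for every row (3 buckets here instead of 10, to suit the toy data).
buckets = pd.qcut(df["sell_prix"], 3, labels=False)

# value_counts() is indexed by bucket number (0, 1, 2), not by row label.
counts = buckets.value_counts()

# Assignment aligns on the index: no row label equals a bucket number,
# so every row of the new column ends up as NaN.
df["products_by_number"] = counts
print(df)
```

If this is what happens in the real data, the counts themselves are fine; they just need to be looked up by bucket number rather than by row label.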

For example, this is what I want:

      sell_prix PriceBucket PriceBucketTitle    products_by_number
4668    8.0          2         (6.5, 8.5]            984546.0
4669    8.0          2         (6.5, 8.5]            984546.0
4670    8.0          2         (6.5, 8.5]            984546.0
4671    8.0          2         (6.5, 8.5]            984546.0
4672    8.0          2         (6.5, 8.5]            984546.0
4673    8.0          2         (6.5, 8.5]            984546.0
4674    8.0          2         (6.5, 8.5]            984546.0
4675    8.0          2         (6.5, 8.5]            984546.0
4676    8.0          2         (6.5, 8.5]            984546.0
4677    8.0          2         (6.5, 8.5]            984546.0
11902   15.0         5         (12.9, 15]            1028141.0
11903   15.0         5         (12.9, 15]            1028141.0
11904   15.0         5         (12.9, 15]            1028141.0
11905   15.0         5         (12.9, 15]            1028141.0
11906   15.0         5         (12.9, 15]            1028141.0
11907   15.0         5         (12.9, 15]            1028141.0
11908   15.0         5         (12.9, 15]            1028141.0
11909   15.0         5         (12.9, 15]            1028141.0
11910   15.0         5         (12.9, 15]            1028141.0
11911   15.0         5         (12.9, 15]            1028141.0
12065   11.0         4         (10, 12.9]            48998.0
12066   11.0         4         (10, 12.9]            48998.0
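
One way this output might be produced without changing the shape of the data is a groupby followed by transform, which broadcasts each bucket's count back onto its rows. This is a minimal sketch on made-up prices with 3 buckets, not a tested solution for the real data set; the variable via_map is a hypothetical name.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the post.
df = pd.DataFrame(
    {"sell_prix": [8.0, 8.0, 15.0, 15.0, 11.0, 11.0]},
    index=[4668, 4669, 11902, 11903, 12065, 12066],
)

# Bucket number per row (3 buckets to suit the toy data; the post uses 10).
df["PriceBucket"] = pd.qcut(df["sell_prix"], 3, labels=False)

# transform("count") returns one value per original row, so the
# DataFrame keeps its index and row order.
df["products_by_number"] = (
    df.groupby("PriceBucket")["sell_prix"].transform("count")
)

# Alternative: look up each row's bucket number in the value_counts() result.
via_map = df["PriceBucket"].map(df["PriceBucket"].value_counts())

print(df)
```

Both variants return a Series aligned to the original rows, which is why the assignment does not produce NaN here.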

Any help? Thanks!

0 answers:

No answers