How to bin data by frequency

Time: 2018-05-21 04:25:48

Tags: python pandas dataframe pivot

This is my data:

id   data
1      89
2      54
3      45
4      67
5      78
6      80

This is the kind of output I want:

Interval    Count
45 - 54         2
67 - 78         2
80 - 89         2

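I want the data to be spread evenly across the intervals, so that each interval holds about the same number of values.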

1 Answer:

Answer 0: (score: 1)

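pandas has a function called qcut() that does what you want: it bins values by quantile, so each bin receives roughly the same number of rows. Just pass in the data column: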
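In []:
import pandas as pd

# Rebuild the sample data from the question
df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6], 'data': [89, 54, 45, 67, 78, 80]})
qc = pd.qcut(df['data'], q=3, precision=0)
qc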

Out[]:
0    (79.0, 89.0]
1    (44.0, 63.0]
2    (44.0, 63.0]
3    (63.0, 79.0]
4    (63.0, 79.0]
5    (79.0, 89.0]
Name: data, dtype: category
Categories (3, interval[float64]): [(44.0, 63.0] < (63.0, 79.0] < (79.0, 89.0]]
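
To see why qcut() rather than cut() fits the "more evenly distributed" requirement, here is a minimal sketch comparing the two on the question's values. It assumes only the six numbers shown in the question; the variable names are illustrative.

```python
import pandas as pd

# The six values from the question
data = pd.Series([89, 54, 45, 67, 78, 80], name="data")

# pd.cut splits the value range into 3 equal-width bins,
# so the number of rows per bin can differ.
equal_width = pd.cut(data, bins=3).value_counts().sort_index()

# pd.qcut places the bin edges at quantiles,
# so each bin gets roughly the same number of rows.
equal_freq = pd.qcut(data, q=3).value_counts().sort_index()

print(equal_width.tolist())  # [2, 1, 3]
print(equal_freq.tolist())   # [2, 2, 2]
```

With equal-width bins the middle bin holds only one value, while the quantile-based bins each hold two.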

You can use qc.value_counts() to get the counts:

In []:
qc.value_counts().sort_index()

Out[]:
(44.0, 63.0]    2
(63.0, 79.0]    2
(79.0, 89.0]    2
Name: data, dtype: int64
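
The interval labels above are the quantile edges, not the "45 - 54" style shown in the question. If the labels should instead be the smallest and largest value actually present in each bin, one possible approach (a sketch, not part of the original answer) is to group by the bin number and aggregate. The names bins, summary, and result are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6],
                   "data": [89, 54, 45, 67, 78, 80]})

# labels=False makes qcut return the integer bin number (0, 1, 2)
# for each row instead of an Interval object.
bins = pd.qcut(df["data"], q=3, labels=False)

# For each bin, find the smallest and largest value and the row count.
summary = df.groupby(bins)["data"].agg(["min", "max", "count"])

# Build the "min - max" label the question asked for.
result = pd.DataFrame({
    "Interval": summary["min"].astype(str) + " - " + summary["max"].astype(str),
    "Count": summary["count"],
}).reset_index(drop=True)

print(result.to_string(index=False))
```

For the sample data this yields the intervals 45 - 54, 67 - 78, and 80 - 89, each with a count of 2.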