How do I use pandas.cut to bin my data?

Time: 2014-04-30 01:07:27

Tags: python pandas

raw_data='''
82   68   86   94   89   63   77   76   84    89
75   78   81   82   76   99   80   84   89    88
60   83   72   83   85   56   86   68   75   100
90   84   75   86   74   77   95   63   80    76
100   43  76   81   79   74   96   52   69    86'''

How can I use pandas.cut to bin this data and output the result as a pandas.DataFrame in the following format?

  interval numbers
1   (0,60]       4
2  (60,70]       5
3  (70,80]      16
4  (80,90]      19
5 (90,100]       6

1 answer:

Answer 0 (score: 4)

You can cut the data into bins and then call describe on the result:

>>> import pandas as pd
>>> nums = pd.Series(raw_data.split(), dtype=int)
>>> ncut = pd.cut(nums, [0, 60, 70, 80, 90, 100])
>>> d = ncut.describe()
>>> d
           counts  freqs
levels                  
(0, 60]         4   0.08
(60, 70]        5   0.10
(70, 80]       16   0.32
(80, 90]       19   0.38
(90, 100]       6   0.12

[5 rows x 2 columns]

Or, if you are very particular about the exact column names:

>>> d = d.reset_index().drop("freqs", axis=1)
>>> d = d.rename(columns={"levels": "interval", "counts": "numbers"})
>>> d
    interval  numbers
0    (0, 60]        4
1   (60, 70]        5
2   (70, 80]       16
3   (80, 90]       19
4  (90, 100]        6

[5 rows x 2 columns]
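
The describe output above comes from the pandas release current when this was answered. On more recent versions, describe on a categorical Series returns summary statistics (count, unique, top, freq) rather than per-bin counts, so the "freqs" and "levels" names may not exist. A minimal sketch of the same idea using value_counts instead, assuming a reasonably recent pandas; the names bins and result are only illustrative and not part of the original answer:

```python
import pandas as pd

raw_data = '''
82   68   86   94   89   63   77   76   84    89
75   78   81   82   76   99   80   84   89    88
60   83   72   83   85   56   86   68   75   100
90   84   75   86   74   77   95   63   80    76
100   43  76   81   79   74   96   52   69    86'''

# Parse the whitespace-separated scores into an integer Series.
nums = pd.Series([int(x) for x in raw_data.split()])

# Bin edges; pd.cut builds right-closed intervals (0, 60], (60, 70], ...
bins = [0, 60, 70, 80, 90, 100]
ncut = pd.cut(nums, bins)

# Count values per bin, then order the rows by interval rather than by count.
counts = ncut.value_counts().sort_index()

# Turn the counts into a two-column DataFrame with the requested column names.
result = counts.rename_axis("interval").reset_index(name="numbers")

# Optional: start the row labels at 1, as in the format shown in the question.
result.index = result.index + 1

print(result)
```

This should give the same five bins and counts as the table in the question (4, 5, 16, 19, 6), although the exact spacing of the printed intervals depends on the pandas version.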