python pandas: binning a numeric range

Time: 2016-07-10 22:22:44

Tags: python pandas numeric binning

I have a requirement to bin a numeric value:

If the student's marks are
between 0 and 50 (incl. 50), then assign the level column value = "L"
between 50 and 75 (incl. 75), then assign the level column value = "M"
> 75, then assign the level column value = "H"

This is what I have so far:

raw_data = {'student':['A','B','C'],'marks_maths':[75,90,99]}
df = pd.DataFrame(raw_data, columns = ['student','marks_maths'])
bins = [0,50,75,>75]
groups = ['L','M','H']
df['maths_level'] = pd.cut(df['marks_maths'], bins, labels=groups)

I get a syntax error:

File "<ipython-input-25-f0b9dd609c63>", line 3
    bins = [0,50,75,>75]
                    ^
SyntaxError: invalid syntax

How do I specify a cutoff that means "greater than some value"?

3 answers:

Answer 0 (score: 2)

Try this:

bins = [0, 50, 75, 101]

or (this form needs import numpy as np):

bins = [0, 50, 75, np.inf]
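The two suggestions behave differently for values above the last finite edge. Here is a minimal sketch of that difference, assuming made-up marks (including a hypothetical out-of-range value of 120) that are not part of the original question:

```python
import numpy as np
import pandas as pd

# Illustrative marks only; 120 is a hypothetical out-of-range value.
marks = pd.Series([50, 75, 76, 120])
groups = ['L', 'M', 'H']

# Finite upper edge: anything above 101 falls outside every bin and becomes NaN.
fixed_top = pd.cut(marks, bins=[0, 50, 75, 101], labels=groups)

# Open-ended upper edge: the last bin is (75, inf], so every value > 75 gets "H".
open_top = pd.cut(marks, bins=[0, 50, 75, np.inf], labels=groups)

print(pd.DataFrame({'marks': marks, 'fixed_top': fixed_top, 'open_top': open_top}))
```

If marks can never exceed 100, both forms give the same labels. np.inf is the safer choice when the upper bound is not guaranteed.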

Answer 1 (score: 1)

Hope this helps:

import numpy as np
import pandas as pd

# 20 random integers from 0 to 99 (the upper bound of randint is exclusive)
scores = np.random.randint(0, 100, 20)
df = pd.DataFrame(scores, columns=['scores'])

# np.inf leaves the last bin open-ended: (75, inf]
bins = [0, 50, 75, np.inf]

# Same bins twice: once showing the intervals, once with the L/M/H labels
df['binned_scores'] = pd.cut(df.scores, bins=bins, include_lowest=False, right=True)
df['bin_labels'] = pd.cut(df.scores, bins=bins, include_lowest=False, right=True, labels=['L', 'M', 'H'])

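The include_lowest and right parameters let you control whether the bin edges are inclusive. Note that with include_lowest=False and right=True, a score of exactly 0 falls outside the first bin and becomes NaN; pass include_lowest=True if 0 should count as "L".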
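To see the edge handling concretely, here is a small sketch of how include_lowest and right affect scores that sit exactly on a bin edge. The sample scores and the variable names are illustrative assumptions, not taken from the answer above:

```python
import pandas as pd

# Illustrative scores chosen to sit exactly on the bin edges.
scores = pd.Series([0, 50, 75, 100])
bins = [0, 50, 75, 100]
labels = ['L', 'M', 'H']

# Defaults (right=True, include_lowest=False): bins are (0, 50], (50, 75], (75, 100].
# The score 0 is outside the first bin and becomes NaN.
default = pd.cut(scores, bins=bins, labels=labels)

# include_lowest=True closes the first bin on the left, so 0 is labelled "L".
with_lowest = pd.cut(scores, bins=bins, labels=labels, include_lowest=True)

# right=False flips the closed side: [0, 50), [50, 75), [75, 100).
# Now 50 is "M", 75 is "H", and 100 falls outside the last bin (NaN).
left_closed = pd.cut(scores, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'score': scores, 'default': default,
                    'with_lowest': with_lowest, 'left_closed': left_closed}))
```

For the rule in the question (50 is "L", 75 is "M"), the default right=True is the matching choice, with include_lowest=True added if a mark of 0 is possible.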

Answer 2 (score: 0)

Just define the upper limit as the highest possible mark:

bins = [0, 50, 75, 100]

The result is what you wanted:

  student  marks_maths maths_level
0       A           75           M
1       B           90           H
2       C           99           H