pandas cut() creates fewer bins than expected

Date: 2020-03-21 07:49:27

Tags: pandas

I am trying to create 3 bins with pd.cut() using the following code:

cut_bins= [0,139.99,199.99,250]
cut_labels = ['Set1', 'Set2', 'Set3']
pima1['G_set'] = pd.cut(pima1['G'],bins=cut_bins,labels=cut_labels)
print(pima1['G_set'].unique())

But the output only gives me 2 bins:

[Set1, Set2, NaN]
Categories (2, object): [Set1 < Set2]

Is there something wrong with the code?

Thanks!

1 answer:

Answer 0 (score: 0)

I think it depends on the data. NaN marks values that fall outside the bins:

import pandas as pd

cut_bins = [0, 139.99, 199.99, 250]
cut_labels = ['Set1', 'Set2', 'Set3']

# 300 is above the last bin edge (250), so it is assigned NaN
pima1 = pd.DataFrame({'G': [150, 200, 300]})

pima1['G_set'] = pd.cut(pima1['G'], bins=cut_bins, labels=cut_labels)
print(pima1['G_set'])
0    Set2
1    Set3
2     NaN
Name: G_set, dtype: category
Categories (3, object): [Set1 < Set2 < Set3]

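Or, since the bins are closed on the right by default, a value equal to the lowest edge (0) is excluded as well: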

# 0 equals the lowest bin edge, which the first interval (0, 139.99] does not include
pima1 = pd.DataFrame({'G': [0, 150, 200]})

pima1['G_set'] = pd.cut(pima1['G'], bins=cut_bins, labels=cut_labels)
print(pima1['G_set'])
0     NaN
1    Set2
2    Set3
Name: G_set, dtype: category
Categories (3, object): [Set1 < Set2 < Set3]
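
To see why 0 and 300 come back as NaN, it can help to look at the intervals pd.cut() builds when no labels are passed. The following is a minimal sketch; the `values` series and the `default_bins` name are made up for illustration and are not part of the question's data.

```python
import pandas as pd

cut_bins = [0, 139.99, 199.99, 250]

# Illustrative values only: one on the lowest edge, two inside, one above the top edge.
values = pd.Series([0, 150, 200, 300])

# Without labels, pd.cut returns the intervals themselves as categories.
default_bins = pd.cut(values, bins=cut_bins)
print(default_bins.cat.categories)

# With the default right=True, each interval is open on the left and closed on the right,
# so 0 (the lowest edge) and 300 (above 250) match no interval.
print(default_bins.isna().tolist())  # [True, False, False, True]
```

Printing the categories shows three right-closed intervals, which is why the lowest edge is left out unless it is requested explicitly.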

If 0 needs to fall into a bin, add the include_lowest=True parameter:

pima1['G_set'] = pd.cut(pima1['G'], bins=cut_bins, labels=cut_labels, include_lowest=True)
print(pima1['G_set'])
0    Set1
1    Set2
2    Set3
Name: G_set, dtype: category
Categories (3, object): [Set1 < Set2 < Set3]
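
To check which situation applies to your own column, one option is to count the rows per bin and list the values that ended up as NaN. This is only a sketch: the numbers in `pima1['G']` below are hypothetical stand-ins for the real data, and `counts`, `n_missing` and `outside` are names chosen for illustration.

```python
import pandas as pd

cut_bins = [0, 139.99, 199.99, 250]
cut_labels = ['Set1', 'Set2', 'Set3']

# Hypothetical data: no value lands in (199.99, 250], and 0 and 300 lie outside the bins.
pima1 = pd.DataFrame({'G': [0, 85, 120, 150, 300]})
pima1['G_set'] = pd.cut(pima1['G'], bins=cut_bins, labels=cut_labels)

# value_counts() on a categorical column lists every category, including empty ones.
counts = pima1['G_set'].value_counts()
print(counts)

# Number of rows that did not match any bin.
n_missing = int(pima1['G_set'].isna().sum())
print(n_missing)  # 2

# The raw values behind those NaN entries.
outside = pima1.loc[pima1['G_set'].isna(), 'G']
print(outside.tolist())  # [0, 300]
```

If a bin shows a count of 0 here, the bin exists but no value in the column falls into it, and the listed outside values show whether the bin edges need widening or include_lowest=True.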