I am trying to use pd.cut() to create 3 bins with the following code:
cut_bins= [0,139.99,199.99,250]
cut_labels = ['Set1', 'Set2', 'Set3']
pima1['G_set'] = pd.cut(pima1['G'],bins=cut_bins,labels=cut_labels)
print(pima1['G_set'].unique())
But the output only gives me 2 bins:
[Set1, Set2, NaN]
Categories (2, object): [Set1 < Set2]
Is there something wrong with the code?
Thanks!
Answer 0 (score: 0)
I think it depends on the data. NaN marks values that fall outside the bins:
import pandas as pd
cut_bins = [0, 139.99, 199.99, 250]
cut_labels = ['Set1', 'Set2', 'Set3']
pima1 = pd.DataFrame({'G':[150, 200, 300]})
pima1['G_set'] = pd.cut(pima1['G'],bins=cut_bins,labels=cut_labels)
print(pima1['G_set'])
0 Set2
1 Set3
2 NaN
Name: G_set, dtype: category
Categories (3, object): [Set1 < Set2 < Set3]
Or:
pima1 = pd.DataFrame({'G':[0, 150, 200]})
pima1['G_set'] = pd.cut(pima1['G'],bins=cut_bins,labels=cut_labels)
print(pima1['G_set'])
0 NaN
1 Set2
2 Set3
Name: G_set, dtype: category
Categories (3, object): [Set1 < Set2 < Set3]
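The 0 in the example above comes back as NaN because pd.cut closes each bin on the right by default, so the first bin is (0, 139.99] and does not contain its own left edge. A minimal sketch that makes this visible by leaving out the labels (the names `values`, `intervals`, and `first_bin` are illustrative, not from the question):

```python
import pandas as pd

cut_bins = [0, 139.99, 199.99, 250]
values = pd.Series([0, 150, 200])  # same values as the example above

# Without labels, pd.cut reports the interval each value falls into
intervals = pd.cut(values, bins=cut_bins)
print(intervals)

# Inspect the first bin: which side is closed, and which edges it contains
first_bin = intervals.cat.categories[0]
print(first_bin.closed)      # side of the interval that is closed
print(0 in first_bin)        # is the left edge inside the bin?
print(139.99 in first_bin)   # is the right edge inside the bin?
```

With the default settings the left edge 0 should be reported as outside the first bin, while the right edge 139.99 is inside it.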
If you need 0 to fall inside a bin, add the include_lowest=True parameter:
pima1['G_set'] = pd.cut(pima1['G'],bins=cut_bins,labels=cut_labels, include_lowest=True)
print(pima1['G_set'])
0 Set1
1 Set2
2 Set3
Name: G_set, dtype: category
Categories (3, object): [Set1 < Set2 < Set3]
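To check which of these cases applies to your own data, it may help to count the rows per bin rather than rely on unique(), which in the question's output appears to list only the categories that actually occur. A rough sketch, assuming hypothetical sample values in place of the real pima1['G'] column:

```python
import pandas as pd

cut_bins = [0, 139.99, 199.99, 250]
cut_labels = ['Set1', 'Set2', 'Set3']

# Hypothetical sample values standing in for pima1['G']
g = pd.Series([0, 85, 150, 199.99, 300], name='G')
g_set = pd.cut(g, bins=cut_bins, labels=cut_labels)

# All categories that were defined, whether or not any row landed in them
print(list(g_set.cat.categories))

# Rows per bin; dropna=False also counts rows that fell outside every bin
counts = g_set.value_counts(dropna=False)
print(counts)

# The original values that ended up as NaN
outside = g[g_set.isna()]
print(outside.tolist())
```

In this sample no value lands in Set3, so that bin has a count of zero even though it is still a defined category, and the two values outside the bin range (0 and 300) show up as NaN.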