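I have a set of metrics for measuring fraud, such as calls, transfer rate, assist time, and so on.
I have grouped them into bins based on quantiles, and now I need to assign a rating from 1 to 5 according to the bin. For example: calls below 150 get a rating of 1, calls between 150 and 300 get a rating of 2, and so on. The same applies to all the other metrics.
I tried the following: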
Select t1.w_city, t1.all, t2.service_code from
(Select w_city
,count(w_city) as all
,row_number() over (order by count(w_city) desc) as rn
from customer
group by w_city) t1
left join
(select service_code
,row_number() over (order by count(service_code) desc) as rn
from customer
group by service_code) t2 on t2.rn = t1.rn
Error:
np.where(x.Calls<=125.8,1, np.where(x.Calls>=153.2 & x.Calls<=190.0,2,np.where(x.Calls>=190.0 & x.Calls<=235.0,3,np.where(x.Calls>=235.0 & x.Calls<=304.4,4,np.where(x.Calls>=304.4,5,0))))
File "<ipython-input-32-41fe2292e308>", line 2
np.where(x.Calls>=153.2 & x.Calls<=190.0,2,np.where(x.Calls>=190.0 &
x.Calls<=235.0,3,np.where(x.Calls>=235.0 &
I want the code to take the value ranges from the quantiles I obtained and assign a rating based on them.
Answer 0 (score: 0)
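Your specific error indicates that you left a parenthesis unclosed: the expression opens five np.where calls but closes only four.
But you ran into this error because nested np.where calls are really hard to get right (and therefore hard to debug and maintain). So it is worth considering another approach.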
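For reference, closing the parenthesis alone would probably not be enough: in Python, & binds more tightly than <= and >=, so each comparison has to be wrapped in its own parentheses. The following is only a sketch of what a working nested version might look like; the sample values and the exact bin edges (taken from the thresholds in the question) are assumptions.

```python
import numpy as np

# Hypothetical sample data standing in for x.Calls.
calls = np.array([100.0, 160.0, 200.0, 250.0, 400.0])

# Every comparison is parenthesized because & has higher precedence
# than the comparison operators.
rating = np.where(calls <= 153.2, 1,
         np.where((calls > 153.2) & (calls <= 190.0), 2,
         np.where((calls > 190.0) & (calls <= 235.0), 3,
         np.where((calls > 235.0) & (calls <= 304.4), 4, 5))))

print(rating)
```

Even when it runs, every extra bin adds another level of nesting, which is why the alternative is easier to maintain.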
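It is not entirely clear to me which rule you want to apply, but I think np.digitize may help you make progress. It "quantizes" your data: you give it an array-like of bin edges, and it returns, for each element of the input array, the index of the bin that element falls into. It works like this: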
>>> import numpy as np
>>> a = np.array([55, 99, 65, 121, 189, 205, 211, 304, 999])
>>> bins = [100, 200, 300]
>>> np.digitize(a, bins=bins)
array([0, 0, 0, 1, 1, 2, 2, 3, 3])
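To get ratings from 1 to 5, one option is to pass four interior edges and add 1 to the result. The sketch below assumes the edges are the 20th, 40th, 60th and 80th percentiles of the data; the helper name rate_by_bins and the sample values are made up for illustration.

```python
import numpy as np

def rate_by_bins(values, edges):
    """Return a rating of 1..len(edges)+1 for each value (hypothetical helper)."""
    # np.digitize returns 0 for values below the first edge, so shift by 1.
    return np.digitize(values, bins=edges) + 1

# Hypothetical sample data standing in for one metric column.
calls = np.array([55, 99, 65, 121, 189, 205, 211, 304, 999, 150])

# Four interior edges split the data into five bins.
edges = np.quantile(calls, [0.2, 0.4, 0.6, 0.8])
ratings = rate_by_bins(calls, edges)

# The same helper also accepts hand-picked edges, such as those in the question.
fixed = rate_by_bins(np.array([100.0, 160.0, 200.0, 250.0, 400.0]),
                     [153.2, 190.0, 235.0, 304.4])
print(ratings, fixed)
```

Because the edges are just an argument, the same call can be repeated for each metric with its own quantiles.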