Question

I have several metrics for measuring fraud, for example calls, transfer rate, assist time, and so on.

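I have grouped them into bins based on quartiles, and now I need to assign a rating from 1 to 5 based on the bin. For example, fewer than 150 calls gets a rating of 1, calls >= 150 and <= 300 get a rating of 2, and so on. The same applies to every metric.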

I tried the following:

np.where(x.Calls<=125.8,1, np.where(x.Calls>=153.2 & x.Calls<=190.0,2,np.where(x.Calls>=190.0 & x.Calls<=235.0,3,np.where(x.Calls>=235.0 & x.Calls<=304.4,4,np.where(x.Calls>=304.4,5,0))))

Error:

  File "<ipython-input-32-41fe2292e308>", line 2
    np.where(x.Calls>=153.2 & x.Calls<=190.0,2,np.where(x.Calls>=190.0 & x.Calls<=235.0,3,np.where(x.Calls>=235.0 &

I want the code to take the value ranges from the quartiles I obtained and assign ratings based on them.

Answer 1

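Your specific error indicates that you left a parenthesis unclosed: the expression opens five np.where( calls but closes only four.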

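But you are running into this error because nested np.where calls are genuinely hard to get right (and therefore hard to debug and maintain). So it is worth considering other approaches.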
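For reference, here is a minimal sketch of how the nested version could be made to run. It assumes x is a pandas DataFrame with a numeric Calls column; the sample values are made up, since the question does not show the real data. Besides the missing closing parenthesis, each comparison needs its own parentheses, because & binds more tightly than <= and >= in Python. The thresholds are kept exactly as in the question, so values between 125.8 and 153.2 match no condition and fall through to 0.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question does not show the real DataFrame.
x = pd.DataFrame({"Calls": [100.0, 140.0, 160.0, 200.0, 250.0, 400.0]})

# Every comparison is parenthesised (& has higher precedence than <= / >=),
# and every np.where( has a matching closing parenthesis.
score = np.where(
    x.Calls <= 125.8, 1,
    np.where(
        (x.Calls >= 153.2) & (x.Calls <= 190.0), 2,
        np.where(
            (x.Calls >= 190.0) & (x.Calls <= 235.0), 3,
            np.where(
                (x.Calls >= 235.0) & (x.Calls <= 304.4), 4,
                np.where(x.Calls >= 304.4, 5, 0),
            ),
        ),
    ),
)

# 140.0 lies in the gap between 125.8 and 153.2, so it is scored 0.
print(score.tolist())
```

Even when it runs, the nesting makes gaps and overlapping boundaries like these easy to miss.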

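I'm not entirely clear on the rules you want to apply, but I think np.digitize might help you make progress. It "quantizes" your data: you give it an array-like of bin edges, and it returns, for each element of the input array, the index of the bin that element falls into. It works like this: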

>>> import numpy as np
>>> a = np.array([55, 99, 65, 121, 189, 205, 211, 304, 999])
>>> bins = [100, 200, 300]
>>> np.digitize(a, bins=bins)
array([0, 0, 0, 1, 1, 2, 2, 3, 3])
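Building on that, the sketch below shows one way the bin indices could be turned into ratings from 1 to 5. It is only an illustration under assumptions: the helper name rate_by_quantile is hypothetical, the bin edges are taken from np.quantile at 20%, 40%, 60% and 80% (five equal-frequency groups, which may not match the cut points you actually want), and the input is made-up data.

```python
import numpy as np

def rate_by_quantile(values, n_levels=5):
    """Hypothetical helper: rate each value from 1 to n_levels by quantile bin."""
    values = np.asarray(values, dtype=float)
    # Interior cut points only, e.g. 0.2, 0.4, 0.6, 0.8 for five levels.
    probs = np.linspace(0, 1, n_levels + 1)[1:-1]
    edges = np.quantile(values, probs)
    # np.digitize returns bin indices 0..n_levels-1; shift them to 1..n_levels.
    return np.digitize(values, bins=edges) + 1

# Made-up example data: the integers 1 to 100.
calls = np.arange(1, 101)
scores = rate_by_quantile(calls)

# Number of values that received each rating from 1 to 5.
print(np.bincount(scores)[1:])
```

The same helper could be applied to each metric column in turn. If a metric should be rated in the opposite direction, the result can be flipped with n_levels + 1 - scores.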
