Calculating percentile ranks by group using NumPy

Time: 2018-10-01 20:33:00

Tags: numpy arcpy percentile

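I'm new to Python and I want to calculate percentile ranks by group. My groups are wildlife management units (Wmu, a string field), and the ranks are based on the value of predicted moose density (PMDEN3, a float field). The rank values go into the field RankMD.

My approach was to use a for loop to calculate 3 ranks within each WMU, but the result is 3 ranks created across the entire dbf file (about 23,000 records), without regard to WMU. Any help is greatly appreciated.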

import arcpy
import numpy as np

input = r'K:\Moose\KrigStratPython\TestRank3.dbf' 
arr = arcpy.da.TableToNumPyArray(input, ('PMDEN3', 'Wmu'))
c_arr = [float(x[0]) for x in np.ndarray.flatten(arr)]

for Wmu in arr:
##to create 3 rank for example
    p1 = np.percentile(c_arr, 33)  # rank = 0
    p2 = np.percentile(c_arr, 67)  # rank = 1
    p3 = np.percentile(c_arr, 100)  # rank = 2

#use cursor to update the new rank field
    with arcpy.da.UpdateCursor(input , ['PMDEN3','RankMD']) as cursor:
        for row in cursor:
            if row[0] < p1:
                row[1] = 0  #rank 0
            elif p1 <= row[0] and row[0] < p2:
                 row[1] = 1
            else:
                 row[1] = 2

            cursor.updateRow(row)

2 Answers:

Answer 0: (score: 0)

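Your for loop is correct; however, your UpdateCursor iterates over all rows in the table. To get the result you want, you need to select a subset of the table and then use the update cursor on that subset. You can do this by passing a query to the where_clause parameter of the UpdateCursor function.

So you would have a query like this: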

current_wmu = Wmu['Wmu']  # value of the WMU the for loop is currently on; I think this lookup is right, but I'm not positive
where_clause = "Wmu = '{}'".format(current_wmu)  # format the above variable into a query string

Then your UpdateCursor would now be:

with arcpy.da.UpdateCursor(input , ['PMDEN3','RankMD'], where_clause) as cursor:
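
Filtering the cursor alone may not be enough: the cut-offs also have to be computed from the same subset, otherwise every WMU is still compared against table-wide thresholds. Below is a minimal NumPy-only sketch of that idea. The sample values are made up, the structured array merely stands in for the output of TableToNumPyArray, and `group_thresholds` is a hypothetical helper name, not part of arcpy or NumPy.

```python
import numpy as np

# Made-up stand-in for arcpy.da.TableToNumPyArray(input, ('PMDEN3', 'Wmu'))
arr = np.array(
    [(0.1, '10'), (0.2, '10'), (0.3, '10'),
     (1.0, '11A'), (2.0, '11A'), (3.0, '11A')],
    dtype=[('PMDEN3', 'f8'), ('Wmu', 'U5')],
)

def group_thresholds(table, wmu, percentiles=(33, 67)):
    """Return the PMDEN3 percentile cut-offs for a single WMU (hypothetical helper)."""
    subset = table['PMDEN3'][table['Wmu'] == wmu]  # rows of this WMU only
    return np.percentile(subset, percentiles)

# Table-wide cut-offs (what the original loop computed on every pass)
overall = np.percentile(arr['PMDEN3'], (33, 67))

# One pair of cut-offs per WMU (what the ranking actually needs)
per_wmu = {str(wmu): group_thresholds(arr, wmu) for wmu in np.unique(arr['Wmu'])}
```

With these sample values the cut-offs for WMU 10 and WMU 11A differ from each other and from the table-wide ones, which is why the thresholds and the where_clause should refer to the same group.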

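Answer 1: (score: 0)

Following BigGerman's suggestion, I modified the code and it now works. The script loops through each WMU value and calculates the percentile rank within each group based on PMDEN. To improve the script, I should build the array of WMU values from the input file rather than creating it by hand.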

import arcpy
import numpy as np

#fields to be calculated
fldPMDEN = "PMDEN"
fldRankWMU = "RankWMU"

input = r'K:\Moose\KrigStratPython\TestRank3.dbf' 
arcpy.MakeFeatureLayer_management(input, "stratLayerShpNoNullsLyr")
WMUs = ["10", "11A", "11B", "11Q", "12A"]
for current_wmu in WMUs:
    # create 3 ranks within the current WMU
    where_clause = "Wmu = '{}'".format(current_wmu)  # query selecting only the current WMU
    with arcpy.da.UpdateCursor("stratLayerShpNoNullsLyr", [fldPMDEN, fldRankWMU], where_clause) as cursor:
        arr1 = arcpy.da.TableToNumPyArray("stratLayerShpNoNullsLyr", [fldPMDEN, fldRankWMU], where_clause)
        c_arrS = [float(x[0]) for x in np.ndarray.flatten(arr1)]
        p1 = np.percentile(c_arrS, 33)   # below p1: rank 3 (lowest density)
        p2 = np.percentile(c_arrS, 67)   # from p1 up to p2: rank 2
        p3 = np.percentile(c_arrS, 100)  # p2 and above: rank 1 (highest density); p3 itself is not used below
        for row in cursor:
            if row[0] < p1:
                row[1] = 3  # rank 3
            elif p1 <= row[0] and row[0] < p2:
                row[1] = 2
            else:
                row[1] = 1
            cursor.updateRow(row)
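
The improvement mentioned in this answer (deriving the WMU values from the data instead of hard-coding them) can be sketched with NumPy alone. This is only a sketch under assumptions: the sample array is made up, `rank_within_groups` is a hypothetical helper, and writing the ranks back to the table (for example with an UpdateCursor) is left out.

```python
import numpy as np

def rank_within_groups(values, groups, percentiles=(33, 67)):
    """Assign ranks 3 (lowest) to 1 (highest) within each group.

    Mirrors the if/elif/else logic of the script:
    value < p33 -> 3, p33 <= value < p67 -> 2, otherwise 1.
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    ranks = np.empty(values.shape, dtype=int)
    for group in np.unique(groups):  # distinct WMU values taken from the data
        mask = groups == group
        cuts = np.percentile(values[mask], percentiles)
        # side='right' counts how many cut-offs are <= each value
        bins = np.searchsorted(cuts, values[mask], side='right')
        ranks[mask] = len(percentiles) + 1 - bins
    return ranks

# Made-up stand-in for the table read with TableToNumPyArray
arr = np.array(
    [(0.1, '10'), (0.2, '10'), (0.3, '10'),
     (3.0, '11A'), (1.0, '11A'), (2.0, '11A')],
    dtype=[('PMDEN', 'f8'), ('Wmu', 'U5')],
)
ranks = rank_within_groups(arr['PMDEN'], arr['Wmu'])
```

Here `np.unique` replaces the hand-written `WMUs` list, and each group is ranked against its own cut-offs. For the sample rows the result is 3, 2, 1 for WMU 10 and 1, 3, 2 for WMU 11A, in row order.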