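I'm new to Python and I want to calculate percentile ranks by group. My groups are wildlife management units (WMU, a string field), and the ranking is based on the value of predicted moose density (PMDEN3, a float field). The rank values go into the field RankMD.
My approach was to use a for loop to compute 3 ranks within each WMU, but the result was 3 ranks computed across the entire dbf file (about 23,000 records), regardless of WMU. Any help is much appreciated.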
import arcpy
import numpy as np
input = r'K:\Moose\KrigStratPython\TestRank3.dbf'
arr = arcpy.da.TableToNumPyArray(input, ('PMDEN3', 'Wmu'))
c_arr = [float(x[0]) for x in np.ndarray.flatten(arr)]
for Wmu in arr:
    ## to create 3 ranks, for example
    p1 = np.percentile(c_arr, 33)  # rank = 0
    p2 = np.percentile(c_arr, 67)  # rank = 1
    p3 = np.percentile(c_arr, 100)  # rank = 2
    # use cursor to update the new rank field
    with arcpy.da.UpdateCursor(input, ['PMDEN3', 'RankMD']) as cursor:
        for row in cursor:
            if row[0] < p1:
                row[1] = 0  # rank 0
            elif p1 <= row[0] and row[0] < p2:
                row[1] = 1
            else:
                row[1] = 2
            cursor.updateRow(row)
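Answer 0 (score: 0)
Your for loop is fine; the problem is that your UpdateCursor iterates over every row in the table. To get the result you want, you need to select a subset of the table and then run the update cursor on that subset. You can do this by passing a query to the where_clause parameter of the UpdateCursor function.
So you would have a query like this: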
current_wmu = Wmu['Wmu']  # value of the WMU the for loop is currently on (I think this is the right lookup, but I'm not positive)
where_clause = "WMU = '{}'".format(current_wmu)  # format the above variable into a query string
Your UpdateCursor would then be:
with arcpy.da.UpdateCursor(input , ['PMDEN3','RankMD'], where_clause) as cursor:
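As a rough sketch of the query-building step only: the helper below, `build_where_clause`, is a hypothetical name and not part of arcpy. It assumes the group field holds text, so the value is wrapped in single quotes, and it doubles any embedded single quote so the string literal stays valid.

```python
def build_where_clause(field_name, value):
    """Return a SQL where clause matching rows whose text field equals value."""
    # Double embedded single quotes so the value cannot end the literal early.
    escaped = str(value).replace("'", "''")
    return "{} = '{}'".format(field_name, escaped)


# Example: one clause per management unit (values taken from this thread).
for wmu in ["10", "11A", "11B"]:
    print(build_where_clause("Wmu", wmu))
```

Each resulting string could be passed as the where_clause argument shown above. Depending on the data source, the field name may also need delimiters, which `arcpy.AddFieldDelimiters` can supply.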
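Answer 1 (score: 0)
Following BigGerman's suggestion, I modified the code and it now works. The script loops over each WMU value and computes the percentile ranks within each group based on PMDEN. To improve the script, I should build the array of WMU values from the input file rather than creating it by hand.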
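The improvement mentioned above might look something like the following sketch. It assumes the table has already been read into a NumPy structured array, as `arcpy.da.TableToNumPyArray` returns; the `sample` array here is made-up stand-in data so the snippet runs without arcpy, and `unique_group_values` is a hypothetical helper name.

```python
import numpy as np


def unique_group_values(table_array, field_name):
    """Return the sorted, distinct values of one field as plain Python strings."""
    # Strip padding first so values such as '10' and '10 ' count as one group.
    return sorted({str(v).strip() for v in table_array[field_name]})


# Made-up stand-in for the array TableToNumPyArray would return.
sample = np.array(
    [(0.12, "10"), (0.40, "11A"), (0.33, "10"), (0.05, "11B"), (0.27, "11A")],
    dtype=[("PMDEN", "f8"), ("Wmu", "U10")],
)

WMUs = unique_group_values(sample, "Wmu")
print(WMUs)
```

The script as it was actually run, with the hand-written list, follows.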
import arcpy
import numpy as np
# fields to be calculated
fldPMDEN = "PMDEN"
fldRankWMU = "RankWMU"
input = r'K:\Moose\KrigStratPython\TestRank3.dbf'
arcpy.MakeFeatureLayer_management(input, "stratLayerShpNoNullsLyr")
WMUs = ["10", "11A", "11B", "11Q", "12A"]
for current_wmu in WMUs:
    ## to create 3 ranks, for example
    where_clause = "Wmu = '{}'".format(current_wmu)  # format the above variable into a query
    with arcpy.da.UpdateCursor("stratLayerShpNoNullsLyr", [fldPMDEN, fldRankWMU], where_clause) as cursor:
        # read only this WMU's rows so the percentiles are per group
        arr1 = arcpy.da.TableToNumPyArray("stratLayerShpNoNullsLyr", [fldPMDEN, fldRankWMU], where_clause)
        c_arrS = [float(x[0]) for x in np.ndarray.flatten(arr1)]
        p1 = np.percentile(c_arrS, 33)  # below p1: rank = 3 (lowest density)
        p2 = np.percentile(c_arrS, 67)  # from p1 up to p2: rank = 2
        p3 = np.percentile(c_arrS, 100)  # group maximum (not used below); p2 and above: rank = 1 (highest density)
        for row in cursor:
            if row[0] < p1:
                row[1] = 3  # rank 3
            elif p1 <= row[0] and row[0] < p2:
                row[1] = 2
            else:
                row[1] = 1
            cursor.updateRow(row)
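For comparison, here is a minimal sketch of the same per-group ranking done entirely in memory, without arcpy. It is not the script above, only an illustration of the logic: `tercile_ranks_by_group` is a hypothetical name, the sample data are made up, and the cut points (33rd and 67th percentiles) and rank numbering (1 = highest density) are copied from the script.

```python
from collections import defaultdict

import numpy as np


def tercile_ranks_by_group(groups, values):
    """Rank each value 1 (top third), 2, or 3 (bottom third) within its own group."""
    # Collect the values belonging to each group.
    by_group = defaultdict(list)
    for group, value in zip(groups, values):
        by_group[group].append(value)

    # Compute the two cut points separately for every group.
    cuts = {
        group: (np.percentile(vals, 33), np.percentile(vals, 67))
        for group, vals in by_group.items()
    }

    # Assign a rank to each record using its own group's cut points.
    ranks = []
    for group, value in zip(groups, values):
        p1, p2 = cuts[group]
        if value < p1:
            ranks.append(3)
        elif value < p2:
            ranks.append(2)
        else:
            ranks.append(1)
    return ranks


# Made-up data: group "B" has densities ten times those of group "A".
groups = ["A", "B", "A", "B", "A", "B", "A", "B"]
values = [1, 10, 2, 20, 3, 30, 4, 40]
ranks = tercile_ranks_by_group(groups, values)
print(ranks)
```

Because the cut points are computed per group, the low-density group "A" still receives all three ranks. With cut points taken from the whole table, as in the original question, every "A" record would land in the bottom ranks.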