我创建了一个python程序,它使用了函数" CostPath" ArcGIS自动构建shapefile" selected_patches.shp"中包含的多个多边形之间的最低成本路径(LCP)。我的python程序似乎工作但它太慢了。我必须建立275493 LCP。不幸的是,我不知道如何加速我的程序(我是Python编程语言和ArcGIS的初学者)。或者是否有另一种解决方案可以使用ArcGIS快速计算多个多边形之间的最低成本路径(我使用的是ArcGIS 10.1)?这是我的代码:
# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *
arcpy.CheckOutExtension("Spatial")
# Overwrite outputs
arcpy.env.overwriteOutput = True
# Set the workspace
arcpy.env.workspace = "C:\Users\LCP"
# Set the extent environment
arcpy.env.extent = "costs.tif"
rowsInPatches_start = arcpy.SearchCursor("selected_patches.shp")
for rowStart in rowsInPatches_start:
ID_patch_start = rowStart.getValue("GRIDCODE")
expressionForSelectInPatches_start = "GRIDCODE=%s" % (ID_patch_start) ## Define SQL expression for the fonction Select Layer By Attribute
# Process: Select Layer By Attribute in Patches_start
arcpy.MakeFeatureLayer_management("selected_patches.shp", "Selected_patch_start", expressionForSelectInPatches_start)
# Process: Cost Distance
outCostDist=CostDistance("Selected_patch_start", "costs.tif", "", "outCostLink.tif")
# Save the output
outCostDist.save("outCostDist.tif")
rowsInSelectedPatches_end = arcpy.SearchCursor("selected_patches.shp")
for rowEnd in rowsInSelectedPatches_end:
ID_patch_end = rowEnd.getValue("GRIDCODE")
expressionForSelectInPatches_end = "GRIDCODE=%s" % (ID_patch_end) ## Define SQL expression for the fonction Select Layer By Attribute
# Process: Select Layer By Attribute in Patches_end
arcpy.MakeFeatureLayer_management("selected_patches.shp", "Selected_patch_end", expressionForSelectInPatches_end)
# Process: Cost Path
outCostPath = CostPath("Selected_patch_end", "outCostDist.tif", "outCostLink.tif", "EACH_ZONE","FID")
# Save the output
outCostPath.save('P_' + str(int(ID_patch_start)) + '_' + str(int(ID_patch_end)) + ".tif")
# Writing in file .txt
outfile=open('P_' + str(int(ID_patch_start)) + '_' + str(int(ID_patch_end)) + ".txt", "w")
rowsTxt = arcpy.SearchCursor('P_' + str(int(ID_patch_start)) + '_' + str(int(ID_patch_end)) + ".tif")
for rowTxt in rowsTxt:
value = rowTxt.getValue("Value")
count = rowTxt.getValue("Count")
pathcost = rowTxt.getValue("PATHCOST")
startrow = rowTxt.getValue("STARTROW")
startcol = rowTxt.getValue("STARTCOL")
print value, count, pathcost, startrow, startcol
outfile.write(str(value) + " " + str(count) + " " + str(pathcost) + " " + str(startrow) + " " + str(startcol) + "\n")
outfile.close()
非常感谢你的帮助。
答案 0 :(得分:1)
写入光盘与计算成本所需的速度可能是瓶颈,考虑添加一个线程来处理所有写入。
此:
for rowTxt in rowsTxt:
value = rowTxt.getValue("Value")
count = rowTxt.getValue("Count")
pathcost = rowTxt.getValue("PATHCOST")
startrow = rowTxt.getValue("STARTROW")
startcol = rowTxt.getValue("STARTCOL")
print value, count, pathcost, startrow, startcol
outfile.write(str(value) + " " + str(count) + " " + str(pathcost) + " " + str(startrow) + " " + str(startcol) + "\n")
可以通过使rowsTxt成为全局变量并让您的线程从rowsTxt写入磁盘来转换为线程函数。 在完成所有处理之后,你可以有一个额外的全局布尔值,这样你的线程函数就可以在写完所有内容后结束,你可以关闭你的线程。
我目前使用的示例线程函数:
import threading
class ThreadExample:
def __init__(self):
self.receiveThread = None
def startRXThread(self):
self.receiveThread = threading.Thread(target = self.receive)
self.receiveThread.start()
def stopRXThread(self):
if self.receiveThread is not None:
self.receiveThread.__Thread__stop()
self.receiveThread.join()
self.receiveThread = None
def receive(self):
while true:
#do stuff for the life of the thread
#in my case, I listen on a socket for data
#and write it out
因此,对于您的情况,您可以将一个类变量添加到线程类
self.rowsTxt
然后更新你的接收以检查self.rowsTxt,如果它不为空,请像我在上面提到的代码片段中那样处理它。处理完毕后,将self.rowsTxt设置回None。您可以使用main函数更新线程self.rowsTxt,因为它获取rowsTxt。考虑为self.rowsTxt使用类似缓冲区的列表,这样你就不会错过任何写作。
答案 1 :(得分:1)
您可以通过切换到data access cursors(例如arcpy.da.SearchCursor()
)来实现最快速的改进。为了说明这一点,我在一段时间后进行了基准测试,看看数据访问游标的性能与旧游标相比。
附图显示了新da方法UpdateCursor与旧UpdateCursor方法的基准测试结果。基本上,基准测试执行以下工作流程:
import arcpy, os, numpy, time
arcpy.env.overwriteOutput = True
outws = r'C:\temp'
fc = os.path.join(outws, 'randomPoints.shp')
iterations = [10, 100, 1000, 10000, 100000]
old = []
new = []
meanOld = []
meanNew = []
for x in iterations:
arcpy.CreateRandomPoints_management(outws, 'randomPoints', '', '', x)
arcpy.AddField_management(fc, 'randFloat', 'FLOAT')
for y in range(5):
# Old method ArcGIS 10.0 and earlier
start = time.clock()
rows = arcpy.UpdateCursor(fc)
for row in rows:
# generate random float from normal distribution
s = float(numpy.random.normal(100, 10, 1))
row.randFloat = s
rows.updateRow(row)
del row, rows
end = time.clock()
total = end - start
old.append(total)
del start, end, total
# New method 10.1 and later
start = time.clock()
with arcpy.da.UpdateCursor(fc, ['randFloat']) as cursor:
for row in cursor:
# generate random float from normal distribution
s = float(numpy.random.normal(100, 10, 1))
row[0] = s
cursor.updateRow(row)
end = time.clock()
total = end - start
new.append(total)
del start, end, total
meanOld.append(round(numpy.mean(old),4))
meanNew.append(round(numpy.mean(new),4))
#######################
# plot the results
import matplotlib.pyplot as plt
plt.plot(iterations, meanNew, label = 'New (da)')
plt.plot(iterations, meanOld, label = 'Old')
plt.title('arcpy.da.UpdateCursor -vs- arcpy.UpdateCursor')
plt.xlabel('Random Points')
plt.ylabel('Time (minutes)')
plt.legend(loc = 2)
plt.show()