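First of all, I realize this has been covered before. Second, my programming experience is very limited. Nevertheless, since I was surprised to see such a performance increase, and because I also have reasonably good benchmarks, I thought I would share the results.
Basically, what I am doing is k-means clustering on a mesh in Rhinoceros, using a Python (IronPython) script. Here is a screenshot of the result. And below is the benchmark for all the functions in my script.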
CYCLE 4
GetProxy = 20.0
GetProxySeed = 10.0
BuildQueue = 0.0
AssignToRegion = 610.0009
SplitRegion = 10.0
RenumberSplitRegions = 30.0
FindAdjacentRegions = 10.0
FindRegionsToCombine = 130.0002
ReCombineRegions = 0.0
RenumberRegions = 0.0
ReGrowSeeds = 110.0002
CYCLE 4 TOOK = 940.0013
AssignToRegionTime Average over all cycles (20)= 613.8351
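It is clear that AssignToRegion is the bottleneck. This is the function that assigns every face in the mesh to a cluster. Below is the "pseudo-code" of the steps involved: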
It starts with a queue of structure [ [ error value, region index tag, face index] , ...]
equal in length to the number of regions. Then:
1. Pop item of least error.
2. Check if item has already been assigned to a region.
3. If not, append to the region of same tag.
4. Get adjacent faces minus faces already assigned to a region.
5. Update queue with new faces.
5.1. Get error of new faces. This basically compares the face's normal vector to the region's normal vector. The greater the
difference, the greater the error.
5.2. Add tag of last popped face.
5.3. Sort queue.
Repeat until queue is empty.
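I got a roughly 2x performance improvement just by changing each item in the queue from a list, [[error value, region index tag, face index], ...], to a tuple, [(error value, region index tag, face index), ...], although I don't really understand why this change has such a dramatic effect.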
CYCLE 4
GetProxy = 21.0012
GetProxySeed = 8.0004
BuildQueue = 1.0001
AssignToRegion = 245.014
SplitRegion = 5.0003
RenumberSplitRegions = 32.0019
FindAdjacentRegions = 6.0004
FindRegionsToCombine = 114.0065
ReCombineRegions = 0.0
RenumberRegions = 0.0
ReGrowSeeds = 99.0057
CYCLE 4 TOOK = 553.0317
AssignToRegionTime Average over all cycles (20)= 275.21574
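One way to check whether the container type alone accounts for the difference is to time the sort in isolation, outside Rhino. The sketch below is only that: the data is random, and the `make_items` helper and the sizes are made up for illustration. The timings it prints will depend on the interpreter (CPython and IronPython may behave quite differently), so it is a way to measure, not an explanation.

```python
import random
import timeit

def make_items(n, as_tuple, seed=0):
    """Build n hypothetical queue items (error, regionIndex, faceIndex)."""
    rng = random.Random(seed)
    items = []
    for face_index in range(n):
        item = [rng.random(), rng.randrange(10), face_index]
        items.append(tuple(item) if as_tuple else item)
    return items

# Same seed, so both queues hold the same values in different containers.
list_items = make_items(5000, as_tuple=False)
tuple_items = make_items(5000, as_tuple=True)

# Lists and tuples both compare element by element, so the order is identical.
assert [tuple(i) for i in sorted(list_items, reverse=True)] == sorted(tuple_items, reverse=True)

# Time only the sort, which is what UpdateQueue does on every iteration.
list_time = timeit.timeit(lambda: sorted(list_items, reverse=True), number=20)
tuple_time = timeit.timeit(lambda: sorted(tuple_items, reverse=True), number=20)
print("lists:  %.4f s" % list_time)
print("tuples: %.4f s" % tuple_time)
```

If the two timings come out close, the gain is more likely to come from building the items in MetricError than from sorting them.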
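For those who are curious, below is the code for the whole function. The change mentioned above is near the end.
Any other suggestions that might speed this up are very welcome!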
def AssignToRegion(faceNormals, vertices, faceVertexIndexes, areaFaces, adjacentFaces, regions, queue, assignedIndexes):
    ## queue = [ [error, regionIndex, index], ... ]
    ## OR queue = [ (error, regionIndex, index), ... ]

    ## Set of the face indexes already assigned to a region.
    assignedIndexes = set(assignedIndexes)

    ## While the priority queue is not empty, keep popping
    ## the item with the least error. The queue is sorted in
    ## descending order, so that item is the last one.
    while queue:
        mostPriority = queue.pop()
        faceIndex = mostPriority[2]

        ## If the index of the popped face has already
        ## been assigned, skip to the next one.
        if faceIndex not in assignedIndexes:
            regionIndex = mostPriority[1]  ## regionIndex is Int
            regions[regionIndex][1].append(faceIndex)
            assignedIndexes.add(faceIndex)

            ## Get the adjacent faces of the popped face
            ## and append them to the priority queue.
            newAdjacentFaces = set(adjacentFaces[faceIndex])

            ## If an adjacent face has already been assigned
            ## to a region, skip it.
            newAdjacentFaces -= assignedIndexes

            ## Append faces to priority queue.
            queue = UpdateQueue(regions[regionIndex], faceNormals, areaFaces, queue, newAdjacentFaces)

    return regions
def UpdateQueue(region, faceNormals, areaFaces, queue, newFaces):
    regionIndex = region[0]
    proxyNormal = region[4]
    newFacesErrors = MetricError(regionIndex, newFaces, faceNormals, areaFaces, proxyNormal)
    queue.extend(newFacesErrors)

    ## This appears faster than queue.sort
    queue = sorted(queue, reverse=True)

    ## This returns a list queue = [ [error, regionIndex, index], [...] ]
    return queue
def MetricError(regionIndex, faceIndexes, faceNormals, areaFaces, proxyNormal):
    errors = []
    for index in faceIndexes:
        area = areaFaces[index]
        normal = faceNormals[index]
        normalError = normal - proxyNormal

        ## This can be obtained as an attribute of the vector
        ## and it's much faster.
        # moduleNormalError = (normalError[0] ** 2 + normalError[1] ** 2 + normalError[2] ** 2)
        moduleNormalError = normalError.SquareLength
        error = moduleNormalError * area

        ## Use only one of the two appends below.
        ## Original (slower) version, each item is a list:
        # errors.append( [error, regionIndex, index] )
        ## OR faster version, each item is a tuple:
        errors.append( (error, regionIndex, index) )

    ## This returns a list errors = [ [error, regionIndex, index], ... ]
    ## OR errors = [ (error, regionIndex, index), ... ]
    return errors
Following @user2357112's suggestion to use heapq, I get:
AssignToRegionTime average of 20 cycles = 135.80836
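For reference, here is a minimal sketch of what a heapq-based version of the loop can look like. It is not the actual script: it runs on toy data without Rhino, and the names `assign_to_region`, `face_error`, `heights`, and `proxies` are made up for illustration. The idea is that re-sorting the whole queue after every pop costs O(n log n) each time, while a heap push or pop costs O(log n).

```python
import heapq

def assign_to_region(adjacent_faces, face_error, seeds):
    """Grow regions from seed faces, always taking the lowest-error face next.

    adjacent_faces: dict mapping a face index to its neighbouring face indexes
    face_error:     hypothetical callable (region_index, face_index) -> error
    seeds:          list of (region_index, seed_face_index) pairs
    """
    regions = {region_index: [] for region_index, _ in seeds}
    assigned = set()
    # Heap of (error, region_index, face_index); the smallest error sits at heap[0].
    heap = [(0.0, region_index, face) for region_index, face in seeds]
    heapq.heapify(heap)
    while heap:
        error, region_index, face = heapq.heappop(heap)
        if face in assigned:
            continue  # already claimed by a region through a cheaper entry
        regions[region_index].append(face)
        assigned.add(face)
        for neighbour in adjacent_faces[face]:
            if neighbour not in assigned:
                item = (face_error(region_index, neighbour), region_index, neighbour)
                heapq.heappush(heap, item)
    return regions

# Toy data: a strip of six faces, 0-1-2-3-4-5, with a step in the middle.
adjacent_faces = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
heights = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # stand-in for face normals
proxies = {0: 0.0, 1: 1.0}                 # stand-in for region proxy normals

def face_error(region_index, face_index):
    return abs(heights[face_index] - proxies[region_index])

regions = assign_to_region(adjacent_faces, face_error, seeds=[(0, 0), (1, 5)])
print(regions)
```

With this toy input, region 0 ends up with faces 0, 1, 2 and region 1 with faces 5, 4, 3. Items must be tuples (or otherwise comparable) so the heap can order them by error first.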