Question

I am voxelizing an STL model using a sparse voxel octree. I have a 3D array of about 100,000 triangles (each triangle has 3 points, and each point has x, y, z values).

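For each triangle I need to compute some points and values. I wrote the algorithm in standard CPython, where it runs in 3 seconds, which is too slow for me (about 0.5 s would be fine).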

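I switched to PyPy for its JIT compiler, and it performs very well. If I run the code without Set A of the points/values for each triangle (see code; the remaining work includes the bounding box and normal calculation), it outperforms CPython by a factor of 10 or more. If I run the code without Set B, it also outperforms CPython. Fine. But when I run both sets together, it is much slower than CPython (and much slower than Set A alone plus Set B alone).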
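The comparison described above could be timed with a small harness along these lines. This is only a sketch: `set_a`, `set_b`, and `time_variant` are hypothetical names that do not appear in the original code, `set_b` covers just the xy-plane edges for one orientation, and synthetic triangles stand in for the STL file.

```python
import time

def set_a(tri):
    # Set A: axis-aligned bounding box of one triangle
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    return (min(x1, x2, x3), max(x1, x2, x3),
            min(y1, y2, y3), max(y1, y2, y3),
            min(z1, z2, z3), max(z1, z2, z3))

def set_b(tri):
    # Stand-in for Set B: 2D edge normals in the xy plane only
    (x1, y1, _), (x2, y2, _), (x3, y3, _) = tri
    return ((y2 - y1, x1 - x2), (y3 - y2, x2 - x3), (y1 - y3, x3 - x1))

def time_variant(triangles, steps, repeats=5):
    # Best wall-clock time over several runs; repeated runs give a JIT
    # such as PyPy's a chance to warm up before the best run is taken.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for tri in triangles:
            for step in steps:
                step(tri)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    # Synthetic data (assumption) so the sketch runs without an STL file
    triangles = [[[float(i), i + 1.0, 0.0], [i + 1.0, float(i), 0.5],
                  [float(i), float(i), 1.0]] for i in range(10000)]
    variants = (("A only", [set_a]), ("B only", [set_b]),
                ("A and B", [set_a, set_b]))
    for label, steps in variants:
        print(label, time_variant(triangles, steps))
```

Running the same script under both interpreters gives directly comparable numbers for each variant.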

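Do you have any idea what the problem is? I thought it might be a memory issue. I allocated more heap memory for PyPy in the VM options, but that did not help. Deleting all variables after each loop iteration (using "del") did not help either.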

I am using Windows 10 Home 10.0.18363 on an x64-based machine.

I am using pypy3.6-v7.3.2rc1-win32 in PyCharm Community Edition 2020.2.2.

These are the VM options:

-Xmx2048m
-Xms750m 
-XX:ReservedCodeCacheSize=240m
-XX:+UseConcMarkSweepGC
-XX:SoftRefLRUPolicyMSPerMB=50
-ea
-XX:CICompilerCount=2
-Dsun.io.useCanonPrefixCache=false
-Djdk.http.auth.tunneling.disabledSchemes=""
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-Djdk.attach.allowAttachSelf=true
-Dkotlinx.coroutines.debug=off
-Djdk.module.illegalAccess.silent=true

Thank you very much for any advice.

import trimesh as tr
import sys
import datetime

def doT(a,b):
    res=a[0]*b[0]+a[1]*b[1]+a[2]*b[2]
    return res

def doT2(a,b):
    res=a[0]*b[0]+a[1]*b[1]
    return res

def minus (a,b):
    res=[a[0]-b[0],a[1]-b[1],a[2]-b[2]]
    return res

def plus (a,b):
    res = [a[0] + b[0], a[1] + b[1], a[2] + b[2]]
    return res

def crosS(a,b):
    # cross product a x b of two 3D vectors
    res=[a[1]*b[2]-a[2]*b[1],a[2]*b[0]-a[0]*b[2],a[0]*b[1]-a[1]*b[0]]
    return res

def miN(a,b):
    if a<b:
        return a
    else:
        return b

def maX(a,b):
    if a>b:
        return a
    else:
        return b

trim=tr.load("Stanford_Bunny.stl")
trim.rezero()
triangle_list_tr=trim.triangles
triangle_list=triangle_list_tr.tolist()

counter = 0  # number of triangles processed
for triangle in triangle_list:
    # vertices of surface triangle
    P1 = triangle[0]
    P2 = triangle[1]
    P3 = triangle[2]

    n = crosS(minus(P1, P2), minus(P3, P2))
    n_sum = n[0] + n[1] + n[2]
    n[0] = n[0] / abs(n_sum)
    n[1] = n[1] / abs(n_sum)
    n[2] = n[2] / abs(n_sum)

    P1x = triangle[0][0]
    P1y = triangle[0][1]
    P1z = triangle[0][2]
    P2x = triangle[1][0]
    P2y = triangle[1][1]
    P2z = triangle[1][2]
    P3x = triangle[2][0]
    P3y = triangle[2][1]
    P3z = triangle[2][2]
    counter+=1
    
    #Set A start
    bbxmin = min(P1x, P2x, P3x)
    bbxmax = max(P1x, P2x, P3x)
    bbymin = min(P1y, P2y, P3y)
    bbymax = max(P1y, P2y, P3y)
    bbzmin = min(P1z, P2z, P3z)
    bbzmax = max(P1z, P2z, P3z)
    #Set A End
    
    #Set B Start
    P1_xy = [P1[0], P1[1]]
    P2_xy = [P2[0], P2[1]]
    P3_xy = [P3[0], P3[1]]
    
    if n[2] >= 0:
        e_xy_12 = [-1 * (P1[1] - P2[1]), P1[0] - P2[0]]
        e_xy_23 = [-1 * (P2[1] - P3[1]), P2[0] - P3[0]]
        e_xy_31 = [-1 * (P3[1] - P1[1]), P3[0] - P1[0]]
    else:
        e_xy_12 = [-1 * (P2[1] - P1[1]), P2[0] - P1[0]]
        e_xy_23 = [-1 * (P3[1] - P2[1]), P3[0] - P2[0]]
        e_xy_31 = [-1 * (P1[1] - P3[1]), P1[0] - P3[0]]

   
    P1_xz = [P1[0], P1[2]]
    P2_xz = [P2[0], P2[2]]
    P3_xz = [P3[0], P3[2]]
    
    if n[1] >= 0:
        e_xz_12 = [-1 * (P2[2] - P1[2]), P2[0] - P1[0]]
        e_xz_23 = [-1 * (P3[2] - P2[2]), P3[0] - P2[0]]
        e_xz_31 = [-1 * (P1[2] - P3[2]), P1[0] - P3[0]]
    else:
        e_xz_12 = [-1 * (P1[2] - P2[2]), P1[0] - P2[0]]
        e_xz_23 = [-1 * (P2[2] - P3[2]), P2[0] - P3[0]]
        e_xz_31 = [-1 * (P3[2] - P1[2]), P3[0] - P1[0]]
    
    

    P1_yz = [P1[1], P1[2]]
    P2_yz = [P2[1], P2[2]]
    P3_yz = [P3[1], P3[2]]
    
    if n[0] >= 0:
        e_yz_12 = [-1 * (P1[2] - P2[2]), P1[1] - P2[1]]
        e_yz_23 = [-1 * (P2[2] - P3[2]), P2[1] - P3[1]]
        e_yz_31 = [-1 * (P3[2] - P1[2]), P3[1] - P1[1]]
    else:
        e_yz_12 = [-1 * (P2[2] - P1[2]), P2[1] - P1[1]]
        e_yz_23 = [-1 * (P3[2] - P2[2]), P3[1] - P2[1]]
        e_yz_31 = [-1 * (P1[2] - P3[2]), P1[1] - P3[1]]
    #Set B End

    del bbxmin
    del bbxmax
    del bbymin
    del bbymax
    del bbzmin
    del bbzmax
    del P1x
    del P1y
    del P1z
    del P2x
    del P2y
    del P2z
    del P3x
    del P3y
    del P3z
    del P1_xy
    del P2_xy
    del P3_xy
    del P1_xz
    del P2_xz
    del P3_xz
    del P1_yz
    del P2_yz
    del P3_yz
    del P1
    del P2
    del P3
    del e_yz_12
    del e_yz_23
    del e_yz_31
    del e_xz_12
    del e_xz_23
    del e_xz_31
    del e_xy_12
    del e_xy_23
    del e_xy_31

Answer 1

PyPy is slower when the code has a lot of nesting. CPython is optimized in a different way. PyPy is faster if you use Python the right way (and not as if it were C).
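One possible reading of this advice is to move the per-triangle work into functions, unpack the vertices with tuple unpacking, and return tuples instead of building many short-lived lists. The sketch below illustrates that reading only. It is not a confirmed fix for the slowdown; `cross`, `triangle_features`, and `process` are hypothetical names, and only the bounding box and the xy-plane edges are covered.

```python
def cross(a, b):
    # Cross product of two 3D vectors given as (x, y, z) sequences
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triangle_features(triangle):
    # Hypothetical helper: bounding box plus xy edge normals for one triangle
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    n = cross((x1 - x2, y1 - y2, z1 - z2), (x3 - x2, y3 - y2, z3 - z2))
    bbox = (min(x1, x2, x3), max(x1, x2, x3),
            min(y1, y2, y3), max(y1, y2, y3),
            min(z1, z2, z3), max(z1, z2, z3))
    # Edge orientation depends on the sign of the normal's z component,
    # mirroring the two branches in the question's Set B
    if n[2] >= 0:
        edges_xy = ((y2 - y1, x1 - x2), (y3 - y2, x2 - x3), (y1 - y3, x3 - x1))
    else:
        edges_xy = ((y1 - y2, x2 - x1), (y2 - y3, x3 - x2), (y3 - y1, x1 - x3))
    return bbox, edges_xy

def process(triangles):
    # With the loop inside a function, loop variables are locals
    # rather than module-level globals
    return [triangle_features(t) for t in triangles]
```

With this layout the explicit `del` statements are unnecessary, because every name goes out of scope when `triangle_features` returns.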

PyPy performance drops sharply
