Question

Here is my Python code:

from numpy import *
from copy import *

def Grid(s, p):
    return random.binomial(1, p, (s,s))

def InitialSpill(G, i, j):
    G[i, j] = 2

def Fillable(G, i, j):
    if i > 0 and G[i - 1, j] == 2:
            return True
    if j > 0 and G[i, j - 1] == 2:
            return True
    if i < len(G) - 1 and G[i + 1, j] == 2:
            return True
    if j < len(G) - 1 and G[i, j + 1] == 2:
            return True
    return False

def Fill(G):
    F = copy(G)
    for i in range(len(G)):
        for j in range(len(G)):
            if F[i, j] == 2:
                G[i, j] = 3 # 3 denotes a "dry" cell
            elif F[i, j] == 1 and Fillable(F, i, j):
                G[i, j] = 2 # 2 denotes a "filled" cell

def EndReached(G): # Check if all filled cells are dry and if no cells are fillable
    for i in range(len(G)):
        for j in range(len(G)):
            if (G[i, j] == 1 and Fillable(G, i, j)) or G[i, j] == 2:
                    return False
    return True

def Prop(G): # yield the ratio between dry and total fillable cells
    (dry, unrch) = (0, 0)
    for e in G:
        for r in e:
            if r == 1:
                unrch += 1
            if r == 3:
                dry += 1
    if unrch == 0 and dry < 2:
        return 0
    return dry / (unrch + dry)

def Percolate(s, p, i, j): # Percolate a generated matrix of size s, probability p, spill at (i, j)
    G = Grid(s, p)
    InitialSpill(G, i, j)
    while not EndReached(G):
        Fill(G)
    return Prop(G)

def PercList(s, i, j, n, l):
    list_p = linspace(0, 1, n)
    list_perc = []
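    for p in list_p:
        total = 0  # renamed so the built-in sum is not shadowed
        for _ in range(l):  # do not reuse i here: it is the spill row passed to Percolate
            total += Percolate(s, p, i, j)
        list_perc += [total/l]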
    return (list_p, list_perc)

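The idea is to represent a percolable field as a matrix, where:

0 is a full, unfillable cell
1 is an empty, fillable cell
2 is a filled cell (it will become dry => 3)
3 is a dry cell (already filled, therefore no longer fillable)

I want to represent the ratio of dry cells to total fillable cells as a function of p (the probability that a cell in the matrix is full).

However, my code is extremely inefficient (it takes a lot of time to complete, even with small values).

How can I optimize it?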

Answer 1

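This code is inefficient because, although you use numpy, most of the computation is done element by element (double loops over the elements of a 2D array, etc.), which is slow in Python.

There are two possible ways to speed it up.

You can vectorize your code so that it uses optimized numpy operators on whole arrays. For example, instead of having a function Fillable(G, i, j) that returns a single boolean, define a function Fillable(G) that returns a boolean numpy array for all i, j indices at once (without using loops):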

import numpy as np

def Fillable(G):
    res = np.zeros(G.shape, dtype=bool)
    # Original test: if i > 0 and G[i - 1, j] == 2
    # Find the indices of all filled cells
    imask, jmask = np.where(G == 2)
    # The cell below a filled cell, (i + 1, j), has a filled neighbor above it
    imask = imask + 1
    # Drop pairs that fall off the grid; one shared mask keeps i and j aligned
    valid = imask < G.shape[0]
    res[imask[valid], jmask[valid]] = True
    # [..] do the other conditions in a similar way
    return res

This lets us remove the double loop from the Fill(G) function, for example with

def Fill(G):
    # Compute the mask first: once the 2s become 3s, nothing would look fillable
    to_fill = (G == 1) & Fillable(G)
    G[G == 2] = 3
    G[to_fill] = 2

Most of this code can be rewritten in a similar way.
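As a rough illustration of what "rewritten in a similar way" could look like, here is a minimal sketch of the whole simulation without Python-level loops over cells. It is not the code from the question or from the answer above: the lowercase names (`fillable_mask`, `fill`, `end_reached`, `prop`, `percolate`), the state constants, and the optional `rng` argument are assumptions made for this example, and the neighbor test uses shifted slices rather than np.where index arithmetic.

```python
import numpy as np

# Cell states as described in the question (0 is a solid, unfillable cell).
EMPTY, FILLED, DRY = 1, 2, 3

def fillable_mask(G):
    """Boolean array that is True where a cell has a FILLED 4-neighbor."""
    filled = (G == FILLED)
    mask = np.zeros(G.shape, dtype=bool)
    mask[1:, :] |= filled[:-1, :]   # neighbor above, (i - 1, j)
    mask[:-1, :] |= filled[1:, :]   # neighbor below, (i + 1, j)
    mask[:, 1:] |= filled[:, :-1]   # neighbor to the left, (i, j - 1)
    mask[:, :-1] |= filled[:, 1:]   # neighbor to the right, (i, j + 1)
    return mask

def fill(G):
    """One step, in place: FILLED cells dry out, reachable EMPTY cells fill."""
    newly_filled = (G == EMPTY) & fillable_mask(G)  # evaluated before G changes
    G[G == FILLED] = DRY
    G[newly_filled] = FILLED

def end_reached(G):
    """A cell can only be fillable next to a FILLED cell, so this test suffices."""
    return not (G == FILLED).any()

def prop(G):
    """Ratio of dry cells to all cells that were fillable at the start."""
    dry = np.count_nonzero(G == DRY)
    unreached = np.count_nonzero(G == EMPTY)
    if unreached == 0 and dry < 2:
        return 0.0
    return dry / (unreached + dry)

def percolate(s, p, i, j, rng=None):
    """Generate an s-by-s grid, spill at (i, j), and run until nothing changes."""
    rng = np.random.default_rng() if rng is None else rng
    G = rng.binomial(1, p, (s, s))
    G[i, j] = FILLED
    while not end_reached(G):
        fill(G)
    return prop(G)
```

Each step still touches the whole grid, so the number of steps grows with the length of the longest path the fluid follows, but the per-cell work happens inside numpy instead of the interpreter.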

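The other option is to use Cython. In that case the structure of the code can stay the same, and simply adding types can speed it up considerably. For example, a Cython-optimized Fill function would be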

cpdef int Fill(int [:,::1] G):
    cdef int [:,::1] F = G.base.copy()
    cdef int i, j, N
    N = G.base.shape[0]
    for i in range(N):
        for j in range(N):
            if F[i, j] == 2:
                G[i, j] = 3 
            elif F[i, j] == 1 and Fillable(F, i, j):
                G[i, j] = 2 
    return 0

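In any case, you should first profile your code to see which function calls take the most time. Optimizing just one or two critical functions may be enough to improve the speed by an order of magnitude.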
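One way to do the profiling step is with the standard library's cProfile and pstats modules. The sketch below is only an illustration: `slow_neighbour_count` and `run` are hypothetical stand-ins for loop-heavy functions such as Fill and PercList, not code from the question.

```python
import cProfile
import io
import pstats

def slow_neighbour_count(n):
    """Hypothetical stand-in for a function dominated by a Python double loop."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += (i + j) % 2
    return total

def run(n=200, repeats=5):
    """Hypothetical driver, playing the role of PercList."""
    return [slow_neighbour_count(n) for _ in range(repeats)]

# Collect timing data only while the code of interest is running.
profiler = cProfile.Profile()
profiler.enable()
result = run()
profiler.disable()

# Sort by cumulative time and keep the five most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For a whole script, running `python -m cProfile -s cumulative your_script.py` (with your own file name) gives the same kind of table without modifying the code.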

Optimizing a flood fill algorithm (percolation theory)
