Question

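I am trying to optimize a program that builds large cellular automaton grids. For now it is very simple: it creates a glider and runs it for 100 generations. The problem appears when I increase the grid size. The time per generation jumps from under a second on a 20x20 grid to 2 seconds on a 50x50 grid and 10 seconds on a 100x100 grid. I read online that this is related to big-O notation, and I would like to know how I can optimize it. Thanks.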

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
import time

def update(x):

    rows, cols = x.shape
    xnew = np.zeros(x.shape)
    for i, j in np.ndindex(x.shape):
        total = (x[i, (j-1)%cols] #left
                  + x[i, (j+1)%cols] #right
                  + x[(i-1)%rows, j] #up
                  + x[(i+1)%rows, j] #down
                  + x[(i-1)%rows, (j-1)%cols] #top left
                  + x[(i-1)%rows, (j+1)%cols] #top right
                  + x[(i+1)%rows, (j-1)%cols] #down left
                  + x[(i+1)%rows, (j+1)%cols]) #down right   
        if x[i,j] == 1 : #living cells
            if (total < 2) or (total > 3):
                xnew[i,j] =0
            else:
                xnew[i,j] = 1
        else:
            if total == 3:
                xnew[i,j] = 1
    return xnew   


def plant_seed(seed, nrow, ncol):

    def ceil(a, b):
        return -(-a // b)

    soil = np.zeros((nrow,ncol))

    rowu = soil.shape[0]//2 + seed.shape[0]//2
    rowl = ceil(soil.shape[0],2) - ceil(seed.shape[0],2)
    colu = soil.shape[1]//2 + seed.shape[1]//2
    coll = ceil(soil.shape[1],2) - ceil(seed.shape[1],2)

    soil[rowl:rowu, coll:colu] = seed
    planted_seed = soil

    return planted_seed

glider = np.array([[0,1,0],
                  [0,0,1],
                  [1,1,1]])


print("Enter Universe Dimensions")
rows = input("X size:")
rows = int(rows)
cols = input("Y size:")
cols = int(cols)
print("Creating",rows,"by",cols,"universe.")
iteration = plant_seed(glider, rows, cols)
n=100
tic = time.perf_counter()
for i in range(n):  
    plt.figure(figsize = (rows,cols))
    plt.imshow(iteration   ,cmap='gray')
    plt.show()
    iteration = update(iteration)
    clear_output(wait = True)
    toc = time.perf_counter()
    print(toc - tic)

Answer 1

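The problem comes from accessing the numpy array element by element (e.g. x[i,j]) inside a Python loop. Such operations are very slow and should be avoided. If you need fast code, use vectorized operations.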
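
To illustrate the point, here is a minimal sketch (not part of the original code; the 300x300 size is an arbitrary assumption) that sums the same array once element by element and once with a single vectorized call. The work grows with rows * cols in both cases, but the per-element cost of the Python loop is far higher. Exact timings will depend on your machine.

```python
import time
import numpy as np

x = np.ones((300, 300))  # arbitrary test size, chosen only for this demo

# Element by element: every x[i, j] access goes through the Python interpreter.
tic = time.perf_counter()
loop_total = 0.0
for i, j in np.ndindex(x.shape):
    loop_total += x[i, j]
loop_seconds = time.perf_counter() - tic

# Vectorized: a single call that loops over the data in compiled code.
tic = time.perf_counter()
vector_total = x.sum()
vector_seconds = time.perf_counter() - tic

print(f"loop: {loop_seconds:.4f} s, vectorized: {vector_seconds:.6f} s")
```

Both variants compute the same total; only the time spent differs.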

Here is a faster implementation using vectorized numpy calls:

def update_v2(x):
    rows, cols = x.shape
    xnew = np.zeros(x.shape)

    for i in range(x.shape[0]):
        jStart = 1
        jEnd = x.shape[1] - 1

        # Center of the grid

        total = np.zeros(max(x.shape[1] - 2, 0))
        total += x[i%rows, jStart-1:jEnd-1]   # left
        total += x[i%rows, jStart+1:jEnd+1]   # right
        total += x[(i-1)%rows, jStart:jEnd]     # up
        total += x[(i+1)%rows, jStart:jEnd]     # down
        total += x[(i-1)%rows, jStart-1:jEnd-1] # top left
        total += x[(i-1)%rows, jStart+1:jEnd+1] # top right
        total += x[(i+1)%rows, jStart-1:jEnd-1] # down left
        total += x[(i+1)%rows, jStart+1:jEnd+1] # down right

        livingCells = x[i, jStart:jEnd] == 1
        stayAlive = np.bitwise_and(livingCells, np.bitwise_and(total >= 2, total <= 3))
        birth = total == 3
        beAlive = np.bitwise_or(stayAlive, birth)
        xnew[i, jStart:jEnd][beAlive] = 1

        # Borders of the grid

        for j in [0, x.shape[1]-1]:
            total = (x[i, (j-1)%cols] #left
                      + x[i, (j+1)%cols] #right
                      + x[(i-1)%rows, j] #up
                      + x[(i+1)%rows, j] #down
                      + x[(i-1)%rows, (j-1)%cols] #top left
                      + x[(i-1)%rows, (j+1)%cols] #top right
                      + x[(i+1)%rows, (j-1)%cols] #down left
                      + x[(i+1)%rows, (j+1)%cols]) #down right   
            if x[i,j] == 1: #living cells
                if (total < 2) or (total > 3):
                    xnew[i,j] = 0
                else:
                    xnew[i,j] = 1
            else:
                if total == 3:
                    xnew[i,j] = 1
    return xnew

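On my machine, with a 1000x1000 grid, this implementation is 110 times faster than the original one. The larger the matrix, the larger the speedup (e.g. 300 times faster with 4000x4000). I also recommend specifying dtype=np.int8 when building the arrays/matrices for this code, since small integers are sufficient for this computation and are faster than the default dtype=np.float64. This change improves performance by 30% and divides the memory footprint of large matrices by 8.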
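
As a quick check of the memory claim, this small sketch (the 1000x1000 size is just an example) compares the default dtype with np.int8. A cell has at most 8 neighbours, so both the cell states and the neighbour counts fit in an int8.

```python
import numpy as np

rows, cols = 1000, 1000  # example size only

grid_float = np.zeros((rows, cols))                # default dtype is float64
grid_int8 = np.zeros((rows, cols), dtype=np.int8)  # one byte per cell

print(grid_float.dtype, grid_float.nbytes, "bytes")
print(grid_int8.dtype, grid_int8.nbytes, "bytes")
```

To get the full benefit, the arrays created inside the update function (xnew and total) would presumably need the same dtype, for example by passing dtype=x.dtype to np.zeros.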

Note that this code can be made even faster using the numexpr Python module.
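
Staying with plain numpy, the remaining loop over rows can probably be removed as well. The sketch below is an alternative I have not benchmarked, not the implementation timed above: the name update_roll is made up for this example, and it relies on np.roll to shift the whole grid with wrap-around, which matches the modulo indexing of the original update function.

```python
import numpy as np

def update_roll(x):
    """Compute one generation on a wrapping grid without any per-cell loop."""
    # Sum the eight neighbours by adding shifted copies of the whole grid.
    # np.roll wraps around the edges, like the % rows / % cols indexing.
    total = np.zeros(x.shape, dtype=np.int8)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di != 0 or dj != 0:
                total += np.roll(x, (di, dj), axis=(0, 1)).astype(np.int8)

    alive = x == 1
    survive = alive & ((total == 2) | (total == 3))  # living cell with 2 or 3 neighbours
    birth = ~alive & (total == 3)                    # dead cell with exactly 3 neighbours
    return (survive | birth).astype(x.dtype)

# Usage: a glider on a small int8 grid, advanced by four generations.
grid = np.zeros((8, 8), dtype=np.int8)
grid[1:4, 1:4] = [[0, 1, 0],
                  [0, 0, 1],
                  [1, 1, 1]]
after = grid
for _ in range(4):
    after = update_roll(after)
print(after)
```

The only Python-level loop left runs over the eight neighbour offsets, so its cost does not grow with the grid size. The trade-off is a few temporary arrays per generation.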
