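I am trying to optimize a program that builds large cellular automaton grids. At the moment it is very simple: it creates a glider and runs it for 100 generations. The problem appears when I increase the grid size. The time per generation jumps from under a second on a 20x20 grid, to 2 seconds on a 50x50 grid, to 10 seconds on a 100x100 grid. I read online that this is related to big O notation, and I would like to know how I can optimize it. Thanks.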
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
import time
def update(x):
    rows, cols = x.shape
    xnew = np.zeros(x.shape)
    for i, j in np.ndindex(x.shape):
        total = (x[i, (j-1)%cols]               # left
                 + x[i, (j+1)%cols]             # right
                 + x[(i-1)%rows, j]             # up
                 + x[(i+1)%rows, j]             # down
                 + x[(i-1)%rows, (j-1)%cols]    # top left
                 + x[(i-1)%rows, (j+1)%cols]    # top right
                 + x[(i+1)%rows, (j-1)%cols]    # down left
                 + x[(i+1)%rows, (j+1)%cols])   # down right
        if x[i,j] == 1:  # living cells
            if (total < 2) or (total > 3):
                xnew[i,j] = 0
            else:
                xnew[i,j] = 1
        else:
            if total == 3:
                xnew[i,j] = 1
    return xnew
def plant_seed(seed, nrow, ncol):
    def ceil(a, b):
        return -(-a // b)
    soil = np.zeros((nrow,ncol))
    rowu = soil.shape[0]//2 + seed.shape[0]//2
    rowl = ceil(soil.shape[0],2) - ceil(seed.shape[0],2)
    colu = soil.shape[1]//2 + seed.shape[1]//2
    coll = ceil(soil.shape[1],2) - ceil(seed.shape[1],2)
    soil[rowl:rowu, coll:colu] = seed
    planted_seed = soil
    return planted_seed
glider = np.array([[0,1,0],
                   [0,0,1],
                   [1,1,1]])
print("Enter Universe Dimensions")
rows = input("X size:")
rows = int(rows)
cols = input("Y size:")
cols = int(cols)
print("Creating",rows,"by",cols,"universe.")
iteration = plant_seed(glider, rows, cols)
n=100
tic = time.perf_counter()
for i in range(n):
    plt.figure(figsize = (rows,cols))
    plt.imshow(iteration ,cmap='gray')
    plt.show()
    iteration = update(iteration)
    clear_output(wait = True)
toc = time.perf_counter()
print(toc - tic)
Answer 0 (score: 0)
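The problem comes from accessing the NumPy array one element at a time (for example x[i,j]).
Each such access goes through the Python interpreter, so it is very slow and should be avoided. If you need fast code, use vectorized operations instead.
Here is a faster implementation that uses vectorized NumPy calls: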
def update_v2(x):
    rows, cols = x.shape
    xnew = np.zeros(x.shape)
    for i in range(x.shape[0]):
        jStart = 1
        jEnd = x.shape[1] - 1
        # Center of the grid
        total = np.zeros(max(x.shape[1] - 2, 0))
        total += x[i%rows, jStart-1:jEnd-1]      # left
        total += x[i%rows, jStart+1:jEnd+1]      # right
        total += x[(i-1)%rows, jStart:jEnd]      # up
        total += x[(i+1)%rows, jStart:jEnd]      # down
        total += x[(i-1)%rows, jStart-1:jEnd-1]  # top left
        total += x[(i-1)%rows, jStart+1:jEnd+1]  # top right
        total += x[(i+1)%rows, jStart-1:jEnd-1]  # down left
        total += x[(i+1)%rows, jStart+1:jEnd+1]  # down right
        livingCells = x[i, jStart:jEnd] == 1
        stayAlive = np.bitwise_and(livingCells, np.bitwise_and(total >= 2, total <= 3))
        birth = total == 3
        beAlive = np.bitwise_or(stayAlive, birth)
        xnew[i, jStart:jEnd][beAlive] = 1
        # Borders of the grid
        for j in [0, x.shape[1]-1]:
            total = (x[i, (j-1)%cols]               # left
                     + x[i, (j+1)%cols]             # right
                     + x[(i-1)%rows, j]             # up
                     + x[(i+1)%rows, j]             # down
                     + x[(i-1)%rows, (j-1)%cols]    # top left
                     + x[(i-1)%rows, (j+1)%cols]    # top right
                     + x[(i+1)%rows, (j-1)%cols]    # down left
                     + x[(i+1)%rows, (j+1)%cols])   # down right
            if x[i,j] == 1:  # living cells
                if (total < 2) or (total > 3):
                    xnew[i,j] = 0
                else:
                    xnew[i,j] = 1
            else:
                if total == 3:
                    xnew[i,j] = 1
    return xnew
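On my machine, with a 1000x1000 grid, this implementation is 110 times faster than the original one. The larger the grid, the larger the speed-up (for example, 300 times faster at 4000x4000). I also suggest specifying dtype=np.int8 when you create the arrays for this code, because small integers are enough for this computation and are faster than the default dtype=np.float64. This change improves performance by 30% and divides the memory footprint of large grids by 8.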
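To make the dtype suggestion concrete, here is a minimal sketch that compares the memory used by a float64 grid and an int8 grid of the same shape. The variable names grid_f64 and grid_i8 are only illustrative and do not appear in the code above.

```python
import numpy as np

# Two grids of the same shape: default dtype versus an 8-bit integer dtype.
grid_f64 = np.zeros((1000, 1000))                 # default dtype is float64 (8 bytes per cell)
grid_i8 = np.zeros((1000, 1000), dtype=np.int8)   # 1 byte per cell

print(grid_f64.nbytes)  # 8000000
print(grid_i8.nbytes)   # 1000000

# Give the seed the same dtype and write it into the int8 grid.
glider = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1]], dtype=np.int8)
grid_i8[0:3, 0:3] = glider
```

A neighbour count is at most 8, so it fits comfortably in int8. Note that the update functions above allocate their result with np.zeros(x.shape), which is float64 again; something like np.zeros_like(x) would probably be needed there to keep the whole pipeline in int8.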
Note that this code could be made even faster with the numexpr Python module.
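The row-by-row vectorization can also be pushed one step further by removing the remaining Python loop over rows. The following is only a sketch of that idea, not the implementation benchmarked above: it assumes np.roll for the wrap-around neighbours, and the function name update_roll is hypothetical.

```python
import numpy as np

def update_roll(x):
    """One Game of Life generation on a wrap-around grid, with no loop over cells."""
    # Sum the eight neighbours by shifting the whole grid. np.roll wraps at the
    # edges, which matches the modulo indexing used in update().
    total = np.zeros(x.shape, dtype=np.int8)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the cell itself
            total += np.roll(x, (di, dj), axis=(0, 1)).astype(np.int8)
    alive = x == 1
    survive = alive & ((total == 2) | (total == 3))  # live cell with 2 or 3 neighbours
    birth = ~alive & (total == 3)                    # dead cell with exactly 3 neighbours
    return (survive | birth).astype(x.dtype)

# Usage: a glider on a small int8 grid, advanced four generations.
grid = np.zeros((8, 8), dtype=np.int8)
grid[1:4, 1:4] = [[0, 1, 0],
                  [0, 0, 1],
                  [1, 1, 1]]
for _ in range(4):
    grid = update_roll(grid)
print(grid.sum())  # 5
```

Only eight whole-array shifts are performed per generation, regardless of grid size, so the per-cell work stays inside NumPy. I have not timed this against update_v2, so treat any speed difference as something to measure rather than assume.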