使用矢量化迭代部分进行Python代码优化

时间:2018-12-11 03:10:16

标签: python numpy optimization

我是python和numpy的新手,我想对这二阶PDE代码进行矢量化处理,以便在更短的时间内运行,但由于它使用函数来填充卡住的网格中的每个值。

def func(x, y):
    return (-2 * np.pi ** 2) * np.sin(np.pi * x) * np.sin(np.pi * y)


def f1t(x, y):
    return (np.sin(np.pi * x) * np.sin(np.pi * y))


def five_point(grid, i, j, h, grid_x, grid_y):
    return ((grid[i + 1, j] + grid[i - 1, j] + grid[i, j + 1] + grid[i, j - 1]) / 4
            - ((h ** 2) / 4) * func(grid_x[i, 0], grid_y[0, j]))


def five_point_fin_int(X=np.ones(1), Y=np.ones(1), n_x=32, K_max=1000,
                       tol=0.0001, tol_type="grid"):
    import time;
    t0 = time.clock()

    h = 1 / n_x
    X_max = int(X / h)
    Y_max = int(Y / h)

    grid_x, grid_y = np.mgrid[0:X + h:h, 0:Y + h:h]
    grid_true = np.zeros((X_max + 1, Y_max + 1))

    for i in range(1, X_max):
        for j in range(1, Y_max):
            grid_true[i, j] = f1t(grid_x[i, 0], grid_y[0, j])

    grid = np.zeros((X_max + 1, Y_max + 1))
    grid_err = np.zeros((X_max + 1, Y_max + 1))

    count = 0
    tol_max = False
    while ((count < K_max) & (tol_max == False)):
        count += 1

        for i in range(1, X_max):
            for j in range(1, Y_max):
                grid[i, j] = five_point(grid, i, j, h, grid_x, grid_y)
                grid_err[i, j] = (grid[i, j] - grid_true[i, j])

                if (tol_type.lower() == "grid" ):
                    if (np.amax(abs(grid_err)) < tol):
                        tol_max = True
                else:
                    if (abs(np.linalg.norm(grid) - np.linalg.norm(grid_true)) < tol):
                        tol_max = True

    cpu_time = time.clock() - t0

最后,我打印了计算时间,因为它嵌套了for循环,现在花费的时间大约是9秒,我想就此进行即兴创作。

1 个答案:

答案 0 :(得分:1)

numpy允许您通过矢量调用替换循环。您绝对可以执行以下操作:

grid_true = np.zeros((X_max + 1, Y_max + 1))
grid_true[1:X_max,1:Y_max]=f1t(*np.meshgrid(grid_x[1:X_max,0], grid_y[0,1:Y_max]))

您还可以尝试以下操作:

grid = np.zeros((X_max + 1, Y_max + 1))
grid[1:-1, 1:-1] = five_point(grid, *np.meshgrid(np.arange(1,X_max), np.arange(1, Y_max)),
h, grid_x, grid_y)

但是,这并不是像您正在做的那样纯粹的“上游”集成,因为您必须在每个步骤(您的调用!)中一起计算所有网格。

最小化例程可能会做得更好。对于短循环或小的向量,numpy和纯python的性能没有太大差异。