Question

I have this code (a naive heat-transfer simulation):

function heat(first, second, m)
    @sync @parallel for d = 2:m - 1
        for c = 2:m - 1
            @inbounds second[c,d] = (first[c,d] + first[c+1, d] + first[c-1, d] + first[c, d+1] + first[c, d-1]) / 5.0;
        end
    end
end

m = parse(Int,ARGS[1]) #size of matrix
firstm = SharedArray(Float64, (m,m))
secondm = SharedArray(Float64, (m,m))

for c = 1:m
    for d = 1:m
        if c == m || d == 1
            firstm[c,d] = 100.0
            secondm[c,d] = 100.0
        else
            firstm[c,d] = 0.0
            secondm[c,d] = 0.0
        end
    end
end
@time for i = 0:opak
    heat(firstm, secondm, m)
    firstm, secondm = secondm, firstm
end

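This code gives good timings when run sequentially, but as soon as I add @parallel it slows down, even when I run it on a single thread. I just need an explanation of why this happens. Please suggest code only if it leaves the algorithm of the heat function unchanged.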

Answer 1

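Have a look at http://docs.julialang.org/en/release-0.4/manual/performance-tips/. Contrary to the advice there, you make heavy use of global variables. A global is assumed to be able to change type at any time, so it has to be boxed and unboxed on every reference. The question "Julia pi aproximation slow" suffers from the same problem. To make your code faster, pass the global variables into a function as input arguments.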
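As a minimal sketch of that advice (not the asker's actual fix), the setup and the time-stepping loop can be moved into a function so that every array and counter is a local or an argument. The names `heat_serial!`, `run_heat`, `cur`, `nxt` and `steps` are hypothetical, plain `zeros` arrays stand in for the `SharedArray`s, and `@parallel` is left out so the sketch does not depend on a particular Julia version:

```julia
# One sweep with the same five-point average as heat(), without @parallel.
function heat_serial!(nxt, cur, m)
    for d = 2:m-1, c = 2:m-1
        @inbounds nxt[c, d] = (cur[c, d] + cur[c+1, d] + cur[c-1, d] +
                               cur[c, d+1] + cur[c, d-1]) / 5.0
    end
    return nxt
end

# Hypothetical driver: everything that was a global is now local to the function.
function run_heat(m, steps)
    cur = zeros(m, m)
    nxt = zeros(m, m)
    # Same boundary condition as the question: last row and first column at 100.
    for d = 1:m, c = 1:m
        if c == m || d == 1
            cur[c, d] = 100.0
            nxt[c, d] = 100.0
        end
    end
    for i = 1:steps
        heat_serial!(nxt, cur, m)
        cur, nxt = nxt, cur   # swap local bindings, not globals
    end
    return cur
end

result = run_heat(8, 10)
```

Timing `run_heat(m, steps)` twice and reading the second measurement keeps compilation time out of the comparison.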

Answer 2

There are a few points to consider. One of them is the size of m. If it is small, parallelism adds a lot of overhead for very little gain:

julia 36967257.jl 4
# Parallel:
0.040434 seconds (4.44 k allocations: 241.606 KB)
# Normal:
0.042141 seconds (29.13 k allocations: 1.308 MB)

For larger m, you can get better results:

julia 36967257.jl 4000
# Parallel:
0.054848 seconds (4.46 k allocations: 241.935 KB)
# Normal:
3.779843 seconds (29.13 k allocations: 1.308 MB)

Two further remarks:

1/ The initialization can be simplified to:

for c = 1:m, d = 1:m
    if c == m || d == 1
        firstm[c,d] = 100.0
        secondm[c,d] = 100.0
    else
        firstm[c,d] = 0.0
        secondm[c,d] = 0.0
    end
end
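A slightly more compact variant of the same initialization, offered only as a sketch: the helper name `init_boundary!` is hypothetical, and a plain `zeros` matrix stands in for the `SharedArray`. Putting `d` first in the loop header makes `c` the inner index, which matches Julia's column-major array layout:

```julia
# Set the last row and the first column to 100.0 and everything else to 0.0.
function init_boundary!(a, m)
    for d = 1:m, c = 1:m
        a[c, d] = (c == m || d == 1) ? 100.0 : 0.0
    end
    return a
end

grid = init_boundary!(zeros(4, 4), 4)
```

Calling the helper once per array (`firstm`, then `secondm`) gives the same values as the loop above.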

2/ Your finite-difference scheme does not look stable. Have a look at Linear multistep method or ADI / Crank Nicolson.

Julia is significantly slower with @parallel
