Memory allocation in a fixed point algorithm

Time: 2015-06-10 23:23:08

Tags: algorithm julia

I need to find the fixed point of a function f. The algorithm is very simple:

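  1. Given X, compute f(X)
  2. If ||X - f(X)|| is below a given tolerance, exit and return X; otherwise set X equal to f(X) and go back to step 1.

I'd like to be sure I'm not allocating memory for a new object at every iteration.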
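For reference, the two steps above can be sketched as follows. This is only an illustration of the logic, written for Julia 1.x rather than the older syntax used in the rest of the post; `fixed_point`, `tol`, and `max_it` are hypothetical names, and this version allocates a new array on every iteration, which is exactly what the question is trying to avoid.

```julia
using LinearAlgebra  # for norm

# Hypothetical helper: iterate x <- f(x) until x and f(x) agree to within tol.
function fixed_point(f, x0; tol = 1e-8, max_it = 1000)
    x = x0
    for iter in 1:max_it
        fx = f(x)                  # step 1: compute f(X)
        if norm(x - fx) < tol      # step 2: close enough, return X
            return x
        end
        x = fx                     # otherwise set X = f(X) and repeat
    end
    return x                       # not converged within max_it iterations
end

# Example: x -> cos.(x) has a fixed point near 0.739085
xstar = fixed_point(x -> cos.(x), [1.0])
```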

For now, the algorithm looks like this:

    iter1 = function(x::Vector{Float64})
        for iter in 1:max_it
            oldx = copy(x)                 # allocates a new vector on every iteration
            g1(x)                          # overwrites x with f(x)
            delta = vnormdiff(x, oldx, 2)  # 2-norm of x - oldx
            if delta < tolerance
                break
            end
        end
    end
    

Here g1(x) is a function that sets x to f(x) in place.

But it seems that this loop allocates a new vector at every iteration (see below).

Another way to write the algorithm is the following:

    iter2 = function(x::Vector{Float64})
        oldx = similar(x)
        for iter in 1:max_it
            (oldx, x) = (x, oldx)          # swap the two buffers; no data is copied
            g2(x, oldx)                    # write f(oldx) into x
            delta = vnormdiff(oldx, x, 2)
            if delta < tolerance
                break
            end
        end
    end
    

where g2(x1, x2) is a function that sets x1 to f(x2).
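One detail of the swapping version is worth spelling out: the swap only rebinds the local variables inside the function, so after the loop the caller's array may hold either the latest or the previous iterate, depending on how many swaps were made. A minimal sketch that returns the buffer holding the latest iterate is shown below. It is written for Julia 1.x, and `iterate_swap!`, `halve!`, `tol`, and `max_it` are hypothetical names rather than anything from the post.

```julia
# Plays the role of g2: writes f(x) into newx, with f(x) = x / 2 as in the post.
function halve!(newx, x)
    for i in eachindex(newx, x)
        newx[i] = x[i] / 2
    end
    return newx
end

# Hypothetical double-buffer iteration: one allocation up front, none in the loop.
function iterate_swap!(g!, x; tol = 1e-8, max_it = 1000)
    oldx = similar(x)                 # second buffer, allocated once
    for iter in 1:max_it
        oldx, x = x, oldx             # rebind the local names; no data is copied
        g!(x, oldx)                   # x now holds f(oldx)
        s = 0.0
        for i in eachindex(x, oldx)   # squared 2-norm of x - oldx, no temporary
            s += abs2(x[i] - oldx[i])
        end
        if sqrt(s) < tol
            break
        end
    end
    return x                          # the buffer holding the latest iterate
end

x0 = fill(1.0, 4)
result = iterate_swap!(halve!, x0)
```

Using the return value, rather than relying on the array that was passed in, avoids having to reason about whether the number of swaps was odd or even.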

Is this the most efficient and natural way to write this kind of iterative problem?

Edit1: timing shows that the second code is faster:

    using NumericExtensions
    tolerance = 1e-8
    max_it = 100
    
    g1 = function(x::Vector{Float64}) 
        for i in 1:length(x)
            x[i] = x[i]/2
        end
    end
    
    g2 = function(newx::Vector{Float64}, x::Vector{Float64}) 
        for i in 1:length(x)
            newx[i] = x[i]/2
        end
    end
    
    x = fill(1e7, int(1e7))
    @time iter1(x)
    # elapsed time: 4.688103075 seconds (4960117840 bytes allocated, 29.72% gc time)
    x = fill(1e7, int(1e7))
    @time iter2(x)
    # elapsed time: 2.187916177 seconds (80199676 bytes allocated, 0.74% gc time)
    

Edit2: using copy!

    iter3 = function(x::Vector{Float64})
        oldx = similar(x)
        for iter in 1:max_it
            copy!(oldx, x)
            g1(x)
            delta = vnormdiff(x, oldx, 2)
            if delta < tolerance
                break
            end
        end
    end
    x = fill(1e7, int(1e7))
    @time iter3(x)
    # elapsed time: 2.745350176 seconds (80008088 bytes allocated, 1.11% gc time)
    

1 Answer:

Answer 0 (score: 2)

I think that replacing the following lines in the first code

for iter = 1:max_it
    oldx = copy( x )
    ...

with

oldx = zeros( N )
for iter = 1:max_it
    oldx[:] = x    # or copy!( oldx, x )
    ...

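will be more efficient, because no array is allocated inside the loop. In addition, the code can be made more efficient by writing the for loop explicitly. This can be seen, for example, from the following comparison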

function test()
    N = 1000000

    a = zeros( N )
    b = zeros( N )

    @time c = copy( a )

    @time b[:] = a

    @time copy!( b, a )

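    # The post as written had `@time \` on its own line, which appears to time
    # the `\` operator rather than the loop; `@time` is attached to `for` here.
    @time for i = 1:length(a)
        b[i] = a[i]
    end

    @time for i in eachindex(a)
        b[i] = a[i]
    end
end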

test()

The results obtained with Julia 0.4.0 on Linux (x86_64) are

elapsed time: 0.003955609 seconds (7 MB allocated)
elapsed time: 0.001279142 seconds (0 bytes allocated)
elapsed time: 0.000836167 seconds (0 bytes allocated)
elapsed time: 1.19e-7 seconds (0 bytes allocated)
elapsed time: 1.28e-7 seconds (0 bytes allocated)

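copy!() seems to be faster than using [:] on the left-hand side, although the difference becomes negligible in repeated calculations (there seems to be some overhead in the first [:] calculation). Note that the last two timings, about 1e-7 seconds, are too small to be the cost of copying 10^6 elements: they were measured with `@time \` on a line by itself, which most likely times the evaluation of the `\` operator rather than the loop that follows, so these two numbers should not be read as evidence that the explicit loop is faster. By the way, the last example using eachindex() is very convenient for looping over multidimensional arrays.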
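As a small illustration of the eachindex() remark, the sketch below copies a matrix element by element with the same loop that would be used for a vector. It is written for Julia 1.x, and `copy_elements!` is a hypothetical helper name, not something from the answer.

```julia
# Hypothetical helper: element-wise copy that works for arrays of any dimension.
function copy_elements!(dest, src)
    # eachindex(dest, src) yields indices valid for both arrays and throws
    # an error if their shapes do not match.
    for i in eachindex(dest, src)
        dest[i] = src[i]
    end
    return dest
end

A = reshape(collect(1.0:6.0), 2, 3)   # a 2x3 matrix
B = similar(A)                        # uninitialized array of the same shape
copy_elements!(B, A)
```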

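A similar comparison can be made for vnormdiff(): using norm( x - oldx ) and the like is slower than an explicit loop for the vector norm, because the former allocates a temporary array for x - oldx.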
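A minimal sketch of such an explicit loop is given below, assuming the Euclidean norm is wanted. It is written for Julia 1.x, and `normdiff2` is a hypothetical name, not the vnormdiff() implementation from NumericExtensions.

```julia
using LinearAlgebra  # norm, used here only for comparison

# Hypothetical helper: Euclidean norm of x - y without forming x - y.
function normdiff2(x, y)
    s = zero(eltype(x))
    for i in eachindex(x, y)     # errors if x and y do not have matching indices
        s += abs2(x[i] - y[i])
    end
    return sqrt(s)
end

x = [3.0, 0.0, 1.0]
oldx = [0.0, 4.0, 1.0]
d_loop = normdiff2(x, oldx)      # accumulates in a scalar, no intermediate array
d_alloc = norm(x - oldx)         # builds the temporary x - oldx first
```

Both calls give the same value; the difference is only in whether a temporary array the size of x is created on each call.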