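Changing function arguments to keywords in Julia seems to introduce type instability

Time: 2018-06-30 12:07:51

Tags: julia keyword-argument type-stability

I have a program in which the main() function takes four arguments. When I run @code_warntype on the function, nothing seems amiss. All the variables have specified types, and there are no instances of UNION or other obvious warning signs.

Apologies that the program is rather long, but I'm not sure how to shorten it while preserving the problem: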

function main(n::Int, dice::Int=6, start::Int=1, modal::Int=3) ::Tuple{String, Vector{String}, Vector{Float64}}
    board = String["GO", "A1", "CC1", "A2", "T1", "R1", "B1", "CH1", "B2", "B3",
        "JAIL", "C1", "U1", "C2", "C3", "R2", "D1", "CC2", "D2", "D3",
        "FP", "E1", "CH2", "E2", "E3", "R3", "F1", "F2", "U2", "F3",
        "G2J", "G1", "G2", "CC3", "G3", "R4", "CH3", "H1", "T2", "H2"]
    cc_cards = shuffle(collect(1:16))
    ch_cards = shuffle(collect(1:16))
    function take_cc_card(square::Int, cards::Vector{Int})::Tuple{Int, Vector{Int}}
        if cards[1] == 1
            square = findfirst(board, "GO")
        elseif cards[1] == 2
            square = findfirst(board, "JAIL")
        end
        p = pop!(cards)
        unshift!(cards, p)
        return square, cards
    end
    function take_ch_card(square::Int, cards::Vector{Int})::Tuple{Int, Vector{Int}}
        if cards[1] == 1
            square = findfirst(board, "GO")
        elseif cards[1] == 2
            square = findfirst(board, "JAIL")
        elseif cards[1] == 3
            square = findfirst(board, "C1")
        elseif cards[1] == 4
            square = findfirst(board, "E3")
        elseif cards[1] == 5
            square = findfirst(board, "H2")
        elseif cards[1] == 6
            square = findfirst(board, "R1")
        elseif cards[1] == 7 || cards[1] == 8
            if board[square] == "CH1"
                square = findfirst(board, "R2")
            elseif board[square] == "CH2"
                square = findfirst(board, "R3")
            elseif board[square] == "CH3"
                square = findfirst(board, "R1")
            end
        elseif cards[1] == 9
            if board[square] == "CH1"
                square = findfirst(board, "U1")
            elseif board[square] == "CH2"
                square = findfirst(board, "U2")
            elseif board[square] == "CH3"
                square = findfirst(board, "U1")
            end
        elseif cards[1] == 10
            square = (square - 3) % 40 + ((square - 3) % 40 == 0 ? 40 : 0)
        end
        p = pop!(cards)
        unshift!(cards, p)
        return square, cards
    end
    result = zeros(Int, 40)
    consec_doubles = 0
    square = 1
    for i = 1:n
        throw_1 = rand(collect(1:dice))
        throw_2 = rand(collect(1:dice))
        if throw_1 == throw_2
            consec_doubles += 1
        else
            consec_doubles = 0
        end
        if consec_doubles != 3
            move = throw_1 + throw_2
            square = (square + move) % 40 +((square + move) % 40 == 0 ? 40 : 0)
            if board[square] == "G2J"
                square = findfirst(board, "JAIL")
            elseif board[square][1:2] == "CC"
                square, cc_cards = take_cc_card(square, cc_cards)
            elseif board[square][1:2] == "CH"
                square, ch_cards = take_ch_card(square, ch_cards)
                if board[square][1:2] == "CC"
                    square, cc_cards = take_cc_card(square, cc_cards)
                end
            end
        else
            square = findfirst(board, "JAIL")
            consec_doubles = 0
        end
        if i >= start
            result[square] += 1
        end
    end
    result_tuple = Vector{Tuple{Float64, Int}}()
    for i = 1:40
        percent = result[i] * 100 / sum(result)
        push!(result_tuple, (percent, i))
    end
    sort!(result_tuple, lt = (x, y) -> isless(x[1], y[1]), rev=true)
    modal_squares = Vector{String}()
    modal_string = ""
    modal_percents = Vector{Float64}()
    for i = 1:modal
        push!(modal_squares, board[result_tuple[i][2]])
        push!(modal_percents, result_tuple[i][1])
        k = result_tuple[i][2] - 1
        modal_string *= (k < 10 ? ("0" * string(k)) : string(k))
    end
    return modal_string, modal_squares, modal_percents
end

@code_warntype main(1_000_000, 4, 101, 5)

However, when I change the last three arguments to keyword arguments by inserting a semicolon instead of a comma after the first argument...

function main(n::Int; dice::Int=6, start::Int=1, modal::Int=3) ::Tuple{String, Vector{String}, Vector{Float64}}

...I seem to run into a type-stability problem.

@code_warntype main(1_000_000, dice=4, start=101, modal=5)

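When I run @code_warntype, I now get a temporary variable of type ANY and an instance of UNION in the body.

Oddly, this doesn't seem to hurt performance: averaged over three benchmark runs, the "argument" version takes 431.594 ms and the "keyword" version takes 413.149 ms. Still, I'd like to know:

(a) why this happens;

(b) whether, as a general rule, the appearance of a temporary variable of type ANY is a cause for concern; and

(c) whether, from a performance point of view, there is any general advantage to using keywords rather than ordinary function arguments.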

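1 Answer:

Answer 0: (score: 3)

Here is my take on the three questions. Throughout the answer I assume Julia 0.6.3, except at the end of the post, where I state explicitly that I am referring to Julia 0.7.

(a) The code containing the Any variable is the part responsible for handling keyword arguments (for example, making sure that the keyword arguments passed are allowed by the function signature). The reason is that keyword arguments are received inside the function as a Vector{Any}. The vector holds tuples of ([argument name], [argument value]). The actual "work" the function does comes after this part with the Any variable.

You can see this by comparing the two calls: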

@code_warntype main(1_000_000, dice=4, start=101, modal=5)

@code_warntype main(1_000_000)

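(both calls are to the version of the function with keyword arguments). The report for the second call contains only the last line of the report generated by the first call above; everything else in the first report is responsible for handling the passed keyword arguments.

(b) As a general rule it certainly can be a concern, but in this case it cannot be helped. The variable typed as Any holds information about the names of the keyword arguments.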
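To make the general rule in (b) concrete, here is a minimal sketch of an instability that would be worth fixing, as opposed to the keyword-handling Any discussed above. It assumes a current Julia 1.x rather than the 0.6.3 used in this answer, and the function names unstable and stable are made up for illustration.

```julia
# Hypothetical example: the return type depends on the *value* of x,
# so type inference can only narrow the result to a Union.
unstable(x::Int) = x > 0 ? 1 : 1.0

# Hypothetical example: both branches return a Float64.
stable(x::Int) = x > 0 ? 1.0 : -1.0

# Base.return_types gives the inferred return type for each matching method;
# each function here has exactly one method, so take the first entry.
unstable_type = first(Base.return_types(unstable, (Int,)))
stable_type = first(Base.return_types(stable, (Int,)))

println("unstable: ", unstable_type, " (concrete: ", isconcretetype(unstable_type), ")")
println("stable:   ", stable_type, " (concrete: ", isconcretetype(stable_type), ")")
```

A non-concrete type that appears in the body of the function doing the actual work, as with unstable here, is the kind of report that usually deserves attention.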

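(c) In general you can assume that positional arguments are no slower than keyword arguments, and that they can be faster. Here is an MWE (in fact, if you run @code_warntype f(a=10) you will see this Any variable as well):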

julia> using BenchmarkTools

julia> f(;a::Int=1) = a+1
f (generic function with 1 method)

julia> g(a::Int=1) = a+1
g (generic function with 2 methods)

julia> @benchmark f()
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.865 ns (0.00% GC)
  median time:      1.866 ns (0.00% GC)
  mean time:        1.974 ns (0.00% GC)
  maximum time:     14.463 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark f(a=10)
BenchmarkTools.Trial:
  memory estimate:  96 bytes
  allocs estimate:  1
  --------------
  minimum time:     52.994 ns (0.00% GC)
  median time:      54.413 ns (0.00% GC)
  mean time:        65.207 ns (10.65% GC)
  maximum time:     3.466 μs (94.78% GC)
  --------------
  samples:          10000
  evals/sample:     986

julia> @benchmark g()
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.865 ns (0.00% GC)
  median time:      1.866 ns (0.00% GC)
  mean time:        1.954 ns (0.00% GC)
  maximum time:     13.062 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark g(10)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.865 ns (0.00% GC)
  median time:      1.866 ns (0.00% GC)
  mean time:        1.949 ns (0.00% GC)
  maximum time:     13.063 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

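Now you can see that the penalty for keyword arguments is incurred only when an argument is actually passed (which is exactly the case where @code_warntype shows the Any variable, because Julia has to do more work). Note that the penalty is small, and it is visible only in functions that do very little work. For functions that do a lot of computation it can be ignored most of the time.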
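If the per-call cost ever did matter, one common pattern is to keep a thin keyword front end and forward to a positional method that does the work. The following is only a sketch of that idea, not the code from the question; the names simulate and _simulate and the simplified loop body are assumptions made for illustration.

```julia
# Hypothetical inner function: positional, concretely typed arguments,
# so the loop is compiled for known types.
function _simulate(n::Int, dice::Int, start::Int)
    total = 0
    for i in 1:n
        roll = rand(1:dice) + rand(1:dice)
        if i >= start
            total += roll  # only count throws from `start` onwards
        end
    end
    return total
end

# Keyword front end: any keyword-handling cost is paid once per call,
# then the work is handed to the positional method above.
simulate(n::Int; dice::Int=6, start::Int=1) = _simulate(n, dice, start)

total = simulate(1_000; dice=4, start=101)
println(total)
```

Callers keep the readable keyword interface, while the loop itself lives in a method that never sees keyword arguments.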

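Also note that if you do not specify the type of a keyword argument, the penalty is much larger when a keyword argument value is passed explicitly, because Julia does not dispatch on keyword argument types (you can run @code_warntype to see this as well):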

julia> h(;a=1) = a+1
h (generic function with 1 method)

julia> @benchmark h()
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.865 ns (0.00% GC)
  median time:      1.866 ns (0.00% GC)
  mean time:        1.960 ns (0.00% GC)
  maximum time:     13.996 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark h(a=10)
BenchmarkTools.Trial:
  memory estimate:  96 bytes
  allocs estimate:  1
  --------------
  minimum time:     75.433 ns (0.00% GC)
  median time:      77.355 ns (0.00% GC)
  mean time:        89.037 ns (7.87% GC)
  maximum time:     2.128 μs (89.73% GC)
  --------------
  samples:          10000
  evals/sample:     971
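The statement that Julia does not dispatch on keyword argument types can be checked directly. This is a small sketch assuming a current Julia 1.x run as a script; the functions p and k are hypothetical.

```julia
# Two positional methods: dispatch picks one based on the argument type.
p(a::Int) = "Int method"
p(a::Float64) = "Float64 method"

# Two definitions that differ only in the keyword's type annotation.
# Keyword types are not part of the method signature, so the second
# definition replaces the first instead of adding a new method.
k(; a::Int=1) = "Int keyword"
k(; a::Float64=1.0) = "Float64 keyword"

println("methods of p: ", length(methods(p)))
println("methods of k: ", length(methods(k)))
println(k(a=2.0))
```

One would expect p to end up with two methods and k with only one, so a keyword type annotation acts as a check on the passed value rather than as a way to select a method.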

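In Julia 0.7, keyword arguments are received as a Base.Iterators.Pairs wrapping a NamedTuple, so Julia knows the types of the passed arguments at compile time. This means that using keyword arguments is faster than in Julia 0.6.3 (but again, you should not expect them to be faster than positional arguments). You can see this by running similar benchmarks under Julia 0.7 (I only changed what the functions do slightly, to give the Julia compiler a bit more work); you can also look at @code_warntype on these functions to see that type inference works better in Julia 0.7: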