Expanding a vector of values by a list of instance counts - Julia

Date: 2019-01-02 18:17:59

Tags: julia expand

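I want to expand a vector of values using a second vector that holds the number of instances of each value. I came up with the code below, which does the job, but this seems like a common use case, so I may be missing something.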

valuelist = ["a","b","d","z"]
numberofinstance = [3,5,1,11]

valuevector = String[]
for i in 1:length(numberofinstance) 
  append!(valuevector , repeat([valuelist[i]], numberofinstance[i])) 
end

1 answer:

Answer 0 (score: 5)

If you are fine with using a package (one that is basically standard library), the function you are looking for is called inverse_rle in StatsBase.jl:

julia> using StatsBase

julia> inverse_rle(valuelist, numberofinstance)
20-element Array{String,1}:
 "a"
 "a"
 "a"
 "b"
 "b"
 "b"
 "b"
 "b"
 "d"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"

julia> using BenchmarkTools

julia> @btime inverse_rle($valuelist, $numberofinstance);
  76.799 ns (1 allocation: 240 bytes)

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)
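The benchmark calls `yoursolution`, which the answer never defines. Presumably it is the loop from the question wrapped in a function. A minimal sketch under that assumption (the body is taken from the question, the name from the benchmark):

```julia
# Assumed definition of `yoursolution`: the question's loop wrapped in a function.
function yoursolution(valuelist, numberofinstance)
    valuevector = String[]
    for i in 1:length(numberofinstance)
        # repeat([x], n) builds a temporary n-element vector, which is then appended
        append!(valuevector, repeat([valuelist[i]], numberofinstance[i]))
    end
    return valuevector
end
```

Every iteration allocates temporary vectors, and `append!` may have to regrow the result, which is probably where most of the reported allocations come from.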

If you want to avoid a package, you could in principle broadcast repeat or ^ (string powering),

vcat(collect.(.^(valuelist, numberofinstance))...)

but I think this is hard to parse, and it is slower than inverse_rle:

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)

julia> @btime vcat(collect.(.^($valuelist, $numberofinstance))...)
  472.615 ns (9 allocations: 800 bytes)
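One detail worth checking with the broadcast version: `collect` splits each repeated string into its characters, so the result holds `Char` elements rather than `String`, and it only lines up with the expected output because every value is a single character. As a sketch, two Base-only alternatives that keep the original element type (the names `expand_fill` and `expand_comprehension` are illustrative, and their performance was not measured here):

```julia
valuelist = ["a", "b", "d", "z"]
numberofinstance = [3, 5, 1, 11]

# The broadcast-and-collect version splits each repeated string into characters,
# so its elements are Char, not String.
chars = vcat(collect.(.^(valuelist, numberofinstance))...)

# fill.(vs, ns) builds one vector per value; reduce(vcat, ...) concatenates them.
expand_fill(vs, ns) = reduce(vcat, fill.(vs, ns))

# A flattened comprehension yields each value n times, with no vector of vectors.
expand_comprehension(vs, ns) = [v for (v, n) in zip(vs, ns) for _ in 1:n]
```

Both helpers work for values of any type, not only strings, since they never go through string repetition.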

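However, since Julia lets you write fast loops, you can easily define your own simple function. The following is much faster than your solution (and as fast as the implementation in StatsBase):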

function multiply(vs, ns)
   # Preallocate the result: one slot per repetition
   r = Vector{String}(undef, sum(ns))
   c = 1  # next write position in r
   @inbounds for i in axes(ns, 1)
       for k in 1:ns[i]
           r[c] = vs[i]
           c += 1
       end
   end
   r
end

Benchmark:

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)

julia> @btime multiply($valuelist, $numberofinstance);
  76.469 ns (1 allocation: 240 bytes)
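Note that `multiply` hard-codes `Vector{String}` and uses `@inbounds` without validating its inputs, so mismatched lengths or a negative count could write out of bounds. A hedged sketch of a more general variant follows; the name `expand` and the input checks are additions for illustration, not part of the original answer, and it has not been benchmarked here:

```julia
# Hypothetical generalization of `multiply`; the name `expand` is illustrative.
function expand(vs::AbstractVector{T}, ns::AbstractVector{<:Integer}) where {T}
    length(vs) == length(ns) ||
        throw(DimensionMismatch("vs and ns must have the same length"))
    all(n -> n >= 0, ns) ||
        throw(ArgumentError("counts must be non-negative"))
    # Preallocate using the element type of the input instead of String
    r = Vector{T}(undef, sum(ns))
    c = 1
    # The checks above guarantee exactly sum(ns) writes, so @inbounds is safe here.
    @inbounds for (v, n) in zip(vs, ns)
        for _ in 1:n
            r[c] = v
            c += 1
        end
    end
    return r
end
```

For example, `expand([1, 2], [2, 1])` gives `[1, 1, 2]`, and calling it with the vectors from the question gives the same 20 elements as `multiply`.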