Question

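Consider two 1-dimensional arrays, one containing the items to choose from and the other containing the probability of drawing the corresponding item.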

items = ["a", 2, 5, "h", "hello", 3]
weights = [0.1, 0.1, 0.2, 0.2, 0.1, 0.3]

In Julia, how can one randomly select an item from items, using weights to set the probability of drawing each item?

Answer 1

Use the StatsBase.jl package, i.e.

using Pkg
Pkg.add("StatsBase")  # Only do this once, obviously
using StatsBase
items = ["a", 2, 5, "h", "hello", 3]
weights = [0.1, 0.1, 0.2, 0.2, 0.1, 0.3]
# Weights was called WeightVec in older StatsBase releases
sample(items, Weights(weights))

Or, if you want to draw many samples:

# With replacement (Weights was called WeightVec in older StatsBase releases)
my_samps = sample(items, Weights(weights), 10)
# Without replacement
my_samps = sample(items, Weights(weights), 2, replace=false)

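(Note that in Julia >= 1.0, Weights takes the place of WeightVec.)

You can read more about WeightVec and why it exists in the docs. The sampling algorithms in StatsBase are very efficient and are designed to use different approaches depending on the size of the input.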

Answer 2

Here is a simpler approach that uses only Julia's base library:

sample(items, weights) = items[findfirst(cumsum(weights) .> rand())]

Example:

julia> sample(["a", 2, 5, "h", "hello", 3], [0.1, 0.1, 0.2, 0.2, 0.1, 0.3])
"h"

This is less efficient than StatsBase.jl, but it is fine for small vectors.

Also, if weights is not a normalized vector, you need to use cumsum(weights ./ sum(weights)) instead.
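Putting the normalization step together with the one-liner, a minimal sketch might look like the following. The name weighted_pick is hypothetical (chosen to avoid clashing with StatsBase's sample), and the fallback to the last index when rounding leaves the final cumulative sum just below 1 is an added assumption, not part of the original answer. The loop at the end is only a rough empirical check that the observed frequencies track the weights.

```julia
# Hypothetical helper; the name `weighted_pick` is illustrative only.
function weighted_pick(items, weights)
    length(items) == length(weights) ||
        throw(ArgumentError("items and weights must have the same length"))
    # Normalize so the cumulative sums end at (approximately) 1.
    cdf = cumsum(weights ./ sum(weights))
    r = rand()  # uniform draw in [0, 1)
    # Index of the first cumulative probability that exceeds the draw.
    idx = findfirst(c -> c > r, cdf)
    # Rounding can leave cdf[end] slightly below 1, so fall back to the last index.
    return items[something(idx, lastindex(items))]
end

items = ["a", 2, 5, "h", "hello", 3]
counts = [1, 1, 2, 2, 1, 3]  # unnormalized weights, same ratios as in the question

# Rough empirical check: tally 100_000 draws and print the observed frequencies.
tally = Dict{Any,Int}()
for _ in 1:100_000
    x = weighted_pick(items, counts)
    tally[x] = get(tally, x, 0) + 1
end
for item in items
    println(repr(item), " => ", get(tally, item, 0) / 100_000)
end
```

Each call recomputes the cumulative sum, so for many draws from a large vector the StatsBase.jl route from Answer 1 is likely the better choice.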
