So, suppose we have one block of code that we want to execute 70% of the time and another that we want to execute 30% of the time.
if(Math.random() < 0.7)
70percentmethod();
else
30percentmethod();
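Simple enough. But what if we want it to be easily expandable to, say, 30%/60%/10%? Here you would have to add to and change all the if statements on every change, which is not great to use, slow, and error-prone.
So far I have found large switches to be decently useful for this use case, for example: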
switch(rand(0, 10)){
case 0:
case 1:
case 2:
case 3:
case 4:
case 5:
case 6:
case 7:70percentmethod();break;
case 8:
case 9:
case 10:30percentmethod();break;
}
Which can very easily be changed to:
switch(rand(0, 10)){
case 0:10percentmethod();break;
case 1:
case 2:
case 3:
case 4:
case 5:
case 6:
case 7:60percentmethod();break;
case 8:
case 9:
case 10:30percentmethod();break;
}
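But these have their drawbacks as well, being cumbersome and split into a predetermined number of divisions.
Something ideal would be based on a "frequency number" system, I guess, like so: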
(1,a),(1,b),(2,c) -> 25% a, 25% b, 50% c
and then if you added another one:
(1,a),(1,b),(2,c),(6,d) -> 10% a, 10% b, 20% c, 60% d
So simply add up the numbers, treat the sum as 100%, and split it accordingly.
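The arithmetic behind the frequency numbers can be sketched as follows. This is only an illustration: the class name `FrequencyShares` and the method `toShares` are hypothetical, and the outcomes are plain strings rather than code blocks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FrequencyShares {
    // Converts frequency numbers into fractions of the total (the fractions sum to 1).
    static Map<String, Double> toShares(Map<String, Integer> frequencies) {
        int total = 0;
        for (int f : frequencies.values()) {
            total += f;
        }
        // LinkedHashMap keeps the insertion order of the input entries
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : frequencies.entrySet()) {
            shares.put(e.getKey(), (double) e.getValue() / total);
        }
        return shares;
    }

    public static void main(String[] args) {
        Map<String, Integer> frequencies = new LinkedHashMap<>();
        frequencies.put("a", 1);
        frequencies.put("b", 1);
        frequencies.put("c", 2);
        frequencies.put("d", 6);
        System.out.println(toShares(frequencies)); // prints {a=0.1, b=0.1, c=0.2, d=0.6}
    }
}
```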
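I think it would not be that much trouble to make a handler for this with a customized hashmap or something, but I am wondering whether there is some established way/pattern or lambda for it before I go all spaghetti on this.
Answer 0 (score: 28)
EDIT: See the edit at the end for a more elegant solution. I'll leave this in, though.
You can use a NavigableMap to store these methods mapped to their percentages.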
NavigableMap<Double, Runnable> runnables = new TreeMap<>();
// Java identifiers cannot start with a digit, hence the spelled-out method names
runnables.put(0.3, this::thirtyPercentMethod);
runnables.put(1.0, this::seventyPercentMethod);
public static void runRandomly(Map<Double, Runnable> runnables) {
    double percentage = Math.random();
    // the keys are cumulative upper bounds, visited in ascending order
    for (Map.Entry<Double, Runnable> entry : runnables.entrySet()) {
        if (percentage < entry.getKey()) {
            entry.getValue().run();
            return; // make sure you only call one method
        }
    }
    throw new RuntimeException("map not filled properly for " + percentage);
}
// or, because I'm still practicing streams by using them for everything
public static void runRandomly(Map<Double, Runnable> runnables) {
    double percentage = Math.random();
    runnables.entrySet().stream()
        .filter(e -> percentage < e.getKey())
        .findFirst().orElseThrow(() ->
            new RuntimeException("map not filled properly for " + percentage))
        .getValue()
        .run();
}
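NavigableMap is sorted (HashMap, for example, gives no guarantees about the order of its entries), so you get the entries ordered by their percentages. This is relevant because if you have two items (3,r1),(7,r2), they result in the following entries: r1 = 0.3 and r2 = 1.0, and they need to be evaluated in this order (if they were evaluated in reverse order, the result would always be r2).
As for the splitting, it should go something like this. With a tuple class like this: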
static class Pair<X, Y>
{
public Pair(X f, Y s)
{
first = f;
second = s;
}
public final X first;
public final Y second;
}
you can create a map like this:
// the parameter contains the (1,m1), (1,m2), (3,m3) pairs
private static Map<Double,Runnable> splitToPercentageMap(Collection<Pair<Integer,Runnable>> runnables)
{
// this adds all Runnables to lists of same int value,
// overall those lists are sorted by that int (so least probable first)
double total = 0;
Map<Integer,List<Runnable>> byNumber = new TreeMap<>();
for (Pair<Integer,Runnable> e : runnables)
{
total += e.first;
List<Runnable> list = byNumber.getOrDefault(e.first, new ArrayList<>());
list.add(e.second);
byNumber.put(e.first, list);
}
Map<Double,Runnable> targetList = new TreeMap<>();
double current = 0;
for (Map.Entry<Integer,List<Runnable>> e : byNumber.entrySet())
{
for (Runnable r : e.getValue())
{
double percentage = (double) e.getKey() / total;
current += percentage;
targetList.put(current, r);
}
}
return targetList;
}
All of this is added to a class:
class RandomRunner {
    // Pair is the tuple class defined earlier in this answer
    private List<Pair<Integer, Runnable>> runnables = new ArrayList<>();
    public void add(int value, Runnable toRun) {
        runnables.add(new Pair<>(value, toRun));
    }
    public void remove(Runnable toRemove) {
        for (Iterator<Pair<Integer, Runnable>> r = runnables.iterator();
                r.hasNext(); ) {
            if (toRemove == r.next().second) {
                r.remove();
                break;
            }
        }
    }
    public void runRandomly() {
        // split list, use code from above
    }
}
EDIT:
Actually, the above is what you get if you get an idea stuck in your head and don't question it properly. Keeping the RandomRunner class interface, this is much easier:
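A minimal sketch of what a simpler RandomRunner could look like, offered as an assumption rather than as the answerer's exact code: each Runnable is stored once per unit of weight, so picking a uniformly random list index already yields the weighted choice. This should be fine for small weights; very large weights would make the list grow accordingly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class RandomRunner {
    // Each Runnable appears in the list as many times as its weight.
    private final List<Runnable> runnables = new ArrayList<>();

    void add(int weight, Runnable toRun) {
        for (int i = 0; i < weight; i++) {
            runnables.add(toRun);
        }
    }

    void remove(Runnable toRemove) {
        // Identity comparison removes every stored copy of this Runnable.
        runnables.removeIf(r -> r == toRemove);
    }

    void runRandomly() {
        if (runnables.isEmpty()) {
            return; // nothing registered, nothing to run
        }
        int index = ThreadLocalRandom.current().nextInt(runnables.size());
        runnables.get(index).run();
    }
}
```

With `add(3, r1)` and `add(7, r2)`, the list holds ten entries, so `r1` runs about 30% of the time and `r2` about 70%.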
Answer 1 (score: 25)
All these answers seem quite complicated, so I'll just post the keep-it-simple alternative:
double rnd = Math.random();
if((rnd -= 0.6) < 0)
60percentmethod();
else if ((rnd -= 0.3) < 0)
30percentmethod();
else
10percentmethod();
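You don't need to change any other lines, and one can quite easily see what happens without digging into auxiliary classes. A small downside is that it doesn't enforce that the percentages sum to 100%.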
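If the missing sum check matters, the same subtract-and-compare idea can be wrapped in a small helper that validates its input. This is a sketch under assumptions: the class name `SubtractingPicker`, the method `pick`, and the tolerance of 1e-9 are all hypothetical choices, and the caller supplies the random number so the selection logic stays deterministic.

```java
final class SubtractingPicker {
    // Returns the index chosen for rnd in [0, 1), subtracting each probability in turn.
    static int pick(double[] probabilities, double rnd) {
        double sum = 0;
        for (double p : probabilities) {
            sum += p;
        }
        // reject inputs that do not add up to 100% (within a small tolerance)
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("probabilities sum to " + sum + ", expected 1");
        }
        for (int i = 0; i < probabilities.length; i++) {
            if ((rnd -= probabilities[i]) < 0) {
                return i;
            }
        }
        return probabilities.length - 1; // guards against rounding at the upper end
    }
}
```

A call such as `SubtractingPicker.pick(new double[]{0.6, 0.3, 0.1}, Math.random())` returns 0, 1, or 2, which can then select the method to run.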
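Answer 2 (score: 15)
I'm not sure whether there is a common name for this, but I think I learned it as the wheel of fortune back in university.
It basically works just as you described: it receives a list of values and "frequency numbers", and one value is chosen according to the weighted probabilities.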
list = (1,a),(1,b),(2,c),(6,d)
total = sum of all weights in list
rnd = random(0, total)
sum = 0
for i from 0 to list.size() - 1:
    sum += list[i].weight
    if sum >= rnd:
        return list[i].value
return list.last().value
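If you want to generalize this, the list can be a function parameter.
This also works with floating-point numbers, and the numbers don't have to be normalized. If you normalize (so that they sum to 1, for example), you can skip the list.sum() part.
EDIT:
Due to demand, here is an actual compiling Java implementation and usage example: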
import java.util.ArrayList;
import java.util.Random;
public class RandomWheel<T>
{
private static final class RandomWheelSection<T>
{
public double weight;
public T value;
public RandomWheelSection(double weight, T value)
{
this.weight = weight;
this.value = value;
}
}
private ArrayList<RandomWheelSection<T>> sections = new ArrayList<>();
private double totalWeight = 0;
private Random random = new Random();
public void addWheelSection(double weight, T value)
{
sections.add(new RandomWheelSection<T>(weight, value));
totalWeight += weight;
}
public T draw()
{
double rnd = totalWeight * random.nextDouble();
double sum = 0;
for (int i = 0; i < sections.size(); i++)
{
sum += sections.get(i).weight;
if (sum >= rnd)
return sections.get(i).value;
}
return sections.get(sections.size() - 1).value;
}
public static void main(String[] args)
{
RandomWheel<String> wheel = new RandomWheel<String>();
wheel.addWheelSection(1, "a");
wheel.addWheelSection(1, "b");
wheel.addWheelSection(2, "c");
wheel.addWheelSection(6, "d");
for (int i = 0; i < 100; i++)
System.out.print(wheel.draw());
}
}
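Answer 3 (score: 8)
While the selected answer works, it is unfortunately asymptotically slow for your use case. Instead, you could use something called Alias Sampling. Alias sampling (or the alias method) is a technique for selecting elements from a weighted distribution. If the weights used to choose those elements don't change, you can do each selection in O(1) time! If they do change, you can still get amortized O(1) time as long as the ratio between the number of selections you make and the number of changes you make to the alias table (changing the weights) is high. The currently selected answer suggests an O(N) algorithm; the next best thing is O(log(N)) given sorted probabilities and binary search, but nothing is going to beat the O(1) time that I suggested.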
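For comparison, the O(log(N)) option mentioned above can be sketched with a binary search over cumulative weights. This is an illustration only: the class name `CumulativeSampler` and its methods are hypothetical, and it assumes every weight is strictly positive so that the cumulative array has no duplicate values.

```java
import java.util.Arrays;
import java.util.Random;

final class CumulativeSampler {
    // cumulative[i] = weights[0] + ... + weights[i]; ascending when all weights are positive
    private final double[] cumulative;
    private final Random random = new Random();

    CumulativeSampler(double[] weights) {
        cumulative = new double[weights.length];
        double sum = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i];
            cumulative[i] = sum;
        }
    }

    // Returns the first index whose cumulative weight is strictly greater than point.
    int indexFor(double point) {
        int pos = Arrays.binarySearch(cumulative, point);
        // exact hit at pos: the point belongs to the next interval;
        // otherwise binarySearch returns -(insertionPoint) - 1
        int index = pos >= 0 ? pos + 1 : -pos - 1;
        return Math.min(index, cumulative.length - 1);
    }

    int sample() {
        double total = cumulative[cumulative.length - 1];
        return indexFor(random.nextDouble() * total);
    }
}
```

Building the array costs O(N) once; each `sample()` call then costs O(log(N)).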
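This site provides a good overview of the alias method that is mostly language agnostic. Essentially, you create a table where each entry represents the outcome of two probabilities. Each entry in the table has a single threshold: below the threshold you get one value, above it you get the other. You spread larger probabilities across multiple table entries so that the combined probability graph has an area of one.
Say you have the probabilities A, B, C, and D, with the values 0.1, 0.1, 0.1, and 0.7 respectively. The alias method spreads the probability of 0.7 across all the others. One index corresponds to each probability, where the indices of A, B, and C each hold 0.1 of their own plus 0.15 of D, and D's index holds 0.25. You then normalize each index so that you end up with a 0.4 chance of getting A and a 0.6 chance of getting D in A's index (0.1/(0.1 + 0.15) and 0.15/(0.1 + 0.15) respectively), the same for B's and C's indices, and a 100% chance of getting D in D's index (0.25/0.25 is 1).
Given an unbiased uniform PRNG (Math.random()) for indexing, you get an equal probability of choosing each index, but you also do a coin flip per index, which provides the weighted probability. You have a 25% chance of landing on the A slot, but within it you only have a 40% chance of picking A and a 60% chance of picking D. 0.40 * 0.25 = 0.10, our original probability, and if you add up all of D's probabilities spread throughout the other indices, you get 0.70 again.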
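The numbers in this example can be checked with a few lines of code. The sketch below uses hypothetical names (`AliasTableCheck`, `impliedProbabilities`) and hard-codes the table described above: thresholds of 0.4 for A, B, and C with D as their alias, and a threshold of 1.0 for D.

```java
import java.util.Locale;

final class AliasTableCheck {
    // Overall probability of each outcome implied by an alias table:
    // a slot is picked with probability 1/n, then keeps its own outcome
    // with probability threshold[slot] and otherwise yields alias[slot].
    static double[] impliedProbabilities(double[] threshold, int[] alias) {
        int n = threshold.length;
        double[] result = new double[n];
        for (int slot = 0; slot < n; slot++) {
            result[slot] += threshold[slot] / n;
            result[alias[slot]] += (1.0 - threshold[slot]) / n;
        }
        return result;
    }

    public static void main(String[] args) {
        // Slots A, B, C, D; A, B and C alias to D (index 3).
        double[] threshold = {0.4, 0.4, 0.4, 1.0};
        int[] alias = {3, 3, 3, 3};
        for (double p : impliedProbabilities(threshold, alias)) {
            System.out.println(String.format(Locale.ROOT, "%.2f", p));
        }
        // prints 0.10, 0.10, 0.10 and 0.70 on separate lines
    }
}
```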
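So to make a random selection, you only need to generate a random index from 0 to N - 1 and then do a coin flip. No matter how many items you add, this is very fast and has constant cost. Building the alias table doesn't take many lines of code either: my Python version took 80 lines including import statements and line breaks, and the version presented in the Pandas article was similarly sized (and it's C++).
For your Java implementation, you could map the probabilities and array list indices to the functions you must execute, creating an array of functions that are executed as you index into it, or alternatively you could use function objects that have a method through which you pass in the parameters to execute.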
ArrayList<(YourFunctionObject)> function_list;
// add functions
AliasSampler aliassampler = new AliasSampler(listOfProbabilities);
// somewhere later with some type T and some parameter values.
int index = aliassampler.sampleIndex();
T result = function_list.get(index).apply(parameters);
EDIT:
I've created a Java version of the AliasSampler method, using classes. It uses the sample index method and should be usable as shown above.
import java.util.ArrayList;
import java.util.Collections;
import java.util.Random;
public class AliasSampler {
private ArrayList<Double> binaryProbabilityArray;
private ArrayList<Integer> aliasIndexList;
AliasSampler(ArrayList<Double> probabilities){
// java 8 needed here
// the probabilities must sum to 1; compare with a tolerance because floating-point sums are rarely exact
assert(Math.abs(probabilities.stream().mapToDouble(Double::doubleValue).sum() - 1.0) < 1e-9);
int n = probabilities.size();
// probabilityArray is the list of probabilities, this is the incoming probabilities scaled
// by the number of probabilities. This allows us to figure out which probabilities need to be spread
// to others since they are too large, ie [0.1 0.1 0.1 0.7] = [0.4 0.4 0.4 2.80]
ArrayList<Double> probabilityArray = new ArrayList<Double>();
for(Double probability : probabilities){
probabilityArray.add(probability * n);
}
binaryProbabilityArray = new ArrayList<Double>(Collections.nCopies(n, 0.0));
aliasIndexList = new ArrayList<Integer>(Collections.nCopies(n, 0));
ArrayList<Integer> lessThanOneIndexList = new ArrayList<Integer>();
ArrayList<Integer> greaterThanOneIndexList = new ArrayList<Integer>();
for(int index = 0; index < probabilityArray.size(); index++){
double probability = probabilityArray.get(index);
if(probability < 1.0){
lessThanOneIndexList.add(index);
}
else{
greaterThanOneIndexList.add(index);
}
}
// while we still have indices to check for in each list, we attempt to spread the probability of those larger
// what this ends up doing in our first example is taking greater than one elements (2.80) and removing 0.6,
// and spreading it to different indices, so (((2.80 - 0.6) - 0.6) - 0.6) will equal 1.0, and the rest will
// be 0.4 + 0.6 = 1.0 as well.
while(lessThanOneIndexList.size() != 0 && greaterThanOneIndexList.size() != 0){
//https://stackoverflow.com/questions/16987727/removing-last-object-of-arraylist-in-java
// last element removal is equivalent to pop, java does this in constant time
int lessThanOneIndex = lessThanOneIndexList.remove(lessThanOneIndexList.size() - 1);
int greaterThanOneIndex = greaterThanOneIndexList.remove(greaterThanOneIndexList.size() - 1);
double probabilityLessThanOne = probabilityArray.get(lessThanOneIndex);
binaryProbabilityArray.set(lessThanOneIndex, probabilityLessThanOne);
aliasIndexList.set(lessThanOneIndex, greaterThanOneIndex);
probabilityArray.set(greaterThanOneIndex, probabilityArray.get(greaterThanOneIndex) + probabilityLessThanOne - 1);
if(probabilityArray.get(greaterThanOneIndex) < 1){
lessThanOneIndexList.add(greaterThanOneIndex);
}
else{
greaterThanOneIndexList.add(greaterThanOneIndex);
}
}
//if there are any probabilities left in either index list, they can't be spread across the other
//indicies, so they are set with probability 1.0. They still have the probabilities they should at this step, it works out mathematically.
while(greaterThanOneIndexList.size() != 0){
int greaterThanOneIndex = greaterThanOneIndexList.remove(greaterThanOneIndexList.size() - 1);
binaryProbabilityArray.set(greaterThanOneIndex, 1.0);
}
while(lessThanOneIndexList.size() != 0){
int lessThanOneIndex = lessThanOneIndexList.remove(lessThanOneIndexList.size() - 1);
binaryProbabilityArray.set(lessThanOneIndex, 1.0);
}
}
public int sampleIndex(){
int index = new Random().nextInt(binaryProbabilityArray.size());
double r = Math.random();
if( r < binaryProbabilityArray.get(index)){
return index;
}
else{
return aliasIndexList.get(index);
}
}
}
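Answer 4 (score: 6)
You could compute the cumulative probability for each class, pick a random number from [0; 1) and see where that number falls.
import java.util.Random;
class WeightedRandomPicker {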
private static Random random = new Random();
public static int choose(double[] probabilties) {
double randomVal = random.nextDouble();
double cumulativeProbability = 0;
for (int i = 0; i < probabilties.length; ++i) {
cumulativeProbability += probabilties[i];
if (randomVal < cumulativeProbability) {
return i;
}
}
return probabilties.length - 1; // to account for numerical errors
}
public static void main (String[] args) {
double[] probabilties = new double[]{0.1, 0.1, 0.2, 0.6}; // the final value is optional
for (int i = 0; i < 20; ++i) {
System.out.printf("%d\n", choose(probabilties));
}
}
}
Answer 5 (score: 1)
The following is a bit like @daniu's answer, but it makes use of the methods provided by TreeMap:
private final NavigableMap<Double, Runnable> map = new TreeMap<>();
{
map.put(0.3d, this::branch30Percent);
map.put(1.0d, this::branch70Percent);
}
private final SecureRandom random = new SecureRandom();
private void branch30Percent() {}
private void branch70Percent() {}
public void runRandomly() {
final Runnable value = map.tailMap(random.nextDouble(), true).firstEntry().getValue();
value.run();
}
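This way there is no need to iterate over the map until the matching entry is found; instead, it uses TreeMap's ability to find an entry whose key stands in a specific relation to another key. This only makes a difference if the number of entries in the map is large, however. It does save a few lines of code, though.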
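The same TreeMap lookup can be combined with the "frequency number" idea from the question by using running totals as keys. The following is a sketch under assumptions: the class name `WeightedChooser` and its methods are hypothetical, and it uses `higherEntry` (the entry with the smallest key strictly greater than the given value) instead of `tailMap`.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ThreadLocalRandom;

final class WeightedChooser<T> {
    // Key = running total of the weights up to and including this value.
    private final NavigableMap<Double, T> byUpperBound = new TreeMap<>();
    private double total = 0;

    void add(double weight, T value) {
        if (weight <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        total += weight;
        byUpperBound.put(total, value);
    }

    // Returns the value whose interval [previous key, key) contains point.
    T valueAt(double point) {
        Map.Entry<Double, T> entry = byUpperBound.higherEntry(point);
        if (entry == null) {
            throw new IllegalArgumentException("point out of range: " + point);
        }
        return entry.getValue();
    }

    T choose() {
        // nextDouble() is in [0, 1), so the scaled point is in [0, total)
        return valueAt(ThreadLocalRandom.current().nextDouble() * total);
    }
}
```

Adding (1,a), (1,b), (2,c), (6,d) produces the keys 1, 2, 4, and 10, so `choose()` returns d for 60% of the range. The weights do not need to be normalized, and each lookup is O(log(N)).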
Answer 6 (score: 0)
I'd do something like this:
class RandomMethod {
private final Runnable method;
private final int probability;
RandomMethod(Runnable method, int probability){
this.method = method;
this.probability = probability;
}
public int getProbability() { return probability; }
public void run() { method.run(); }
}
class MethodChooser {
private final List<RandomMethod> methods;
private final int total;
MethodChooser(final List<RandomMethod> methods) {
this.methods = methods;
this.total = methods.stream().collect(
Collectors.summingInt(RandomMethod::getProbability)
);
}
public void chooseMethod() {
final Random random = new Random();
final int choice = random.nextInt(total);
int count = 0;
for (final RandomMethod method : methods)
{
count += method.getProbability();
if (choice < count) {
method.run();
return;
}
}
}
}
Sample usage:
MethodChooser chooser = new MethodChooser(Arrays.asList(
new RandomMethod(Blah::aaa, 1),
new RandomMethod(Blah::bbb, 3),
new RandomMethod(Blah::ccc, 1)
));
IntStream.range(0, 100).forEach(
i -> chooser.chooseMethod()
);