Question

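I have recorded how often certain letters occur in a set of strings, and now I want to generate random strings with (roughly) the same letter composition. I am using the following Perl code to do this.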

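# Observed letter frequencies; they are assumed to sum to 1.
my $probabilities =
{
  A => 0.2790114613,
  B => 0.1880372493,
  C => 0.2285100287,
  D => 0.3044412607,
};

my $sequence = "";

# $length is the desired length of the random string, set elsewhere.
while(length($sequence) < $length)
{
  my $rand = rand;  # uniform random number in [0, 1)
  my $test = 0;     # running cumulative probability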

  $test += $probabilities->{ A };
  if($rand < $test)
  {
    $sequence .= "A";
    next;
  }
  $test += $probabilities->{ B };
  if($rand < $test)
  {
    $sequence .= "B";
    next;
  }
  $test += $probabilities->{ C };
  if($rand < $test)
  {
    $sequence .= "C";
    next;
  }
  $sequence .= "D";
}

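Is there a better way to do this? How do I handle the case where I don't know in advance how many letters need to be considered? We can safely assume that the probabilities of all letters sum to 1.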

Answer 1

Check out List::Util::WeightedChoice.

Answer 2

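If you only care about accuracy to a reasonable number of decimal places, one approach is to build a string that contains all the letters at the correct relative frequencies: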

my $sample = "";

while (my ($letter, $freq) = each %$probabilities) {
    $sample .= $letter x ($freq * 1000);
}

Then pick letters at random from that string:

while (length($sequence) < $length) {
    $sequence .= substr($sample, rand length $sample, 1);
}

Replace 1000 with a larger number for greater accuracy.

Answer 3

Use a loop to handle the case where you don't know how many letters there are :)

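The suggested module basically builds an array of end weights, one per option (the same number that $test holds when you reach the $rand < $test check for a given choice), and iterates over it.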
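The loop idea could be sketched roughly as follows. This is only an illustration, not the module's actual code: pick_letter and random_string are hypothetical helper names, and the sketch assumes a non-empty hash whose probabilities sum to 1, as stated in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: draw one letter from a hash ref of probabilities.
# Assumes the probabilities sum to (approximately) 1.
sub pick_letter {
    my ($probabilities) = @_;
    die "empty probability table\n" unless %$probabilities;

    my $rand = rand;   # uniform random number in [0, 1)
    my $test = 0;      # running cumulative probability ("end weight")
    my $last;

    for my $letter (sort keys %$probabilities) {
        $test += $probabilities->{$letter};
        return $letter if $rand < $test;
        $last = $letter;
    }

    # Floating-point rounding can leave the total slightly below 1;
    # fall back to the final letter in that case.
    return $last;
}

# Hypothetical helper: build a random string of the requested length.
sub random_string {
    my ($probabilities, $length) = @_;
    my $sequence = "";
    $sequence .= pick_letter($probabilities) while length($sequence) < $length;
    return $sequence;
}

my %probabilities = (
    A => 0.2790114613,
    B => 0.1880372493,
    C => 0.2285100287,
    D => 0.3044412607,
);

print random_string(\%probabilities, 20), "\n";
```

Because the loop walks over whatever keys the hash contains, the number of letters does not need to be known in advance. Sorting the keys is not required for correctness; it only makes the iteration order stable.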

Drawing characters with given probabilities in Perl
