Question

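I am trying to write a program that generates random numbers (0 and 1), stores them in an array, and prints them. That part works. The problem is that I need the number 1 to be generated with a probability of 80% and the number 0 with a probability of 20%.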

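I already fill the array with random 1s and 0s using rand()%10. Since the generated random number is between 0 and 10, the logic I used is: if the random number is greater than 5, store it in the array as "1"; if it is less than 5, store it in the array as "0".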

for(i=0;i<=n_gen;i++)               // for allele array
{
     randallele[i]=rand()%10 +1;
     if(randallele[i]>=5)
     {
         randallele[i]=1;
     }
     else
     {
         randallele[i]=0;
     }

}
for(i=0;i<=n_gen;i++)           //prints allele array
{
    printf("Printing the alleles:    %d\n", randallele[i]);
}

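I want the output to be generated according to those probabilities ("1" at 80% and "0" at 20%), rather than just storing evenly distributed random 1s and 0s.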

Thanks

Answer 1

Because

randallele[i]=rand()%10 +1;

produces a number between 1 and 10, doing

if(randallele[i]>=5)
{
    randallele[i]=1;
}
else
{
    randallele[i]=0;
}

gives you 6 possibilities (5..10) of getting 1 and only 4 possibilities (1..4) of getting 0, that is, 60% ones and 40% zeros.

To get 80% ones and 20% zeros, you just need to change it to:

 if(randallele[i]>=3)
 {
     randallele[i]=1;
 }
 else
 {
     randallele[i]=0;
 }

because 3..10 gives 8 possibilities, while 1..2 gives only 2.
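The counting argument can be checked by brute force. The following is only a sketch: count_at_least is a hypothetical helper name, not part of the original answer. It counts how many of the equally likely values in a range pass a ">= threshold" test.

```c
#include <assert.h>

/* Hypothetical helper: counts the integers v in [lo, hi] with
 * v >= threshold. Dividing the result by (hi - lo + 1) gives the
 * probability that a uniform draw from [lo, hi] is mapped to 1. */
int count_at_least(int lo, int hi, int threshold)
{
    assert(lo <= hi);

    int count = 0;
    for (int v = lo; v <= hi; v++) {
        if (v >= threshold)
            count++;
    }
    return count;
}
```

For the range 1..10 produced by rand()%10 + 1, count_at_least(1, 10, 5) is 6 (60% ones) and count_at_least(1, 10, 3) is 8 (80% ones).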

A shorter way to get the same result is:

randallele[i]=rand()%10 +1;
randallele[i] = (randallele[i]>=3);

which finally reduces to

randallele[i] = ((rand()%10) >= 2);
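The same idea extends to other probabilities. As a minimal sketch, assuming a whole-number percentage is precise enough, the hypothetical helper bernoulli_percent below returns 1 with roughly the requested probability. The name and the percent_one parameter are illustrative and not from the original answer, and the slight modulo bias of rand() is ignored.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 with a probability of about
 * percent_one / 100, and 0 otherwise. rand() % 100 yields 0..99,
 * and exactly percent_one of those values are below percent_one. */
int bernoulli_percent(int percent_one)
{
    assert(percent_one >= 0 && percent_one <= 100);
    return (rand() % 100) < percent_one;
}
```

With this helper, randallele[i] = bernoulli_percent(80); would fill the array with about 80% ones.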

A small program to test it:

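#include <math.h>    /* round(); some toolchains need -lm when linking */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int n[2] = {0};    /* n[0] counts the zeros, n[1] counts the ones */

  for (int i = 0; i != 100000; i++)
    n[((rand()%10) >= 2)] += 1;

  /* 100000 samples, so count / 1000.0 is the percentage */
  printf("%d %d => %g%% %g%%\n",
         n[0], n[1], round(n[0] / 1000.0), round(n[1] / 1000.0));

  return 0;
}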

Execution:

20202 79798 => 20% 80%

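Note: to avoid getting the same result on every run, a simple way is to call srand(time(0)); once before the first use of rand() (this needs #include <time.h>).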
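To illustrate why seeding matters: the C standard specifies that calling srand with the same seed makes rand repeat the same sequence, and that a program which never calls srand behaves as if it had called srand(1). The sketch below uses hypothetical helper names (fill_alleles, seed_from_clock) that are not part of the original answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: fills out[0..n-1] with 1 (about 80% of the
 * time) or 0 (about 20% of the time). */
void fill_alleles(int *out, size_t n)
{
    assert(out != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        out[i] = (rand() % 10) >= 2;
}

/* Hypothetical helper: call once, early in main(), so that runs
 * started in different seconds get different sequences. */
void seed_from_clock(void)
{
    srand((unsigned)time(NULL));
}
```

Calling srand(7) and then fill_alleles twice in a row with a fresh srand(7) in between produces two identical arrays, which is exactly the repetition that seed_from_clock() is meant to avoid between runs.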

Answer 2

Here is a version that you can adapt to whatever output format you need:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int uniform_5(void)
/* Returns 0, 1, 2, 3 or 4 with uniform probability.  Call srand() first.
 */
{
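  /* Rerolling if we roll below the remainder of (RAND_MAX + 1)/5 eliminates
   * a slight bias caused by the RAND_MAX + 1 possible values of rand() not
   * being evenly divisible by 5, and samples x from a uniform distribution.
   * The remainder is written as (RAND_MAX % 5 + 1) % 5 so that RAND_MAX + 1
   * is never computed, since it could overflow int.
   */
  const int reject_below = (RAND_MAX % 5 + 1) % 5;
  const int x = rand();
  return (x < reject_below) ? uniform_5() : x % 5;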
}

bool* fill_bernoulli_80( const ptrdiff_t n, bool output[n] )
/* Fills the output array with n boolean values sampled from a Bernoulli
 * distribution with p = 0.8.
 *
 * Call srand() first.
 */
{
  for ( ptrdiff_t i = 0; i < n; ++i ) {
    output[i] = uniform_5() < 4;
  }

  return output;
}

#define NSAMPLES 10000000

int main(void)
{
  static bool samples[NSAMPLES];
  const unsigned random_seed =
    (unsigned)time(NULL)*CLOCKS_PER_SEC + (unsigned)clock();

  srand(random_seed);

  fill_bernoulli_80( NSAMPLES, samples );

  size_t ones = 0;

  for ( ptrdiff_t i = 0; i < NSAMPLES; ++i )
    ones += samples[i];

  printf( "p = %.6f.\n", ones/(double)NSAMPLES );

  return EXIT_SUCCESS;
}

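A few of my quirks are on display here: I prefer ptrdiff_t for loop indices, because unsigned arithmetic leads to overflow or underflow logic errors that are hard to detect (the infamous 1U < -3), and int might be only 32 bits wide even in a 64-bit program.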
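The "1U < -3" surprise can be reproduced in a few lines. This is only a sketch with hypothetical function names; the explicit cast spells out the conversion that C applies implicitly when an unsigned operand is compared with an int operand.

```c
#include <assert.h>
#include <stddef.h>

/* Mixed comparison: the int operand is converted to unsigned first,
 * so -3 wraps around to a very large value (UINT_MAX - 2). */
int less_mixed(unsigned a, int b)
{
    return a < (unsigned)b;   /* what `a < b` does implicitly */
}

/* A signed comparison with ptrdiff_t behaves as arithmetic suggests. */
int less_signed(ptrdiff_t a, ptrdiff_t b)
{
    return a < b;
}

/* Hypothetical self-check documenting both outcomes. */
void check_comparisons(void)
{
    assert(less_mixed(1U, -3) == 1);   /* "1 < -3" comes out true */
    assert(less_signed(1, -3) == 0);   /* false, as expected */
}
```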

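You can see the influence of functional programming in my tail-recursive helper function uniform_5. The bias is not a major problem in this case, but if you took the remainder modulo a large number such as RAND_MAX/2 + 2, you would certainly notice that taking the remainder does not give you a uniform distribution: some numbers would be rolled twice as often as others. The reroll algorithm I use corrects for this.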
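The bias is easy to see on a toy generator. The sketch below assumes a hypothetical generator whose outputs 0..max_value are all equally likely (a stand-in for rand() and RAND_MAX) and counts how often each remainder occurs. The function name and its parameters are illustrative only.

```c
#include <assert.h>

/* Hypothetical illustration: for every value x in [first, max_value],
 * increments counts[x % n]. counts must have room for n entries. */
void count_residues(int first, int max_value, int n, int counts[])
{
    assert(n > 0 && first >= 0 && first <= max_value);

    for (int r = 0; r < n; r++)
        counts[r] = 0;
    for (int x = first; x <= max_value; x++)
        counts[x % n]++;
}
```

With max_value = 7 and n = 5, count_residues(0, 7, 5, counts) gives 2, 2, 2, 1, 1: the remainders 0 to 2 come up twice as often as 3 and 4. Skipping the lowest (7 + 1) % 5 = 3 values, count_residues(3, 7, 5, counts) gives 1, 1, 1, 1, 1.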

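I combine two sources of entropy into the random seed, wall-clock time and CPU time, because there is a good chance that the program will be run twice within the same clock second.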

The default PRNG is often not very good, but you can easily swap in a different one.
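As one possible way to swap generators, the sketch below uses Marsaglia's 32-bit xorshift generator in place of rand(). It is not taken from the original answer: the function names and the choice to pass the state by pointer are assumptions made for illustration, and the tiny modulo bias is ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* One step of Marsaglia's xorshift32 generator (shifts 13, 17, 5).
 * The state must be nonzero; it is updated in place and returned. */
uint32_t xorshift32(uint32_t *state)
{
    assert(*state != 0);

    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Hypothetical replacement for the rand()-based draw: returns 1 about
 * 80% of the time, since four of the five remainders are below 4. */
int bernoulli_80_xorshift(uint32_t *state)
{
    return (xorshift32(state) % 5) < 4;
}
```

The caller owns the state, for example uint32_t state = 2463534242u;, and any nonzero seed (such as one derived from the clock) can be used instead.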

How do I generate random 0s and 1s in C, but with an 80-20 probability?
