ScalaCheck: choosing an integer with a custom probability distribution

Time: 2016-02-15 22:09:27

Tags: scala scalacheck

I want to create a generator in ScalaCheck that produces numbers between 1 and 100, but with a bell-shaped bias toward numbers close to 1.

Gen.choose() distributes numbers uniformly at random between the min and max values:

scala> (1 to 10).flatMap(_ => Gen.choose(1,100).sample).toList.sorted
res14: List[Int] = List(7, 21, 30, 46, 52, 64, 66, 68, 86, 86)

Gen.chooseNum() adds an extra bias toward the upper and lower bounds:

scala> (1 to 10).flatMap(_ => Gen.chooseNum(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 61, 85, 86, 91, 92, 100, 100)

I'd like a choose() function that gives me a result that looks something like this:

scala> (1 to 10).flatMap(_ => choose(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 2, 5, 11, 18, 35, 49, 100)

I see that choose() and chooseNum() take an implicit Choose trait as an argument. Should I use that?

2 answers:

Answer 0: (score: 5)

You could use Gen.frequency():

 val frequencies = List(
   (50000, Gen.choose(1, 9)),   // lower bound is 1, matching the requested range of 1 to 100
   (38209, Gen.choose(10, 19)),
   (27425, Gen.choose(20, 29)),
   (18406, Gen.choose(30, 39)),
   (11507, Gen.choose(40, 49)),
   ( 6681, Gen.choose(50, 59)),
   ( 3593, Gen.choose(60, 69)),
   ( 1786, Gen.choose(70, 79)),
   (  820, Gen.choose(80, 89)),
   (  347, Gen.choose(90, 100))
 )

 (1 to 10).flatMap(_ => Gen.frequency(frequencies:_*).sample).toList
 res209: List[Int] = List(27, 21, 31, 1, 21, 18, 9, 29, 69, 29)

I got the frequencies from https://en.wikipedia.org/wiki/Standard_normal_table#Complementary_cumulative. The code only samples the table, taking every third row (z = 0.0, 0.3, ..., 2.7), but I think you get the idea.
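Rather than hard-coding the table, the weights could also be computed from a function. The following is only a sketch under stated assumptions: the helper name `biasedChoose`, its `buckets` and `sigma` parameters, and the half-Gaussian weight formula are made up for illustration and are not part of ScalaCheck or of the answer above. Only Gen.frequency and Gen.choose are library calls.

```scala
import org.scalacheck.Gen

object BiasedChoose {
  // Hypothetical helper (not a ScalaCheck API): splits [min, max] into buckets
  // and weights each bucket by a half-Gaussian evaluated at the bucket's start.
  def biasedChoose(min: Int, max: Int, buckets: Int = 10, sigma: Double = 0.35): Gen[Int] = {
    require(min <= max && buckets > 0)
    val span  = max - min + 1
    val width = math.max(1, math.ceil(span.toDouble / buckets).toInt)
    val weighted = (min to max by width).map { lo =>
      val hi = math.min(lo + width - 1, max)
      val x  = (lo - min).toDouble / span               // relative position in [0, 1)
      val w  = math.exp(-(x * x) / (2 * sigma * sigma)) // assumed weight curve
      // Gen.frequency takes Int weights; scale up and keep every bucket reachable
      (math.max(1, (w * 100000).round.toInt), Gen.choose(lo, hi))
    }
    Gen.frequency(weighted: _*)
  }
}
```

With this sketch, `(1 to 10).flatMap(_ => BiasedChoose.biasedChoose(1, 100).sample).toList.sorted` should mostly return small values, with an occasional large one. Within each bucket the values are still uniform, so more buckets give a smoother curve.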

Answer 1: (score: 3)

I can't take much credit for this, and will point you to this excellent page: http://www.javamex.com/tutorials/random_numbers/gaussian_distribution_2.shtml

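A lot of this depends on what you mean by a "bell". Your example doesn't show any negative numbers, but the number 1 can't be the center of the bell without producing negative numbers, unless it is a very, very small bell.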

Forgive the mutable loop, but I sometimes use them when I have to reject values while building a collection:

object Test_Stack extends App {

  val r = new java.util.Random()

  val maxBellAttempt = 102
  // Integer division: 102 / 3 = 34. About 99.7% of Gaussian samples fall within 3 standard deviations.
  val stdv = maxBellAttempt / 3

  val collectSize = 100000

  val l = scala.collection.mutable.Buffer[Int]()

  // See the article above: "What are the minimum and maximum values with nextGaussian()?"

  while (l.size < collectSize) {

    // The +1 is the mean (average) offset; it can be any value.
    // abs folds the curve in half. You could remove it, but then you would need to shift the +1 further over.
    val temp = (r.nextGaussian() * stdv + 1).abs.round.toInt

    // Reject values beyond the cutoff rather than clamping them.
    if (temp <= maxBellAttempt) l += temp

  }

  val res = l.toList // immutable copy; l.to[...] does not compile on Scala 2.13
  //println(res.mkString("\n"))
}

Here is the resulting distribution. I just pasted the output into Excel and did a "countif" to show the frequency of each value: [image]
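To use the same folded-Gaussian idea from ScalaCheck instead of a standalone loop, it could be wrapped in a Gen. This is a minimal sketch, not a definitive implementation: `bellNear` is a hypothetical name, the standard deviation of one third of the range is carried over from the loop above, and seeding java.util.Random from a generated Long is my own assumption for keeping the generator driven by ScalaCheck's seed.

```scala
import org.scalacheck.Gen

object GaussianGen {
  // Hypothetical helper (not a ScalaCheck API): a Gaussian folded at `min`,
  // so values cluster near `min` and thin out toward `max`.
  def bellNear(min: Int, max: Int): Gen[Int] = {
    require(min < max)
    val stdv = (max - min) / 3.0
    Gen.choose(Long.MinValue, Long.MaxValue)
      // Assumption: derive each Gaussian sample from a ScalaCheck-generated seed
      .map(seed => new java.util.Random(seed).nextGaussian())
      // abs folds the curve in half; the offset keeps the result at or above min
      .map(g => min + (g * stdv).abs.round.toInt)
      // Retry the rare samples beyond max rather than clamping them
      .retryUntil(_ <= max)
  }
}
```

For example, `(1 to 10).flatMap(_ => GaussianGen.bellNear(1, 100).sample).toList.sorted` should give a list weighted toward the low end. retryUntil is used instead of suchThat so that a rejected sample is regenerated rather than discarded.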