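I have several .txt files, each containing a large number of integers (about 2.5 million) generated by various RNGs. I want to test these RNGs with the diehard test suite.
The .txt files look like this: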
#==============================================
# generator Park seed = 1
#=============================================
type: d
count: 2500000
numbit: 32
16807
282475249
followed by many more integers. I run diehard on this .txt file with the following command:
dieharder -f randdata.txt -a -g 202
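My question is: is my .txt file correct (especially the first few lines), and why are those lines required? I ask because every .txt file generated by my RNGs (some good, some bad) fails almost every test, and I want to know whether that is because I made a mistake in how I pass the .txt file to diehard, or whether all of my RNGs really are bad.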
Answer 0: (score: 9)
Yes, that input file looks correct. It seems that many dieharder tests fail even on a 10M-number input produced by dieharder's own generator:
$ dieharder -o -f example.input -t 10000000  # Generate an input file
$ head -n 10 example.input
#==================================================================
# generator mt19937 seed = 3423143424
#==================================================================
type: d
count: 10000000
numbit: 32
2310531048
808929469
2423056114
4237891648
$ dieharder -a -g 202 -f example.input
#=============================================================================#
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
#=============================================================================#
   rng_name    |           filename             |rands/second|
     file_input|                   example.input|  2.50e+06  |
#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
# The file file_input was rewound 1 times
   diehard_birthdays|   0|       100|     100|0.07531570|  PASSED
# The file file_input was rewound 11 times
      diehard_operm5|   0|   1000000|     100|0.00000000|  FAILED
# The file file_input was rewound 24 times
  diehard_rank_32x32|   0|     40000|     100|0.00047786|   WEAK
# The file file_input was rewound 30 times
    diehard_rank_6x8|   0|    100000|     100|0.38082242|  PASSED
# The file file_input was rewound 32 times
   diehard_bitstream|   0|   2097152|     100|0.56232583|  PASSED
# The file file_input was rewound 53 times
        diehard_opso|   0|   2097152|     100|0.83072458|  PASSED
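For comparison, here is a minimal Python sketch that writes a file in the same layout as the header shown above. It assumes that "Park" in the question means the Park-Miller minimal standard generator (multiplier 16807, modulus 2^31 - 1), which is consistent with the first two values in the question's file. The function names `park_miller` and `write_dieharder_ascii` are made up for this illustration and are not part of dieharder.

```python
import io

A = 16807        # Park-Miller multiplier (assumed generator, see lead-in)
M = 2**31 - 1    # Park-Miller modulus


def park_miller(seed, count):
    """Return `count` successive outputs of x -> (A * x) mod M."""
    values = []
    x = seed
    for _ in range(count):
        x = (A * x) % M
        values.append(x)
    return values


def write_dieharder_ascii(out, values, name="Park", seed=1, numbit=32):
    """Write comment lines, the type/count/numbit lines, then one integer per line."""
    bar = "#" + "=" * 66
    out.write(f"{bar}\n# generator {name} seed = {seed}\n{bar}\n")
    out.write(f"type: d\ncount: {len(values)}\nnumbit: {numbit}\n")
    for v in values:
        out.write(f"{v}\n")


# Write a tiny 5-value file into memory and show it.
buf = io.StringIO()
write_dieharder_ascii(buf, park_miller(1, 5))
lines = buf.getvalue().splitlines()
print("\n".join(lines))
```

One thing worth checking under this assumption: a generator with modulus 2^31 - 1 never produces a value above 2^31 - 2, so it may be worth confirming against the dieharder documentation whether `numbit: 32` is the right declaration for such a file.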
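I don't know exactly how many samples you need to get "better" results, but failures with only 2.5M numbers seem to be expected.
After some experimenting, it seems the tests start passing at around 120MB of random binary data: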
$ dd if=/dev/urandom of=/tmp/random bs=4096 count=30000
30000+0 records in
30000+0 records out
122880000 bytes transferred in 10.873818 secs (11300538 bytes/sec)
$ du -sh /tmp/random
117M    /tmp/random
$ dieharder -a -g 201 -f /tmp/random
#=============================================================================#
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
#=============================================================================#
   rng_name    |           filename             |rands/second|
 file_input_raw|                     /tmp/random|  1.11e+07  |
#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
   diehard_birthdays|   0|       100|     100|0.71230346|  PASSED
# The file file_input_raw was rewound 3 times
      diehard_operm5|   0|   1000000|     100|0.62093817|  PASSED
# The file file_input_raw was rewound 7 times
  diehard_rank_32x32|   0|     40000|     100|0.02228171|  PASSED
# The file file_input_raw was rewound 9 times
    diehard_rank_6x8|   0|    100000|     100|0.20698623|  PASSED
# The file file_input_raw was rewound 10 times
   diehard_bitstream|   0|   2097152|     100|0.55567887|  PASSED
# The file file_input_raw was rewound 17 times
        diehard_opso|   0|   2097152|     100|0.20799917|  PASSED
At 4 bytes per 32-bit integer, that corresponds to 122,880,000 / 4 = 30,720,000, so about 31M integers.
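The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation; `uint32_count` is a helper name invented here, and the 2,500,000 figure is the `count:` value from the question's file.

```python
def uint32_count(num_bytes):
    """Number of whole 32-bit integers contained in num_bytes of raw data."""
    return num_bytes // 4


raw_bytes = 4096 * 30000               # dd bs=4096 count=30000 -> 122,880,000 bytes
passing_count = uint32_count(raw_bytes)  # 30,720,000 integers
question_count = 2_500_000             # count: line in the question's file

# How many times larger the raw sample is than the question's file.
ratio = passing_count / question_count
print(raw_bytes, passing_count, ratio)
```

So the raw sample above holds roughly 12 times as many integers as the 2.5M-number file from the question.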
Answer 1: (score: 1)
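The reason is that some tests can need a very large amount of data; call it X. If your file holds less than X, you get messages like "The file file_input_raw was rewound 3 times", which means the test needed a file roughly 3 times larger. Instead, the test just runs over your file about 3 times, so there is obviously a lot of repetition and the entropy drops considerably.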
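To make the idea concrete, here is a rough Python sketch of the relationship between the amount of data a test wants and the number of rewinds. It assumes a simple model in which the file is read from the start and rewound once each time it runs out; the real per-test requirement X is not given above, so the 10,000,000 figure below is purely hypothetical, and both function names are invented for this illustration.

```python
import math


def rewinds_needed(values_needed, values_in_file):
    """Rewinds under the simple model: one rewind each time the file is exhausted."""
    passes = math.ceil(values_needed / values_in_file)
    return max(passes - 1, 0)


def fresh_fraction(values_needed, values_in_file):
    """Share of the values consumed by the test that are not repeats."""
    return min(1.0, values_in_file / values_needed)


file_count = 2_500_000        # integers in the question's file
hypothetical_x = 10_000_000   # made-up requirement, for illustration only

n_rewinds = rewinds_needed(hypothetical_x, file_count)
fraction = fresh_fraction(hypothetical_x, file_count)
print(n_rewinds, fraction)
```

Under this model, any rewind at all means part of the test input is repeated data, which is the situation the answer warns about.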