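I haven't done any programming in ten years. I want to get back into it, so I wrote this pointless little program as practice. The easiest way to describe what it does is to show the output of its --help:
./prng_bench --help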
./prng_bench: usage: ./prng_bench $N $B [$T]
This program will generate an N digit base(B) random number until
all N digits are the same.
Once a repeating N digit base(B) number is found, the following statistics are displayed:
-Decimal value of all N digits.
-Time & number of tries taken to randomly find.
Optionally, this process is repeated T times.
When running multiple repititions, averages for all N digit base(B)
numbers are displayed at the end, as well as total time and total tries.
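My "problem" is that when the task is "easy", say a 3 digit base 10 number, and I have it do a large number of passes, the reported "total time" is lower when the output is piped to grep. That is:
command ; command | grep took: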
./prng_bench 3 10 999999 ; ./prng_bench 3 10 999999|grep took
....
Pass# 999999: All 3 base(10) digits = 3 base(10). Time: 0.00005 secs. Tries: 23
It took 191.86701 secs & 99947208 tries to find 999999 repeating 3 digit base(10) numbers.
An average of 0.00019 secs & 99 tries was needed to find each one.
It took 159.32355 secs & 99947208 tries to find 999999 repeating 3 digit base(10) numbers.
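As a sanity check on the "Tries" figures above, here is the expected number of draws per pass. This is a sketch that assumes every digit is drawn uniformly and independently, which rand() % base only approximates:

```latex
P(\text{all } N \text{ digits equal}) = \frac{B}{B^{N}} = B^{1-N},
\qquad
E[\text{tries}] = \frac{1}{P} = B^{N-1}

% For N = 3, B = 10:
E[\text{tries}] = 10^{2} = 100,
\qquad
\frac{99947208}{999999} \approx 99.95
```

The printed average of 99 rather than 100 would be consistent with total_tries/passes being an integer division that truncates 99.95.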
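If I run the same command several times without grep, the times are always very close to each other. I'm using srand(1234) for now, for testing. The code between my start and stop calls to clock_gettime() does not involve any stream operations, which would obviously affect the time. I realize this is an exercise in futility, but I'd like to know why it behaves this way.
Below is the heart of the program. If anybody wants to compile and test it, here is a link to the full source on Dropbox: https://www.dropbox.com/s/bczggar2pqzp9g1/prng_bench.cpp
clock_gettime() requires -lrt.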
for (int pass_num=1; pass_num<=passes; pass_num++) { //Executes $passes # of times.
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &temp_time); //get time
start_time = timetodouble(temp_time); //convert time to double, store as start_time
for(i=1, tries=0; i!=0; tries++) { //loops until 'comparison for' fully completes. counts reps as 'tries'. <------------
for (i=0; i<Ndigits; i++) //Move forward through array. |
results[i]=(rand()%base); //assign random num of base to element (digit). |
/*for (i=0; i<Ndigits; i++) //---Debug Lines--------------- |
std::cout<<" "<<results[i]; //---a LOT of output.---------- |
std::cout << "\n"; //---Comment/decoment to disable/enable.*/ // |
for (i=Ndigits-1; i>0 && results[i]==results[0]; i--); //Move through array, != element breaks & i!=0, new digits drawn. -|
} //If all are equal i will be 0, nested for condition satisfied. -|
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &temp_time); //get time
draw_time = (timetodouble(temp_time) - start_time); //convert time to dbl, subtract start_time, set draw_time to diff.
total_time += draw_time; //add time for this pass to total.
total_tries += tries; //add tries for this pass to total.
/*Formated output for each pass:
Pass# ---: All -- base(--) digits = -- base(10) Time: ----.---- secs. Tries: ----- (LINE) */
std::cout<<"Pass# "<<std::setw(width_pass)<<pass_num<<": All "<<Ndigits<<" base("<<base<<") digits = "
<<std::setw(width_base)<<results[0]<<" base(10). Time: "<<std::setw(width_time)<<draw_time
<<" secs. Tries: "<<tries<<"\n";
}
if(passes==1) return 0; //No need for totals and averages of 1 pass.
/* It took ----.---- secs & ------ tries to find --- repeating -- digit base(--) numbers. (LINE)
An average of ---.---- secs & ---- tries was needed to find each one. (LINE)(LINE) */
std::cout<<"It took "<<total_time<<" secs & "<<total_tries<<" tries to find "
<<passes<<" repeating "<<Ndigits<<" digit base("<<base<<") numbers.\n"
<<"An average of "<<total_time/passes<<" secs & "<<total_tries/passes
<<" tries was needed to find each one. \n\n";
return 0;
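The excerpt calls timetodouble() without showing it. A minimal sketch of what such a helper might look like follows; timespec_to_seconds and process_cpu_seconds are hypothetical names, not the ones in the linked source, and the real implementation may differ:

```cpp
#include <time.h>   // clock_gettime, timespec, CLOCK_PROCESS_CPUTIME_ID (POSIX)

// Hypothetical stand-in for the question's timetodouble():
// converts a timespec (seconds + nanoseconds) to fractional seconds.
double timespec_to_seconds(const timespec& t) {
    return static_cast<double>(t.tv_sec) + static_cast<double>(t.tv_nsec) / 1e9;
}

// Hypothetical convenience wrapper: CPU time consumed by this process so far.
double process_cpu_seconds() {
    timespec t{};
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t);
    return timespec_to_seconds(t);
}
```

A pass would then be timed as the difference between two process_cpu_seconds() calls, mirroring the start_time / draw_time arithmetic in the loop above.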
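Answer 0 (score: 5)
Printing to the screen is very slow compared with writing to a pipe or not printing at all. Piping to grep keeps the program from doing that.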
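Answer 1 (score: 2)
It is not about printing to the screen; it is about the output being a terminal (tty).
The POSIX specification says:
"When opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device."
Linux interprets this to make the FILE * (i.e. stdio) stdout line-buffered when the output is a tty (e.g. your terminal window), and block-buffered otherwise (e.g. your pipe).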
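To see which of the two cases a given run falls into, a program can ask whether its standard output is a terminal. The following is a small sketch using the POSIX isatty() call; is_terminal and report_stdout_mode are hypothetical helper names, and the "by default" wording assumes the glibc behavior described above:

```cpp
#include <cstdio>
#include <unistd.h>   // isatty, STDOUT_FILENO (POSIX)

// Hypothetical helper: true when fd refers to a terminal, which is the case
// in which stdout is line-buffered by default.
bool is_terminal(int fd) {
    return isatty(fd) == 1;
}

// Hypothetical helper: report the likely default mode on stderr, so the
// message stays visible even when stdout is piped into grep.
void report_stdout_mode() {
    std::fprintf(stderr, "stdout is %s\n",
                 is_terminal(STDOUT_FILENO) ? "a tty: line-buffered by default"
                                            : "not a tty: block-buffered by default");
}
```

Calling report_stdout_mode() at the top of main() and running the program once directly and once through a pipe should show both cases.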
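The reason sync_with_stdio makes a difference is that when it is enabled, the C++ cout stream inherits this behavior. When you set it to false, cout is no longer bound by that behavior and thus becomes block-buffered.
Block buffering is faster because it avoids the overhead of flushing the buffer on every newline.
You can verify this further by piping to cat instead of grep. The difference is the pipe itself, not the screen per se.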
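The cost difference between the two buffering modes can be illustrated without touching a real terminal. The sketch below uses a hypothetical CountingBuf stream buffer that only counts how often data would be handed to the operating system; the 4096-byte block size is an assumption chosen for illustration, not a value taken from any particular C library:

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical sink: counts how often buffered data is pushed out,
// standing in for the write() calls a real stdout would make.
class CountingBuf : public std::streambuf {
public:
    explicit CountingBuf(bool flush_on_newline) : line_mode_(flush_on_newline) {}
    std::size_t writes() const { return writes_; }

protected:
    // No put area is set, so every character arrives here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        pending_.push_back(traits_type::to_char_type(ch));
        // Line mode flushes at every newline; block mode only when the block is full.
        if ((line_mode_ && ch == '\n') || pending_.size() >= kBlock)
            flush_pending();
        return ch;
    }
    int sync() override { flush_pending(); return 0; }

private:
    void flush_pending() {
        if (!pending_.empty()) { ++writes_; pending_.clear(); }
    }
    static constexpr std::size_t kBlock = 4096;  // assumed block size
    bool line_mode_;
    std::string pending_;
    std::size_t writes_ = 0;
};

// Emits `lines` short lines through the sink and returns the simulated write count.
std::size_t count_writes(bool flush_on_newline, int lines) {
    CountingBuf buf(flush_on_newline);
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "Pass# " << i << "\n";
    out.flush();
    return buf.writes();
}
```

Comparing count_writes(true, n) with count_writes(false, n) for a large n shows one simulated write per line in line mode against a handful in block mode, which is the overhead the answer refers to.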
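Answer 2 (score: 0)
Thanks Collin & Nemo. I was certain that because I wasn't calling std::cout between the start & stop times, it couldn't have an effect. Not so. I think this is due to optimizations that the compiler performs even with -O0 or 'defaults'.
What I think is happening...? I think that, as Collin suggested, the compiler is trying to be clever about when it writes to the TTY. And, as Nemo pointed out, cout inherits the line-buffered properties of stdio.
I'm able to reduce the effect, but not eliminate it, by using:
std::cout.sync_with_stdio(false);
From my limited reading on this, it should be called before any output operations are done. Here's the source for the no_sync version: https://www.dropbox.com/s/wugo7hxvu9ao8i3/prng_bench_no_sync.cpp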
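As a rough sketch of that ordering (not the code in the linked no_sync source), the call can be wrapped in a small setup function that runs before anything is written. detach_cout_from_stdio and print_passes are hypothetical names used only for illustration:

```cpp
#include <iostream>
#include <ostream>

// Hypothetical setup helper: call it first thing in main(), before any output.
// std::ios_base::sync_with_stdio returns the sync state that was in effect
// before the call, which is true unless it was changed earlier.
bool detach_cout_from_stdio() {
    return std::ios_base::sync_with_stdio(false);
}

// Writes one line per pass using '\n', which (unlike std::endl) does not
// request an explicit flush after every line.
void print_passes(std::ostream& out, int passes) {
    for (int p = 1; p <= passes; ++p)
        out << "Pass# " << p << '\n';
}
```

In main(), detach_cout_from_stdio() would be the first statement, followed later by something like print_passes(std::cout, passes).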
./no_sync 3 10 999999 ; ./no_sync 3 10 999999 | grep took
Compiled with -O0
999999: All 3 base(10) digits = 3 base(10) Time: 0.00004 secs. Tries: 23
It took 166.30801 secs & 99947208 tries to find 999999 repeating 3 digit base(10) numbers.
An average of 0.00017 secs & 99 tries was needed to find each one.
It took 163.72914 secs & 99947208 tries to find 999999 repeating 3 digit base(10) numbers.
Compiled with -O3
999999: All 3 base(10) digits = 3 base(10) Time: 0.00003 secs. Tries: 23
It took 143.23234 secs & 99947208 tries to find 999999 repeating 3 digit base(10) numbers.
An average of 0.00014 secs & 99 tries was needed to find each one.
It took 140.36195 secs & 99947208 tries to find 999999 repeating 3 digit base(10) numbers.
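Specifying not to sync with stdio changed the delta between piped and non-piped runs from about 30 seconds to less than 3. See the original question for the original delta; it was ~191 - ~160.
To test further, I created another version that uses a struct to store the statistics for each pass. This method does all of its output after all passes are complete. I want to stress that this is probably a terrible idea: I'm letting a command-line argument determine the size of a dynamically allocated array of structs containing an int, a double and an unsigned long. I couldn't even run this version with 999,999 passes; I got a segmentation fault. https://www.dropbox.com/s/785ntsm622q9mwd/prng_bench_struct.cpp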
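The deferred-output idea can be sketched as follows. This is not the code from the linked source: PassStats, make_stats and format_report are hypothetical names, and std::vector is swapped in for the hand-allocated array purely for illustration, not as a diagnosis of the segmentation fault:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-pass record: the int, double and unsigned long mentioned above.
struct PassStats {
    int digit;            // the repeated digit that was found
    double secs;          // time taken by this pass
    unsigned long tries;  // draws needed in this pass
};

// Reserve room for every pass up front so the timed loop never reallocates.
std::vector<PassStats> make_stats(std::size_t passes) {
    std::vector<PassStats> stats;
    stats.reserve(passes);
    return stats;
}

// Format all passes in one go, after the timed loops have finished.
std::string format_report(const std::vector<PassStats>& stats) {
    std::ostringstream out;
    unsigned long total_tries = 0;
    for (std::size_t i = 0; i < stats.size(); ++i) {
        out << "Pass# " << (i + 1) << ": digit " << stats[i].digit
            << ", " << stats[i].secs << " secs, " << stats[i].tries << " tries\n";
        total_tries += stats[i].tries;
    }
    out << "Total tries: " << total_tries << '\n';
    return out.str();
}
```

Inside the pass loop, each result would be appended with stats.push_back(...), and the report would be written to std::cout once at the end.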
./struct_prng 3 10 99999 ; ./struct_prng 3 10 99999 | grep took
Pass# 99999: All 3 base(10) digits = 6 base(10) Time: 0.00025 secs. Tries: 193
It took 13.10071 secs & 9970298 tries to find 99999 repeating 3 digit base(10) numbers.
An average of 0.00013 secs & 99 tries was needed to find each one.
It took 13.12466 secs & 9970298 tries to find 99999 repeating 3 digit base(10) numbers.
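What I learned from this is that you can't count on things executing in the order you coded them. In future programs I'll probably use getopt instead of writing my own parse_args function. That way I could suppress extraneous output in high-repetition loops by requiring the user to pass a -v switch when they want to see it.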
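A minimal sketch of what that -v handling could look like with POSIX getopt() follows. The Options struct and parse_options function are hypothetical; the only option assumed is -v, with N, B and T left as positional arguments:

```cpp
#include <unistd.h>   // getopt, optind (POSIX)

// Hypothetical result of option parsing.
struct Options {
    bool verbose = false;  // true when -v was given
    int first_arg = 1;     // index in argv of the first positional argument (N)
};

// Scans argv for -v and reports where the positional arguments start.
Options parse_options(int argc, char* argv[]) {
    Options opts;
    optind = 1;            // restart scanning in case getopt was used before
    int c;
    while ((c = getopt(argc, argv, "v")) != -1) {
        if (c == 'v') opts.verbose = true;
        // Unknown options make getopt return '?'; they are ignored in this sketch.
    }
    opts.first_arg = optind;
    return opts;
}
```

The per-pass output line would then be wrapped in if (opts.verbose), so that a run such as ./prng_bench 3 10 999999 prints only the totals unless -v is given.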
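I hope the further testing is useful to anyone wondering about piping and output in loops. All of the results I've posted were obtained on a RasPi. All of the linked source code is GPL, only because that's the first license I could think of... I really have no self-aggrandizing need for the copyleft provisions of the GPL; I just wanted to be clear that it's free, but without warranty or liability.
Note that all of the linked sources have the call to srand(...) commented out, so all pseudo-random results will be exactly the same.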