We've been measuring the bandwidth of two external HDDs in class using gettimeofday.
Surprisingly, after repeating the measurements (three per execution, three executions), we found that writing a 2500 MB file was faster than writing a smaller one, on both HDDs.
This is our C code. It is called from a Python script to generate some charts.
    //argv[1] = path, argv[2] = size in MB (2500 in this case)
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>
    #include <unistd.h>
    #include <fcntl.h>
    struct timeval tv0;
    struct timeval tv1;
    int main(int argc, char *argv[]){
        unsigned long size=atoi(argv[2])*1000L*1000L;
        int f = open(argv[1], O_CREAT|O_WRONLY|O_TRUNC, 0777);
        char * array = malloc(size);
        gettimeofday(&tv0, 0); //START TIME
        write(f, array, size);
        fdatasync(f);
        close(f);
        gettimeofday(&tv1, 0); // END TIME
        double seconds = (((double)tv1.tv_sec*1000000.0 + (double)tv1.tv_usec) - ((double)tv0.tv_sec*1000000.0 + (double)tv0.tv_usec))/1000000.0;
        printf("%f",seconds);
    }
The teacher didn't know, so I'm asking here: is there a reason why this might happen?
Answer 0 (score: 1)
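Your benchmark is seriously flawed: it assumes that write() will write the full number of bytes it is given, but write() is by no means guaranteed to do so. Any assumption that does not hold can easily invalidate your benchmark results, and the assumption about write() is quite likely to fail in exactly that way.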
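Note in particular that write() returns the number of bytes written, as an ssize_t. ssize_t is a signed integer type whose width is system-dependent. If yours is 32 bits wide, then write() cannot write the whole 2500 MB buffer in a single call, because 2,500,000,000 bytes is more than a signed 32-bit integer can represent (the limit is 2,147,483,647 bytes, slightly more than 2100 MB).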
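To make the point about short writes concrete, here is a minimal sketch of a loop that keeps calling write() until the whole buffer has gone out. The name write_all is hypothetical, not something from your code or from any library. Note also that on Linux a single write() transfers at most 0x7ffff000 bytes (2,147,479,552) even on 64-bit systems, so a 2500 MB request would come back short there regardless of the width of ssize_t.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes have been
 * written or a real error occurs. Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data was written: retry */
            return -1;      /* genuine error; errno describes it */
        }
        done += (size_t)n;  /* possibly a short write: advance and loop again */
    }
    return 0;
}
```

In your benchmark you would call write_all(f, array, size) between the two gettimeofday() calls and check its return value (and that of fdatasync()) before trusting the measured time.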
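In addition, your program assumes that it can successfully allocate a very large block of memory, which may easily turn out not to be the case. If that assumption fails, though, you will probably be rewarded with a crash.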
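Rather than waiting for a crash, the setup can be checked explicitly. The following is only a sketch under my own assumptions: parse_megabytes and alloc_filled are hypothetical names, 1 MB is taken as 1000*1000 bytes as in your code, and filling the buffer with memset() before the timed region is my addition, not something your program does.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a size in MB (1 MB = 1000*1000 bytes) into a
 * byte count. Returns 0 on success, -1 on bad input or overflow. */
static int parse_megabytes(const char *text, size_t *bytes_out)
{
    char *end;
    unsigned long long mb;

    errno = 0;
    mb = strtoull(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;              /* not a clean decimal number */
    if (mb > SIZE_MAX / 1000000ULL)
        return -1;              /* mb * 1000000 would not fit in size_t */
    *bytes_out = (size_t)(mb * 1000000ULL);
    return 0;
}

/* Hypothetical helper: allocate the buffer and fill it with a fixed byte.
 * Returns NULL if the allocation fails. */
static char *alloc_filled(size_t size)
{
    char *buf = malloc(size);

    if (buf == NULL)
        return NULL;            /* allocation failed: report it, do not crash */
    memset(buf, 0xA5, size);    /* assumption: touch the memory before timing */
    return buf;
}
```

In main() you would exit with an error message if either helper fails, so that a failed allocation shows up as a clear diagnostic instead of a crash or a misleading timing.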