Why does reading the disk more frequently make each read faster on Linux? (QPS 1 vs. 50)

Time: 2012-03-21 03:35:51

Tags: linux io

I benchmarked synchronous read performance on a Linux box with a SATA disk. With a single thread issuing the reads, the odd thing is that the higher rate (QPS = 50) gave an average read time of 12 ms over 300 entries, while the lower rate (QPS = 1) gave 63 ms over the same 300 entries. Is there an explanation for this?

The code and data are as follows:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>

struct Files;   // defined elsewhere in the full program

struct Request{
    unsigned long long len;
    unsigned long long offset;
    int                fd;
};

// Seek to the requested offset and read len bytes into a temporary buffer.
int read_request(Request* request){
    char* buf = (char*)malloc(request->len);
    assert(buf != NULL);

    off_t of = lseek(request->fd, request->offset, SEEK_SET);
    assert(of == (off_t)request->offset);

    ssize_t len = read(request->fd, buf, request->len);
    assert(len == (ssize_t)request->len);
    free(buf);
    return 0;
}




// Issue the requests one at a time, sleeping between them so that the
// overall rate is roughly `qps` requests per second. `f` is unused in this
// excerpt and `mode` is only logged.
int read_with_qps(Request* request, int request_num, Files* f, int mode, int qps){
    int interval = 1000 / qps;   // target gap between requests, in ms
    struct timeval start, end;
    for(int i = 0; i < request_num; i++){
        gettimeofday(&start, NULL);
        int ret = read_request(&request[i]);
        gettimeofday(&end, NULL);
        int time_used = (end.tv_sec - start.tv_sec) * 1000 + (end.tv_usec - start.tv_usec) / 1000;
        fprintf(stderr, "%lld,offset=%llu,len=%llu, read time:%d,ret=%d,mode=%d\n",
                (long long)end.tv_sec, request[i].offset, request[i].len, time_used, ret, mode);
        if(time_used < interval){
            usleep((interval - time_used) * 1000);
        }
    }
    return 0;
}
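
The setup code is not shown; a hypothetical driver along these lines would tie the pieces together (the file name is made up, and the two sample requests are copied from the log below purely for illustration):

#include <fcntl.h>

int main(){
    int fd = open("/path/to/data/file", O_RDONLY);   // hypothetical data file
    assert(fd >= 0);

    // Hypothetical request list; the real offsets/lengths come from the
    // benchmark workload (see the log lines below).
    Request requests[2] = {
        {13186, 1052299215ULL, fd},
        {1612,  2319646140ULL, fd},
    };

    read_with_qps(requests, 2, NULL, 1, 50);   // f is unused in this excerpt
    close(fd);
    return 0;
}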

When QPS = 50, a sample of the output looks like this (times < 4 ms are treated as page-cache hits and ignored when computing the average):

1332233329,offset=1052299215,len=13186, read time:13,ret=0,mode=1
1332233329,offset=2319646140,len=1612, read time:10,ret=0,mode=1
1332233330,offset=1319250005,len=5654, read time:12,ret=0,mode=1
1332233330,offset=2520376009,len=2676, read time:12,ret=0,mode=1
1332233330,offset=2197548522,len=17236, read time:10,ret=0,mode=1
1332233330,offset=1363242083,len=13734, read time:11,ret=0,mode=1
1332233330,offset=4242210521,len=2003, read time:17,ret=0,mode=1
1332233330,offset=1666337117,len=2207, read time:10,ret=0,mode=1
1332233330,offset=797722662,len=5480, read time:18,ret=0,mode=1
1332233330,offset=1129310678,len=2265, read time:10,ret=0,mode=1

With QPS = 1, the same extract:

1332300410,offset=1052299215,len=13186, read time:19,ret=0,mode=1
1332300411,offset=2319646140,len=1612, read time:40,ret=0,mode=1
1332300412,offset=1319250005,len=5654, read time:141,ret=0,mode=1
1332300413,offset=2520376009,len=2676, read time:15,ret=0,mode=1
1332300414,offset=2197548522,len=17236, read time:21,ret=0,mode=1
1332300415,offset=1363242083,len=13734, read time:13,ret=0,mode=1
1332300416,offset=4242210521,len=2003, read time:43,ret=0,mode=1
1332300417,offset=1666337117,len=2207, read time:18,ret=0,mode=1
1332300418,offset=797722662,len=5480, read time:67,ret=0,mode=1
1332300419,offset=1129310678,len=2265, read time:12,ret=0,mode=1

The kernel version is 2.6.18-194.el5 SMP x86_64.

$ cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]

Thanks for your replies.

2 Answers:

Answer 0 (score: 1)

When you issue a bunch of queries at once, the drive firmware can queue them and execute them in an optimized order based on rotational position and head position ("elevator seeking"), so it does not have to pay a full seek time plus rotational delay for every single I/O request.

If you issue the same queries slowly, it gets no such advantage.
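
One way to test that explanation is to keep several reads in flight at once, so the I/O scheduler and the drive's queue actually have something to reorder. The sketch below reuses the Request struct and includes from the question's code; read_in_parallel, drain_requests, the Worker split, and the thread count are illustrative assumptions, not part of the original benchmark:

// Illustrative sketch, not the poster's code: issue the reads from several
// threads so that more than one request is outstanding at a time.
#include <pthread.h>

struct Worker{
    Request* requests;   // slice of the request array handled by one thread
    int      count;
};

static void* drain_requests(void* arg){
    Worker* w = (Worker*)arg;
    for(int i = 0; i < w->count; i++){
        char* buf = (char*)malloc(w->requests[i].len);
        // pread() takes an explicit offset, so threads do not race on lseek()
        ssize_t n = pread(w->requests[i].fd, buf, w->requests[i].len,
                          w->requests[i].offset);
        assert(n == (ssize_t)w->requests[i].len);
        free(buf);
    }
    return NULL;
}

int read_in_parallel(Request* request, int request_num, int nthreads){
    pthread_t* tid = (pthread_t*)malloc(nthreads * sizeof(pthread_t));
    Worker*    w   = (Worker*)malloc(nthreads * sizeof(Worker));
    int per = request_num / nthreads;
    for(int t = 0; t < nthreads; t++){
        w[t].requests = request + t * per;
        w[t].count    = (t == nthreads - 1) ? request_num - t * per : per;
        pthread_create(&tid[t], NULL, drain_requests, &w[t]);
    }
    for(int t = 0; t < nthreads; t++){
        pthread_join(tid[t], NULL);
    }
    free(w);
    free(tid);
    return 0;
}

If the per-read latency keeps dropping as the number of in-flight requests grows, that supports the queueing/reordering explanation.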

Answer 1 (score: 1)

blktrace can tell you for sure, but this is probably caused by plugging. In short, I/O requests may be delayed slightly before being dispatched to the disk; this is beneficial when many requests are coming in and can be merged, but not so much when they arrive slowly.
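
To check, you could trace the block layer while the benchmark runs (as root, assuming the device is sda as in the scheduler listing above):

$ blktrace -d /dev/sda -o - | blkparse -i -

Each event in the parsed trace carries an action code (for example Q = queued, P/U = plug/unplug, D = issued to the driver, C = completed), so comparing the Q-to-D and D-to-C gaps at QPS 1 versus QPS 50 separates time spent waiting in the queue or plug list from time spent in the drive itself.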