When I run scylla_setup, iotune reports the following results for my disk:
Measuring sequential write bandwidth: 473 MB/s
Measuring sequential read bandwidth: 499 MB/s
Measuring random write IOPS: 1902 IOPS
Measuring random read IOPS: 1999 IOPS
So the IOPS figure is 1900-2000.
When I test with fio instead:
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/dev/sdc1 --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
the result is:
test: (groupid=0, jobs=1): err= 0: pid=11697: Wed Jun 26 08:58:13 2019
read: IOPS=47.6k, BW=186MiB/s (195MB/s)(3070MiB/16521msec)
bw ( KiB/s): min=187240, max=192136, per=100.00%, avg=190278.42, stdev=985.15, samples=33
iops : min=46810, max=48034, avg=47569.61, stdev=246.38, samples=33
write: IOPS=15.9k, BW=62.1MiB/s (65.1MB/s)(1026MiB/16521msec)
bw ( KiB/s): min=62656, max=65072, per=100.00%, avg=63591.52, stdev=590.96, samples=33
iops : min=15664, max=16268, avg=15897.88, stdev=147.74, samples=33
cpu : usr=4.82%, sys=12.81%, ctx=164053, majf=0, minf=23
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwt: total=785920,262656,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=186MiB/s (195MB/s), 186MiB/s-186MiB/s (195MB/s-195MB/s), io=3070MiB (3219MB), run=16521-16521msec
WRITE: bw=62.1MiB/s (65.1MB/s), 62.1MiB/s-62.1MiB/s (65.1MB/s-65.1MB/s), io=1026MiB (1076MB), run=16521-16521msec
Disk stats (read/write):
sdc: ios=780115/260679, merge=0/0, ticks=792798/230409, in_queue=1023170, util=99.47%
Here read IOPS is 46000-48000 and write IOPS is 15000-16000.
Answer 0 (score: 0)
(Note: the asker also raised this as a Scylla GitHub issue: https://github.com/scylladb/scylla/issues/4604)
[Why is] scylla_setup iotune disk IOPS different from fio test data
Different benchmarks, different results:
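In general, it is hard to run benchmarks with different tools and compare the results directly. You need to know what each tool is doing behind the scenes before any comparison becomes meaningful. Looking at IOPS or bandwidth in isolation, without that context, tells you little, because the two are usually traded off against each other (for example, larger blocks raise bandwidth but lower IOPS). Benchmarks are most useful when you run the same tool with the same options to compare two different machines, or to measure the effect of a tuning change on the same machine.
TLDR; this is probably an apples-to-oranges comparison, where the tools are measuring in different contexts.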
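To make the IOPS/bandwidth trade-off concrete, here is a small sketch that only redoes the arithmetic on the fio numbers quoted in the question (average IOPS and --bs=4k). The helper name bandwidth_mib_s is made up for illustration, and the 128 KiB block size at the end is a hypothetical value, not something iotune is known to use.

```python
# Bandwidth is just IOPS multiplied by the size of each I/O.
KIB = 1024
MIB = 1024 * 1024


def bandwidth_mib_s(iops: float, block_size_bytes: int) -> float:
    """Return the bandwidth in MiB/s implied by an IOPS rate and block size."""
    return iops * block_size_bytes / MIB


# Average IOPS taken from the fio output in the question, run with --bs=4k.
read_bw = bandwidth_mib_s(47569.61, 4 * KIB)
write_bw = bandwidth_mib_s(15897.88, 4 * KIB)
print(f"read:  {read_bw:.1f} MiB/s")
print(f"write: {write_bw:.1f} MiB/s")

# Hypothetical: the same read bandwidth delivered in 128 KiB blocks would
# need far fewer I/Os per second, so its IOPS figure would look much lower.
hypothetical_iops = read_bw * MIB / (128 * KIB)
print(f"IOPS at 128 KiB for the same bandwidth: {hypothetical_iops:.0f}")
```

The computed values line up with the 186MiB/s and 62.1MiB/s that fio printed, which shows why an IOPS number means little unless you also know the block size (and the queue depth, read/write mix and so on) behind it.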
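PS: gtod_reduce is a go-faster stripe that very few people actually need. If your hardware cannot do gigabytes per second and you are not seeing the CPU maxed out, then reducing gettimeofday calls is unlikely to change your results in any meaningful way.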
(This question might be a better fit for Server Fault, and would probably get better answers there, because it is not directly about programming.)