How to correctly interpret IMB benchmark results

Time: 2017-10-16 18:29:42

Tags: mpi infiniband

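Hello, I am currently working with InfiniBand and using the IMB benchmark to test its performance. I am running the parallel transfer tests and would like to know whether the results really reflect the parallel performance of 8 processes.

The explanation of the results is too vague for me to understand. Because the results mention "( 6 additional processes waiting in MPI_Barrier)", I suspect that each test only runs 2 processes?

The throughput and t_avg[usec] columns seem to show plausible values, but I need to make sure I am reading them correctly.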

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 8
#-----------------------------------------------------------------------------

Does this block mean that I am running 8 processes in parallel?

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------

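And does this block mean that 4 processes are running in parallel?

I would greatly appreciate help from anyone familiar with the IMB benchmark.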

The full results follow:
# np - 8
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date                  : Mon Oct 16 14:14:20 2017
# Machine               : x86_64
# System                : Linux
# Release               : 4.4.0-96-generic
# Version               : #119-Ubuntu SMP Tue Sep 12 14:59:54 UTC 2017
# MPI Version           : 3.0
# MPI Thread Environment:


# Calling sequence was:

# ./IMB-MPI1 Sendrecv Exchange

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# Sendrecv
# Exchange

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        13.85        13.85        13.85         0.00
            1         1000        12.22        12.22        12.22         0.16
            2         1000        10.08        10.08        10.08         0.40
            4         1000         9.43         9.43         9.43         0.85
            8         1000         8.89         8.91         8.90         1.80
           16         1000         8.70         8.71         8.71         3.67
           32         1000         9.00         9.00         9.00         7.11
           64         1000         8.82         8.82         8.82        14.51
          128         1000         8.90         8.90         8.90        28.77
          256         1000         8.98         8.98         8.98        56.99
          512         1000         9.78         9.78         9.78       104.75
         1024         1000        12.65        12.65        12.65       161.91
         2048         1000        18.31        18.32        18.31       223.63
         4096         1000        20.05        20.05        20.05       408.52
         8192         1000        21.15        21.16        21.16       774.11
        16384         1000        27.46        27.47        27.46      1193.05
        32768         1000        36.93        36.94        36.93      1774.31
        65536          640        60.56        60.59        60.57      2163.39
       131072          320       117.62       117.63       117.63      2228.57
       262144          160       202.67       202.68       202.67      2586.78
       524288           80       323.86       324.28       324.07      3233.56
      1048576           40       615.05       615.47       615.26      3407.42
      2097152           20      1214.74      1216.89      1215.82      3446.74
      4194304           10      2471.83      2488.45      2480.14      3371.02

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        11.14        11.15        11.15         0.00
            1         1000        11.16        11.16        11.16         0.18
            2         1000        11.11        11.12        11.12         0.36
            4         1000        11.10        11.11        11.10         0.72
            8         1000        11.03        11.04        11.03         1.45
           16         1000        11.21        11.22        11.22         2.85
           32         1000        11.81        11.81        11.81         5.42
           64         1000        11.58        11.58        11.58        11.05
          128         1000        11.77        11.78        11.78        21.72
          256         1000        11.88        11.89        11.89        43.05
          512         1000        13.03        13.03        13.03        78.57
         1024         1000        14.73        14.74        14.74       138.92
         2048         1000        19.37        19.39        19.38       211.24
         4096         1000        21.31        21.34        21.33       383.96
         8192         1000        26.19        26.22        26.20       624.84
        16384         1000        32.65        32.69        32.67      1002.26
        32768         1000        48.71        48.78        48.75      1343.52
        65536          640        75.14        75.22        75.18      1742.63
       131072          320       174.66       175.15       174.94      1496.65
       262144          160       301.22       302.02       301.44      1735.95
       524288           80       539.40       542.68       540.78      1932.21
      1048576           40      1015.45      1026.34      1020.59      2043.32
      2097152           20      1959.53      1985.57      1971.34      2112.39
      4194304           10      3549.00      3641.61      3590.76      2303.55

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 8
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        12.81        12.83        12.82         0.00
            1         1000        12.82        12.84        12.83         0.16
            2         1000        12.73        12.75        12.74         0.31
            4         1000        12.82        12.85        12.84         0.62
            8         1000        12.87        12.88        12.87         1.24
           16         1000        12.83        12.86        12.84         2.49
           32         1000        13.25        13.28        13.26         4.82
           64         1000        13.44        13.46        13.45         9.51
          128         1000        13.49        13.51        13.50        18.94
          256         1000        13.72        13.74        13.73        37.27
          512         1000        13.69        13.71        13.70        74.72
         1024         1000        15.73        15.75        15.74       130.07
         2048         1000        20.72        20.76        20.74       197.28
         4096         1000        22.68        22.74        22.72       360.28
         8192         1000        29.48        29.52        29.50       555.04
        16384         1000        39.89        39.95        39.92       820.31
        32768         1000        57.38        57.48        57.43      1140.24
        65536          640        95.23        95.34        95.29      1374.78
       131072          320       214.61       215.16       214.83      1218.38
       262144          160       365.75       368.39       367.28      1423.18
       524288           80       679.82       687.10       683.13      1526.08
      1048576           40      1277.18      1309.22      1295.65      1601.83
      2097152           20      2292.99      2377.56      2339.35      1764.12
      4194304           10      4617.95      4919.67      4778.37      1705.12

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        12.41        12.42        12.42         0.00
            1         1000        12.47        12.48        12.47         0.32
            2         1000        11.93        11.94        11.94         0.67
            4         1000        11.95        11.96        11.95         1.34
            8         1000        11.91        11.92        11.92         2.69
           16         1000        11.97        11.98        11.97         5.34
           32         1000        12.80        12.81        12.80        10.00
           64         1000        12.84        12.84        12.84        19.93
          128         1000        12.90        12.91        12.91        39.67
          256         1000        12.90        12.91        12.91        79.34
          512         1000        14.04        14.04        14.04       145.82
         1024         1000        17.13        17.14        17.13       239.02
         2048         1000        21.06        21.06        21.06       389.05
         4096         1000        23.32        23.33        23.32       702.41
         8192         1000        28.07        28.07        28.07      1167.45
        16384         1000        37.81        37.82        37.82      1732.64
        32768         1000        55.23        55.24        55.24      2372.75
        65536          640       101.04       101.06       101.05      2593.84
       131072          320       212.88       212.88       212.88      2462.84
       262144          160       362.37       362.38       362.37      2893.62
       524288           80       668.88       668.89       668.88      3135.26
      1048576           40      1286.48      1287.81      1287.15      3256.92
      2097152           20      2463.56      2464.13      2463.84      3404.29
      4194304           10      4845.24      4854.75      4849.99      3455.83

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        16.46        16.46        16.46         0.00
            1         1000        16.42        16.43        16.42         0.24
            2         1000        16.17        16.17        16.17         0.49
            4         1000        16.17        16.17        16.17         0.99
            8         1000        16.19        16.20        16.20         1.98
           16         1000        16.21        16.22        16.22         3.94
           32         1000        17.20        17.21        17.20         7.44
           64         1000        17.09        17.10        17.10        14.97
          128         1000        17.24        17.25        17.25        29.68
          256         1000        17.40        17.41        17.40        58.83
          512         1000        17.59        17.61        17.60       116.32
         1024         1000        21.43        21.45        21.44       190.95
         2048         1000        29.49        29.50        29.49       277.71
         4096         1000        31.63        31.66        31.64       517.58
         8192         1000        36.70        36.72        36.71       892.41
        16384         1000        49.50        49.53        49.52      1323.07
        32768         1000        68.35        68.36        68.36      1917.38
        65536          640       108.80       108.85       108.82      2408.31
       131072          320       314.38       314.72       314.56      1665.91
       262144          160       521.71       522.24       521.94      2007.84
       524288           80       930.03       933.47       931.82      2246.62
      1048576           40      1729.81      1738.30      1734.66      2412.87
      2097152           20      3384.33      3414.99      3403.61      2456.41
      4194304           10      6972.50      7058.12      7028.16      2377.01

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 8
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        18.91        18.93        18.92         0.00
            1         1000        19.06        19.08        19.07         0.21
            2         1000        18.91        18.92        18.92         0.42
            4         1000        19.07        19.09        19.08         0.84
            8         1000        18.81        18.83        18.82         1.70
           16         1000        19.02        19.03        19.03         3.36
           32         1000        19.85        19.85        19.85         6.45
           64         1000        19.76        19.78        19.77        12.94
          128         1000        19.94        19.96        19.95        25.65
          256         1000        20.16        20.18        20.17        50.75
          512         1000        20.50        20.51        20.50        99.86
         1024         1000        24.52        24.55        24.54       166.83
         2048         1000        36.35        36.39        36.37       225.14
         4096         1000        38.77        38.81        38.79       422.20
         8192         1000        44.79        44.82        44.81       731.12
        16384         1000        59.28        59.33        59.31      1104.68
        32768         1000        86.39        86.47        86.42      1515.87
        65536          640       142.47       142.60       142.53      1838.29
       131072          320       402.11       402.98       402.57      1301.04
       262144          160       648.90       650.30       649.68      1612.44
       524288           80      1209.17      1213.71      1211.74      1727.89
      1048576           40      2332.69      2355.17      2344.35      1780.89
      2097152           20      4686.88      4767.48      4733.77      1759.55
      4194304           10      9457.18      9674.69      9567.31      1734.13


# All processes entering MPI_Finalize
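
As a cross-check on how to read the Mbytes/sec column, the sketch below recomputes it from #bytes and t_max[usec] for a few rows of the output above. The formula is an assumption, not something the output states: each process is taken to move 2 messages per sample in Sendrecv and 4 in Exchange, with 1 Mbyte = 10^6 bytes. The helper name `mbytes_per_sec` is made up for this illustration.

```python
# Assumed model (not stated in the output itself): per-process throughput is
# messages_per_sample * bytes / t_max, with 1 Mbyte = 1e6 bytes.
MESSAGES_PER_SAMPLE = {"Sendrecv": 2, "Exchange": 4}


def mbytes_per_sec(benchmark, nbytes, t_max_usec):
    """Bytes per microsecond equals 1e6 bytes per second, i.e. Mbytes/sec."""
    return MESSAGES_PER_SAMPLE[benchmark] * nbytes / t_max_usec


# (benchmark, #processes, #bytes, t_max[usec], Mbytes/sec as printed above)
rows = [
    ("Sendrecv", 2, 4194304, 2488.45, 3371.02),
    ("Sendrecv", 8, 4194304, 4919.67, 1705.12),
    ("Exchange", 2, 4194304, 4854.75, 3455.83),
    ("Exchange", 8, 4194304, 9674.69, 1734.13),
]

for benchmark, nprocs, nbytes, t_max, printed in rows:
    computed = mbytes_per_sec(benchmark, nbytes, t_max)
    print(f"{benchmark:8s} np={nprocs}: computed {computed:8.2f}, printed {printed:8.2f}")
```

If this reading is right, the column is a per-process figure based on the slowest process (t_max), so a lower value at 8 processes than at 2 means each process gets less bandwidth when all eight communicate at once.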

1 answer:

Answer 0 (score: 1)

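A single IMB run benchmarks

  • various communication patterns (here Sendrecv and Exchange; note that Exchange is an IMB benchmark name, not an MPI routine)
  • various message sizes (here from 0 to 4 MB)
  • various communicator sizes (here 2, 4 and 8)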
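
The three dimensions above can be enumerated in a few lines. This is only a sketch of the sweep as it appears in the posted output (two benchmarks, three group sizes, message lengths of 0 bytes and then powers of two up to 4194304 bytes), not IMB's actual scheduling code:

```python
from itertools import product

benchmarks = ["Sendrecv", "Exchange"]   # from the calling sequence
group_sizes = [2, 4, 8]                 # the "#processes" values in the headers
# 0 bytes, then powers of two from 1 byte up to the 4194304-byte maximum
message_sizes = [0] + [2 ** k for k in range(23)]

# One result row per (benchmark, group size, message size) combination
sweep = list(product(benchmarks, group_sizes, message_sizes))

print(len(message_sizes), "message sizes per table")
print(len(sweep), "result rows in total")
```

Each table in the output corresponds to one (benchmark, group size) pair, and each row within it to one message size.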

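Since mpirun is invoked only once, with -np 8, 8 MPI tasks are created. So when a communicator of size 2 is being tested, an additional communicator of size 6 is created under the hood, and its 6 MPI tasks simply wait in MPI_Barrier. Hence the message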

# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
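
The split described above can be modelled without MPI. The sketch below assumes, purely for illustration, that the lowest-numbered ranks form the active group; `split_ranks` and `header` are hypothetical helpers, not IMB functions:

```python
def split_ranks(total_ranks, group_size):
    """Return (active, waiting) rank lists for one benchmark pass.

    Assumption for illustration: the lowest-numbered ranks are the active ones.
    """
    ranks = list(range(total_ranks))
    return ranks[:group_size], ranks[group_size:]


def header(benchmark, total_ranks, group_size):
    """Rebuild the header lines that IMB prints for one pass."""
    active, waiting = split_ranks(total_ranks, group_size)
    lines = [f"# Benchmarking {benchmark}", f"# #processes = {len(active)}"]
    if waiting:  # the line is omitted when every rank takes part
        lines.append(
            f"# ( {len(waiting)} additional processes waiting in MPI_Barrier)"
        )
    return "\n".join(lines)


for size in (2, 4, 8):
    print(header("Sendrecv", 8, size))
    print()
```

By the same logic, the "#processes = 8" sections carry no "waiting" line because all 8 tasks take part, so those tables are the ones that reflect all eight processes communicating in parallel.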