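I am trying to evaluate the performance of the Rocket core with the existing benchmarks and some other applications.
After running the emulator on the mt-matmul benchmark and looking at mt-matmul.riscv.out, I noticed a lot of stalls. Can someone explain how I can find out what causes them? I would expect this simple loop to be predicted well and to run without many stalls, yet it is really slow on the processor.
See the log below (the regions with stalls are enclosed in `**`):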
C0:1956 [1] pc = [0000000628] W [r 0 = 0000000000001418] [0] R [r 5 = 0000000000001418] R [r 0 = 0000000000000000] inst = [0002b107] fld ft2,0(t0)
C0:1957 [1] pc = [000000062c] W [r 0 = 0000000000007730] [0] R [r16 = 0000000000007730] R [r 0 = 0000000000000000] inst = [00083007] fld ft0,0(a6)
C0:1958 [1] pc = [0000000630] W [r 0 = 0000000000003380] [0] R [r 6 = 0000000000003380] R [r 0 = 0000000000000000] inst = [00033087] fld ft1,0(t1)
** C0:1959 [1] pc = [0000000634] W [r16 = 0000000000007738] [1] R [r16 = 0000000000007730] R [r 8 = 0000000000000013] inst = [00880813] addi a6,a6,8
C0:1960 [0] pc = [0000000634] W [r 0 = 0000000000007738] [0] R [r16 = 0000000000007730] R [r 8 = 0000000000000013] inst = [00880813] addi a6,a6,8
C0:1961 [0] pc = [0000000634] W [r 0 = 0000000000007738] [0] R [r16 = 0000000000007730] R [r 8 = 0000000000000013] inst = [00880813] addi a6,a6,8 **
C0:1962 [1] pc = [0000000638] W [r17 = 0000000000000014] [1] R [r17 = 0000000000000013] R [r 1 = 0000000000000013] inst = [0018889b] addiw a7,a7,1
C0:1963 [1] pc = [000000063c] W [r 0 = 0000000000000001] [0] R [r 1 = 0000000000000013] R [r 2 = 0000000000000013] inst = [0220f043] fmadd.d ft0,ft1,ft2,ft0
C0:1964 [1] pc = [0000000640] W [r 5 = 0000000000001420] [1] R [r 5 = 0000000000001418] R [r 8 = 0000000000000013] inst = [00828293] addi t0,t0,8
C0:1965 [0] pc = [0000000640] W [r 0 = 0000000000001420] [0] R [r 5 = 0000000000001418] R [r 8 = 0000000000000013] inst = [00828293] addi t0,t0,8
C0:1966 [0] pc = [0000000640] W [r 0 = 0000000000001420] [0] R [r 5 = 0000000000001418] R [r 8 = 0000000000000013] inst = [00828293] addi t0,t0,8
C0:1967 [1] pc = [0000000644] W [r 0 = 0000000000007730] [0] R [r16 = 0000000000007738] R [r 0 = 0000000000000000] inst = [fe083c27] fsd ft0,-8(a6)
C0:1968 [1] pc = [0000000648] W [r 0 = 0000000000000001] [0] R [r12 = 0000000000000020] R [r17 = 0000000000000014] inst = [ff1610e3] bne a2,a7,pc - 32
C0:1969 [1] pc = [0000000628] W [r 0 = 0000000000001420] [0] R [r 5 = 0000000000001420] R [r 0 = 0000000000000000] inst = [0002b107] fld ft2,0(t0)
C0:1970 [1] pc = [000000062c] W [r 0 = 0000000000007738] [0] R [r16 = 0000000000007738] R [r 0 = 0000000000000000] inst = [00083007] fld ft0,0(a6)
C0:1971 [1] pc = [0000000630] W [r 0 = 0000000000003380] [0] R [r 6 = 0000000000003380] R [r 0 = 0000000000000000] inst = [00033087] fld ft1,0(t1)
C0:1972 [1] pc = [0000000634] W [r16 = 0000000000007740] [1] R [r16 = 0000000000007738] R [r 8 = 0000000000000017] inst = [00880813] addi a6,a6,8
C0:1973 [0] pc = [0000000634] W [r 0 = 0000000000007740] [0] R [r16 = 0000000000007738] R [r 8 = 0000000000000017] inst = [00880813] addi a6,a6,8
C0:1974 [0] pc = [0000000634] W [r 0 = 0000000000007740] [0] R [r16 = 0000000000007738] R [r 8 = 0000000000000017] inst = [00880813] addi a6,a6,8
C0:1975 [1] pc = [0000000638] W [r17 = 0000000000000015] [1] R [r17 = 0000000000000014] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
** C0:1982 [1] pc = [0000000628] W [r 0 = 0000000000001428] [0] R [r 5 = 0000000000001428] R [r 0 = 0000000000000000] inst = [0002b107] fld ft2,0(t0)
C0:1983 [1] pc = [000000062c] W [r 0 = 0000000000007740] [0] R [r16 = 0000000000007740] R [r 0 = 0000000000000000] inst = [00083007] fld ft0,0(a6)
C0:1984 [1] pc = [0000000630] W [r 0 = 0000000000003380] [0] R [r 6 = 0000000000003380] R [r 0 = 0000000000000000] inst = [00033087] fld ft1,0(t1)
C0:1985 [1] pc = [0000000634] W [r16 = 0000000000007748] [1] R [r16 = 0000000000007740] R [r 8 = 0000000000000017] inst = [00880813] addi a6,a6,8
C0:1986 [0] pc = [0000000634] W [r 0 = 0000000000007748] [0] R [r16 = 0000000000007740] R [r 8 = 0000000000000017] inst = [00880813] addi a6,a6,8
C0:1987 [0] pc = [0000000634] W [r 0 = 0000000000007748] [0] R [r16 = 0000000000007740] R [r 8 = 0000000000000017] inst = [00880813] addi a6,a6,8
C0:1988 [1] pc = [0000000638] W [r17 = 0000000000000016] [1] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1989 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1990 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1991 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1992 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1993 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1994 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1995 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1996 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1997 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1998 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:1999 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2000 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2001 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2002 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2003 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2004 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2005 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2006 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2007 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2008 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2009 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2010 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2011 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2012 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2013 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2014 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2015 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2016 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2017 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2018 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2019 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2020 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1
C0:2021 [0] pc = [0000000638] W [r 0 = 0000000000000016] [0] R [r17 = 0000000000000015] R [r 1 = 0000000000000017] inst = [0018889b] addiw a7,a7,1 **
The assembly code is simple:
5e8: 02b645bb divw a1,a2,a1
5ec: 02a5853b mulw a0,a1,a0
5f0: 00a585bb addw a1,a1,a0
5f4: 06b55e63 ble a1,a0,670 <matmul+0x88>
5f8: 02a6083b mulw a6,a2,a0
5fc: 00361e93 slli t4,a2,0x3
600: 00381813 slli a6,a6,0x3
604: 010787b3 add a5,a5,a6
608: 010686b3 add a3,a3,a6
60c: 04c05863 blez a2,65c <matmul+0x74>
610: 00070e13 mv t3,a4
614: 00068313 mv t1,a3
618: 00000393 li t2,0
61c: 000e0293 mv t0,t3
620: 00078813 mv a6,a5
624: 00000893 li a7,0
628: 0002b107 fld ft2,0(t0)
62c: 00083007 fld ft0,0(a6)
630: 00033087 fld ft1,0(t1) # 3000 <input2_data+0x1c80>
634: 00880813 addi a6,a6,8
638: 0018889b addiw a7,a7,1
63c: 0220f043 fmadd.d ft0,ft1,ft2,ft0
640: 00828293 addi t0,t0,8
644: fe083c27 fsd ft0,-8(a6)
648: ff1610e3 bne a2,a7,628 <matmul+0x40>
64c: 0013839b addiw t2,t2,1
650: 01de0e33 add t3,t3,t4
654: 00830313 addi t1,t1,8
658: fc7612e3 bne a2,t2,61c <matmul+0x34>
65c: 0015051b addiw a0,a0,1
660: 01d787b3 add a5,a5,t4
664: 01d686b3 add a3,a3,t4
668: fab512e3 bne a0,a1,60c <matmul+0x24>
Answer 0 (score: 0)
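Determining the cause of a stall is not easy, especially from the output log alone. That is true of most processors. A processor can stall for many reasons, the most common being cache or TLB misses.
You would need to modify Rocket to add debug statements to the simulation, or even add more performance counters that track these stall events. Either way, it is a fair amount of work.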
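The log cannot say why the pipeline stalled, but it can at least show where the stall cycles pile up, which helps decide where to add debug statements or counters. Below is a minimal Python sketch that tallies stall cycles per pc. It assumes the line layout shown in the question and assumes that the bracketed flag right after the cycle count is 1 when an instruction retires in that cycle and 0 otherwise. The names `count_stalls` and `report` are made up for this illustration. Note that the pc printed on a `[0]` line is presumably the last instruction that reached writeback, so the instruction actually waiting is likely a younger one just after it.

```python
import re
from collections import Counter

# Pattern for one trace line, tolerant of the spacing seen in the question's
# log excerpt. Assumption: the [0]/[1] right after the cycle count is the
# "instruction retired this cycle" flag.
LINE_RE = re.compile(
    r"C(?P<core>\d+):\s*(?P<cycle>\d+)\s*\[(?P<valid>[01])\]\s*"
    r"pc\s*=\s*\[(?P<pc>[0-9a-fA-F]+)\]"
    r".*?inst\s*=\s*\[(?P<inst>[0-9a-fA-F]+)\]\s*(?P<asm>.*)"
)


def count_stalls(lines):
    """Return (Counter: pc -> stall cycles, dict: pc -> disassembly text)."""
    stalls = Counter()
    asm = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip anything that is not a trace line
        pc = int(m.group("pc"), 16)
        asm.setdefault(pc, m.group("asm").strip())
        if m.group("valid") == "0":
            stalls[pc] += 1  # nothing retired in this cycle
    return stalls, asm


def report(stalls, asm, top=10):
    """Format the pcs with the most stall cycles, worst first."""
    return [
        f"{pc:#06x}  {n:6d}  {asm[pc]}"
        for pc, n in stalls.most_common(top)
    ]
```

To use it, pass the open log file to `count_stalls` (for example `count_stalls(open("mt-matmul.riscv.out"))`) and print the rows returned by `report`. In the excerpt above, the long run of `[0]` lines at pc 0x638 would dominate the table, which points at the instructions right after `addiw a7,a7,1` (the `fmadd.d` and the loads feeding it) as the place to look first.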