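I see very strange behavior when I time the code shown below. It is long, but all it does is matrix multiplication with different ways of passing the arguments. For matmul0 I use explicit-shape arrays, for matmul1 I use assumed-shape arrays, and for matmul2 I use assumed-shape arrays plus pointers inside the subroutine that point to the matrices (don't ask me why I do this, it doesn't matter). The problem is that when I time all three subroutines, I get something like: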
Time explicit shape: 3.712099
Time assumed shape: 12.55620
Time assumed shape + pointer: 3.821299
Now, if I comment out the call to the third subroutine (the one with pointers), I get something like this instead:
Time explicit shape: 3.712099
Time assumed shape: 3.824401
Time assumed shape + pointer: 0.00000
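Why does this happen? And why doesn't it affect the first subroutine as well? I am running on an Intel Core i3 with the Intel compiler and no optimization flags, just ifort test.f90 -fpp (-fpp is needed to preprocess the timer macro). The full code follows.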
#define timer(func, store) call system_clock(start_t, rate); call func; call system_clock(end_t); store = store + real(end_t - start_t)/real(rate);
program test
    interface
        subroutine matmul1(A, B, C)
            real :: A(:,:), B(:,:), C(:,:)
        end subroutine
        subroutine matmul2(A, B, C)
            real, target :: A(:,:), B(:,:), C(:,:)
        end subroutine
    end interface
    real, allocatable, dimension(:,:) :: A, B, C
    integer, parameter :: m = 500, n = 500, o = 500
    integer, parameter :: loops = 100
    integer :: start_t, end_t, rate
    real :: time
    allocate(A(m,n), B(n,o), C(m,o))
    A(:,:) = 1; B(:,:) = 2; C(:,:) = 0
    time = 0
    do i = 1, loops
        timer(matmul0(A, B, C, m, n, o), time)
    end do
    print*, 'Time explicit shape:', time
    time = 0
    do i = 1, loops
        timer(matmul1(A, B, C), time)
    end do
    print*, 'Time assumed shape:', time
    time = 0
    do i = 1, loops
        ! timer(matmul2(A, B, C), time)
    end do
    print*, 'Time assumed shape + pointer:', time
end program
subroutine matmul0(A, B, C, m, n, o)
    integer :: m, n, o
    real :: A(m,n), B(n,o), C(m,o)
    do i = 1, m
        do j = 1, o
            do k = 1, n
                C(i,j) = C(i,j) + A(i,k)*B(k,j)
            end do
        end do
    end do
end subroutine
subroutine matmul1(A, B, C)
    real :: A(:,:), B(:,:), C(:,:)
    do i = 1, size(C,1)
        do j = 1, size(C,2)
            do k = 1, size(A,2)
                C(i,j) = C(i,j) + A(i,k)*B(k,j)
            end do
        end do
    end do
end subroutine
subroutine matmul2(A, B, C)
    real, target :: A(:,:), B(:,:), C(:,:)
    real, pointer :: A0(:,:), B0(:,:), C0(:,:)
    A0 => A; B0 => B; C0 => C
    do i = 1, size(C,1)
        do j = 1, size(C,2)
            do k = 1, size(A,2)
                C0(i,j) = C0(i,j) + A0(i,k)*B0(k,j)
            end do
        end do
    end do
end subroutine
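For reference, after preprocessing, each timer(...) line in the program above expands to roughly the sequence below. This is only a minimal sketch of the expansion: the subroutine work is a hypothetical stand-in for the matmul calls, and the program name is made up for illustration.

```fortran
program timer_expansion
    implicit none
    integer :: start_t, end_t, rate
    real :: time

    time = 0
    ! What timer(work(), time) turns into after the preprocessor runs:
    call system_clock(start_t, rate)   ! starting tick count and ticks per second
    call work()                        ! hypothetical stand-in for matmul1(A, B, C)
    call system_clock(end_t)           ! ending tick count
    time = time + real(end_t - start_t)/real(rate)

    print *, 'elapsed seconds:', time
contains
    ! Hypothetical workload, only here so that there is something to time.
    subroutine work()
        integer :: k
        real :: s
        s = 0
        do k = 1, 1000000
            s = s + sqrt(real(k))
        end do
        print *, 'checksum:', s
    end subroutine work
end program timer_expansion
```

Note that the macro accumulates into its second argument, which is why time is reset to 0 before each timing loop.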
Answer 0 (score: 1)
This looks like an issue with ifort and the size() function calls inside the loops...
On my machine, your code produces:
Time explicit shape: 3.145800
Time assumed shape: 14.98891
Time assumed shape + pointer: 14.82460
I rewrote the subroutines so that the array extents are queried once and then reused:
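subroutine matmul1(A, B, C)
    real,intent(in) :: A(:,:), B(:,:)
    ! C is read as well as written (it accumulates), so it must be intent(inout)
    real,intent(inout) :: C(:,:)
    integer :: m, n, o
    ! Query the extents once instead of calling size() in the loop headers
    m = size(C,1)
    n = size(C,2)
    o = size(A,2)
    do i = 1, m
        do j = 1, n
            do k = 1, o
                C(i,j) = C(i,j) + A(i,k)*B(k,j)
            end do
        end do
    end do
end subroutine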
[matmul2 was adjusted in the same way...]
Now I get:
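As a rough illustration, the same adjustment applied to matmul2 might look like the sketch below. It keeps the declarations from the question and only moves the size() calls out of the loop headers. The module wrapper, the small driver program, implicit none, and the explicit declarations of i, j, k are assumptions added so the example is self-contained; they are not part of the original code.

```fortran
! Sketch only: matmul2 with the extents hoisted out of the loop headers.
module matmul2_sketch
    implicit none
contains
    subroutine matmul2(A, B, C)
        real, target :: A(:,:), B(:,:), C(:,:)
        real, pointer :: A0(:,:), B0(:,:), C0(:,:)
        integer :: i, j, k, m, n, o

        A0 => A; B0 => B; C0 => C

        ! Query the extents once, as in the rewritten matmul1
        m = size(C,1)
        n = size(C,2)
        o = size(A,2)

        do i = 1, m
            do j = 1, n
                do k = 1, o
                    C0(i,j) = C0(i,j) + A0(i,k)*B0(k,j)
                end do
            end do
        end do
    end subroutine matmul2
end module matmul2_sketch

! Hypothetical driver, small enough to check the result by hand.
program check_matmul2
    use matmul2_sketch
    implicit none
    real, allocatable :: A(:,:), B(:,:), C(:,:)

    allocate(A(2,3), B(3,2), C(2,2))
    A = 1; B = 2; C = 0
    call matmul2(A, B, C)
    print *, C   ! each element is the sum over 3 terms of 1*2, i.e. 6
end program check_matmul2
```

With the subroutine in a module, the explicit interface block in the main program is no longer needed, since use provides the interface.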
Time explicit shape: 3.159100
Time assumed shape: 3.009802
Time assumed shape + pointer: 3.567401
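Interestingly, gfortran does not seem to have this problem.
Commenting out the call to the third subroutine made no difference with either compiler on my machine, so I can't help you with that part :(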