Strange Fortran behavior when timing subroutines

Date: 2014-08-17 20:57:38

Tags: performance fortran behavior intel-fortran

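I see very strange behavior when I time the code shown below. It is long, but all it does is matrix multiplication with different ways of passing the arguments. For matmul0 I use explicit-shape arrays, for matmul1 I use assumed-shape arrays, and for matmul2 I use assumed-shape arrays plus pointers inside the subroutine that point to the matrices (don't ask why I do that, it doesn't matter here). The problem is that when I time the three subroutines I get something like: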

Time explicit shape: 3.712099
Time assumed shape: 12.55620
Time assumed shape + pointer: 3.821299

Now, if I comment out the call to the third subroutine (the one with pointers), I get something like this instead:

Time explicit shape: 3.712099
Time assumed shape: 3.824401
Time assumed shape + pointer: 0.00000

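Why does this happen? And why doesn't it affect the first subroutine as well? I am running on an Intel Core i3 with the Intel compiler and no optimization flags, just ifort test.f90 -fpp (-fpp is needed to preprocess the timer macro). The full code follows.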

#define timer(func, store) call system_clock(start_t, rate); call func; call system_clock(end_t); store  = store + real(end_t - start_t)/real(rate);
program test
    interface
        subroutine matmul1(A, B, C)
            real :: A(:,:), B(:,:), C(:,:)
        end subroutine
        subroutine matmul2(A, B, C)
            real, target :: A(:,:), B(:,:), C(:,:)
        end subroutine
    end interface
    real, allocatable, dimension(:,:) :: A, B, C
    integer, parameter :: m = 500, n = 500, o = 500
    integer, parameter :: loops = 100
    integer :: start_t, end_t, rate
    real :: time
    allocate(A(m,n), B(n,o), C(m,o))
    A(:,:) = 1; B(:,:) = 2; C(:,:) = 0

    time = 0
    do i = 1, loops
    timer(matmul0(A, B, C, m, n, o), time)
    end do
    print*, 'Time explicit shape:', time

    time = 0
    do i = 1, loops
    timer(matmul1(A, B, C), time)
    end do
    print*, 'Time assumed shape:', time

    time = 0
    do i = 1, loops
    ! timer(matmul2(A, B, C), time)
    end do
    print*, 'Time assumed shape + pointer:', time

end program

subroutine matmul0(A, B, C, m, n, o)
    integer :: m, n, o
    real :: A(m,n), B(n,o), C(m,o)
    do i = 1, m
        do j = 1, o
            do k = 1, n
                C(i,j) = C(i,j) + A(i,k)*B(k,j)
            end do
        end do
    end do
end subroutine

subroutine matmul1(A, B, C)
    real :: A(:,:), B(:,:), C(:,:)
    do i = 1, size(C,1)
        do j = 1, size(C,2)
            do k = 1, size(A,2)
                C(i,j) = C(i,j) + A(i,k)*B(k,j)
            end do
        end do
    end do
end subroutine

subroutine matmul2(A, B, C)
    real, target :: A(:,:), B(:,:), C(:,:)
    real, pointer :: A0(:,:), B0(:,:), C0(:,:)
    A0 => A; B0 => B; C0 => C
    do i = 1, size(C,1)
        do j = 1, size(C,2)
            do k = 1, size(A,2)
                C0(i,j) = C0(i,j) + A0(i,k)*B0(k,j)
            end do
        end do
    end do
end subroutine

1 answer:

Answer 0 (score: 1)

This seems to be an issue with ifort and the size() function calls used as bounds of the nested loops...

On my machine, your code produces:

 Time explicit shape:   3.145800
 Time assumed shape:   14.98891
 Time assumed shape + pointer:   14.82460

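I rewrote the subroutine so that the array extents are queried once, stored in local variables, and reused as the loop bounds: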

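subroutine matmul1(A, B, C)
    real, intent(in) :: A(:,:), B(:,:)
    ! C is accumulated into (read before it is written), so it is intent(inout)
    real, intent(inout) :: C(:,:)
    integer :: i, j, k, m, n, o
    ! Note: the interface block in the main program should declare the same intents

    ! Query the extents once instead of calling size() in the loop bounds
    m = size(C,1)   ! rows of C
    n = size(C,2)   ! columns of C
    o = size(A,2)   ! columns of A (the inner dimension)
    do i = 1, m
        do j = 1, n
            do k = 1, o
                C(i,j) = C(i,j) + A(i,k)*B(k,j)
            end do
        end do
    end do
end subroutine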

[matmul2 adjusted in the same way...]
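
The adjusted matmul2 is not shown in the answer. As a rough sketch only, assuming the same hoisting of the size() calls and keeping the declarations from the question, it might look like the following. The program name matmul2_sketch, the small test matrices, and the use of an internal procedure (which makes the interface block unnecessary) are choices made for this illustration, not part of the original code.

```fortran
program matmul2_sketch
    implicit none
    real, allocatable :: A(:,:), B(:,:), C(:,:)

    ! Small matrices so the result is easy to check by hand
    allocate(A(2,3), B(3,2), C(2,2))
    A = 1; B = 2; C = 0
    call matmul2(A, B, C)
    print *, C   ! each entry is the sum of three products 1*2

contains

    ! Assumed-shape + pointer variant with the extents queried once.
    ! Hypothetical reconstruction; the answer's real version is not shown.
    subroutine matmul2(A, B, C)
        real, target :: A(:,:), B(:,:), C(:,:)
        real, pointer :: A0(:,:), B0(:,:), C0(:,:)
        integer :: i, j, k, m, n, o

        A0 => A; B0 => B; C0 => C
        ! Same naming as the rewritten matmul1:
        ! m = rows of C, n = columns of C, o = columns of A
        m = size(C,1)
        n = size(C,2)
        o = size(A,2)
        do i = 1, m
            do j = 1, n
                do k = 1, o
                    C0(i,j) = C0(i,j) + A0(i,k)*B0(k,j)
                end do
            end do
        end do
    end subroutine

end program
```

To use it in the question's test program, only the subroutine body would be copied over; the interface block there already matches these declarations.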

Now I get:

 Time explicit shape:   3.159100
 Time assumed shape:   3.009802
 Time assumed shape + pointer:   3.567401

Interestingly, gfortran does not seem to have this problem.

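Commenting out the call to the third subroutine makes no difference with either compiler on my machine, so I can't help you with that part :(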
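
To check one routine in isolation without the preprocessor macro, a stand-alone harness along these lines could be used. This is a sketch under assumptions, not the code that produced the timings above: the program name time_sketch, the smaller problem size, the 64-bit clock variables, and the checksum print are all additions for illustration. The 64-bit kind is used because the resolution of system_clock can depend on the integer kind of its arguments on some compilers.

```fortran
program time_sketch
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none
    ! Smaller than the question's 500x500 and 100 loops, to keep the run short
    integer, parameter :: m = 200, n = 200, o = 200, loops = 10
    real, allocatable :: A(:,:), B(:,:), C(:,:)
    integer(int64) :: start_t, end_t, rate
    real :: elapsed
    integer :: l

    allocate(A(m,n), B(n,o), C(m,o))
    A = 1; B = 2; C = 0

    elapsed = 0
    do l = 1, loops
        call system_clock(start_t, rate)
        call matmul1(A, B, C)
        call system_clock(end_t)
        ! Accumulate the elapsed seconds of this call
        elapsed = elapsed + real(end_t - start_t)/real(rate)
    end do
    print *, 'Time assumed shape:', elapsed
    ! Using the result makes it harder for a compiler to skip the work
    print *, 'Checksum:', sum(C)

contains

    ! Assumed-shape version with size() left in the loop bounds,
    ! as in the question
    subroutine matmul1(A, B, C)
        real :: A(:,:), B(:,:), C(:,:)
        integer :: i, j, k
        do i = 1, size(C,1)
            do j = 1, size(C,2)
                do k = 1, size(A,2)
                    C(i,j) = C(i,j) + A(i,k)*B(k,j)
                end do
            end do
        end do
    end subroutine

end program
```

Swapping in the hoisted-bounds version of matmul1 and comparing the two printed times with the same compiler and flags should show whether the slowdown follows the size() calls.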