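I have 3 nested loops:
    !$omp parallel do schedule(runtime) private(s1)
    DO k = 0, z
       !$omp simd collapse( 2 ) reduction( +: s1 )
       DO i = 0, x
          DO j = 0, z
             s1 = s1 + array(k,j,i)
          ENDDO
       ENDDO
       sums_l(k) = s1
    ENDDO
But the Intel compiler complains with 'warning #13379: loop was not vectorized with "simd"'. Why is that, and what should I do about it?
// EDIT3: Here is the code that produces the error. It is reduced to the minimum that still triggers it; if you remove literally anything, it vectorizes.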
    SUBROUTINE simdTest
       IMPLICIT NONE
       INTEGER :: i, j, k, sr, tn,nzb,nzt,nxl,nxr,nys,nyn
       REAL :: s1, s2, s3, s4
       REAL, DIMENSION(:,:,:), ALLOCATABLE :: u,v,pt,rmask,sums_l
       REAL, DIMENSION(:,:), ALLOCATABLE :: usws,vsws,shf
       !$omp parallel do schedule(runtime) private(s1,s2,s3)
       DO k = nzb, nzt+1
          !$omp simd collapse( 2 ) reduction( +: s1, s2, s3 )
          DO i = nxl, nxr
             DO j = nys, nyn
                s1 = s1 + u(k,j,i) * rmask(j,i,sr)
                s2 = s2 + v(k,j,i) * rmask(j,i,sr)
                s3 = s3 + pt(k,j,i) * rmask(j,i,sr)
             ENDDO
          ENDDO
          sums_l(k,1,tn) = s1
          sums_l(k,2,tn) = s2
          sums_l(k,4,tn) = s3
       ENDDO
       !$omp parallel do reduction( +: s1, s2, s3, s4) schedule(runtime)
       DO i = nxl, nxr
          DO j = nys, nyn
             s1 = s1 + usws(j,i) * rmask(j,i,sr)
             s2 = s2 + vsws(j,i) * rmask(j,i,sr)
             s3 = s3 + shf(j,i) * rmask(j,i,sr)
             s4 = s4 + 0.0
          ENDDO
       ENDDO
       sums_l(nzb,12,tn) = s1
       sums_l(nzb,14,tn) = s2
       sums_l(nzb,16,tn) = s3
    END SUBROUTINE
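Answer 0 (score: 0)
There is not enough room for this in the comments:
This is what I get when I compile it on an Ivy Bridge CPU. The loop on line 15 cannot be vectorized for the CPU, but note that it is VECTORIZED for the Intel MIC architecture. The loop on line 16 is vectorized on the CPU, with the target directives removed.
The reason for the vectorization problem is given in the first remark, "subscript too complex".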
    ifort -openmp simd.f90 -warn -O3 -c -vec-report=3 -xHOST -fpp
    ifort: command line remark #10382: option '-xHOST' setting '-xCORE-AVX-I'
    simd.f90(17): (col. 33) remark: loop was not vectorized: subscript too complex
    simd.f90(15): (col. 5) warning #13379: loop was not vectorized with "simd"
    simd.f90(16): (col. 8) remark: LOOP WAS VECTORIZED
    simd.f90(13): (col. 3) remark: loop was not vectorized: not inner loop
    simd.f90(13): (col. 3) remark: loop was not vectorized: not inner loop
    simd.f90(31): (col. 4) remark: LOOP WAS VECTORIZED
    simd.f90(30): (col. 3) remark: loop was not vectorized: not inner loop
    simd.f90(29): (col. 7) remark: loop was not vectorized: not inner loop
    simd.f90(29): (col. 7) remark: BLOCK WAS VECTORIZED
    ifort: warning #10362: Environment configuration problem encountered. Please check for proper MPSS installation and environment setup.
    simd.f90(15): (col. 5) remark: *MIC* OpenMP SIMD LOOP WAS VECTORIZED
    simd.f90(13): (col. 3) remark: *MIC* loop was not vectorized: not inner loop
    simd.f90(13): (col. 3) remark: *MIC* loop was not vectorized: not inner loop
    simd.f90(31): (col. 4) remark: *MIC* LOOP WAS VECTORIZED
    simd.f90(31): (col. 4) remark: *MIC* PEEL LOOP WAS VECTORIZED
    simd.f90(31): (col. 4) remark: *MIC* REMAINDER LOOP WAS VECTORIZED
    simd.f90(30): (col. 3) remark: *MIC* loop was not vectorized: not inner loop
    simd.f90(29): (col. 7) remark: *MIC* loop was not vectorized: not inner loop
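Since the report shows the inner loop (line 16) vectorizing on the CPU while the collapsed loop (line 15) does not, one thing that may be worth trying is to drop collapse( 2 ) and put the simd directive on the innermost loop only. The following is a minimal sketch of that idea, not a confirmed fix: the names simd_sketch and masked_level_sums are hypothetical, the arrays are reduced to a single field and a 2-D mask, and whether ifort actually vectorizes it still has to be checked with the vectorization report. The accumulator is also set to zero for every k here, which is an assumption about what the original code intends.

```fortran
! Sketch only: simd_sketch and masked_level_sums are hypothetical names,
! and this has not been checked against ifort's vectorizer.
module simd_sketch
  implicit none
contains
  subroutine masked_level_sums(u, rmask, sums)
    real, intent(in)  :: u(:,:,:)    ! indexed (k, j, i) as in the question
    real, intent(in)  :: rmask(:,:)  ! indexed (j, i)
    real, intent(out) :: sums(:)     ! one masked sum per k level
    integer :: i, j, k
    real :: s1

    !$omp parallel do schedule(runtime) private(i, j, s1)
    do k = 1, size(u, 1)
      s1 = 0.0                       ! reset the accumulator for each level
      do i = 1, size(u, 3)
        ! simd on the innermost loop only, without collapse( 2 )
        !$omp simd reduction(+: s1)
        do j = 1, size(u, 2)
          s1 = s1 + u(k, j, i) * rmask(j, i)
        end do
      end do
      sums(k) = s1
    end do
    !$omp end parallel do
  end subroutine masked_level_sums
end module simd_sketch

program demo
  use simd_sketch
  implicit none
  real :: u(3, 4, 5), rmask(4, 5), sums(3)
  integer :: k

  rmask = 1.0
  do k = 1, 3
    u(k, :, :) = real(k)             ! every value on level k equals k
  end do

  call masked_level_sums(u, rmask, sums)

  ! Each level holds 4*5 = 20 values equal to k, so the sum should be 20*k.
  do k = 1, 3
    if (abs(sums(k) - 20.0 * real(k)) > 1.0e-4) error stop "unexpected sum"
  end do
  print *, sums
end program demo
```

The sketch compiles with or without OpenMP enabled, because the directives are plain comments to a compiler that ignores them. Note that j is the middle index of u(k,j,i), so the inner loop still accesses memory with a non-unit stride, and any speedup from vectorizing it may be limited.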