使用 f2py 和 intel 编译器将 MKL 与 fortran 链接

时间:2021-04-05 15:31:30

标签: python fortran intel-fortran intel-mkl f2py

我有一些 Fortran 代码,其中包含对 Lapack 和 Blas 函数的调用,我正在尝试使用 f2py 编译这些函数:

  • Windows 10
  • Numpy 1.19 版
  • 蟒蛇 Python 3.8.5

我正在尝试编译的代码的干净版本可以在链接中找到:https://software.intel.com/content/dam/develop/external/us/en/documents/mkl-cookbook -samples-120115.zip 位于 /BlockTDS_SPD/source/ 下。

可以在此链接中找到此函数的说明:https://software.intel.com/content/www/us/en/develop/documentation/mkl-cookbook/top/factoring-block-tridiagonal-symmetric-positive-definite-matrices.html

这是我尝试使用添加的 Cf2py 意图语句编译的代码:

!***********************************************************************
! Copyright(C) 2014-2015 Intel Corporation. All Rights Reserved.
!
! The source code, information  and  material ("Material") contained herein is
! owned  by Intel Corporation or its suppliers or licensors, and title to such
! Material remains  with Intel Corporation  or its suppliers or licensors. The
! Material  contains proprietary information  of  Intel or  its  suppliers and
! licensors. The  Material is protected by worldwide copyright laws and treaty
! provisions. No  part  of  the  Material  may  be  used,  copied, reproduced,
! modified, published, uploaded, posted, transmitted, distributed or disclosed
! in any way  without Intel's  prior  express written  permission. No  license
! under  any patent, copyright  or  other intellectual property rights  in the
! Material  is  granted  to  or  conferred  upon  you,  either  expressly,  by
! implication, inducement,  estoppel or  otherwise.  Any  license  under  such
! intellectual  property  rights must  be express  and  approved  by  Intel in
! writing.
! 
! *Third Party trademarks are the property of their respective owners.
! 
! Unless otherwise  agreed  by Intel  in writing, you may not remove  or alter
! this  notice or  any other notice embedded  in Materials by Intel or Intel's
! suppliers or licensors in any way.
!
!***********************************************************************
!  Content:
!      Subroutine DPBLTRF for Cholesky factorization of symmetric  
!         positive definite block tridiagonal matrix.
!***********************************************************************
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Definition:
! ===========
!      SUBROUTINE DPBLTRF(N, NB, D, LDD, B, LDB, INFO)
!
! ..Scalar arguments..
!      INTEGER N, NB, LDD, LDB, INFO
! ..       
! ..Array arguments..
!      REAL*8 D(LDD,*), B(LDB,*)
! ..
! Purpose:
! ========  
! DPBLTRF computes Cholesky L*L^t-factorization of symmetric positive 
! definite block tridiagonal matrix A
!   D_1  B_1^t
!   B_1  D_2   B_2^t
!        B_2  D_3   B_3^t
!           .     .      .
!               .     .      .
!                 B_N-2  D_N-1   B_N-1^t
!                        B_N-1    D_N   
! The factorization has the form A = L*L**t, where L is a lower 
! bidiagonal block matrix 
!   L_1  
!   C_1  L_2   
!        C_2   L_3 
!           .     .      .
!               .     .      .
!                 C_N-2  L_N-1
!                        C_N-1    L_N   
! This is a block version of LAPACK DPTTRF subroutine.
!
! Arguments:
! ==========  
! N (input) INTEGER
!     The number of block rows of the matrix A.  N >= 0.
!
! NB (input) INTEGER
!     The size of blocks.  NB >= 0.
!
! D (input/output) REAL*8 array, dimension (LDD,N*NB)
!     On entry, the array stores N diagonal blocks (each of size NB by  
!         NB) of the matrix to be factored. The blocks are stored 
!         sequentially: first NB columns of D store block D_1, second NB 
!         columns store block D_2,...,last NB columns store block D_N.
!     Note: As the diagonal blocks are symmetric only lower or upper 
!     ====
!         triangle is needed to store blocks' elements. In this code 
!         lower storage is used!!!
!     On exit, the array stores diagonal blocks of triangular factor L. 
!         Diagonal blocks of lower triangular factor L replace
!         respective lower triangles of blocks D_j (1 <= j <= N). 
!     Caution: upper triangles of diagonal blocks are not zeroed on 
!     =======
!         exit!!!
!
! LDD (input) INTEGER
!     The leading dimension of array D. LDD >= NB.
!
! B (input/output) REAL*8 array, dimension (LDB,(N-1)*NB)
!     On entry, the array stores sub-diagonal  blocks (each of size NB
!         by NB) of the matrix to be factored. The blocks are stored 
!         sequentially: first NB columns of B store block B_1, second  
!         NB columns store block B_2,...,last NB columns store block 
!         B_N-1.
!     On exit, the array stores sub-diagonal blocks of triangular factor 
!         L.  
!
! LDB (input) INTEGER
!     The leading dimension of array B. LDB >= NB.
!
! INFO (output) INTEGER
!     = 0:        successful exit
!     < 0:        if INFO = -i, the i-th argument had an illegal value
!     > 0:        if INFO = i, the leading minor of order i (and 
!                 therefore the matrix A itself) is not 
!                 positive-definite, and the factorization could not be
!                 completed. This may indicate an error in forming the 
!                 matrix A.
! =====================================================================

      SUBROUTINE DPBLTRF(N, NB, D, LDD, B, LDB, INFO)
      IMPLICIT NONE
! ..Scalar arguments..
      INTEGER N, NB, LDD, LDB, INFO
! ..Array arguments..
      REAL*8 D(LDD,*), B(LDB,*)
! =====================================================================
! .. Local Scalars ..
      INTEGER K
      
Cf2py integer, intent(in) N, NB, LDD, LDB
Cf2py intent(in,out,copy) :: D, B
Cf2py integer, intent(out) :: INFO

! ..
! .. Executable Statements ..
! ..
!    Test the input arguments.
      INFO = 0
      IF(N .LT. 0) THEN
          INFO = -1
      ELSE IF(NB .LT. 0) THEN
          INFO = -2
      ELSE IF(LDD .LT. NB) THEN
          INFO = -4
      ELSE IF(LDB .LT. NB) THEN
          INFO = -6
      END IF
      IF(INFO .NE. 0) THEN
          RETURN
      END IF
! ..
! Compute Cholesky factorization of the first diagonal block
      CALL DPOTRF('L', NB, D, LDD, INFO)
      IF(INFO .NE. 0) THEN
          RETURN
      END IF
!
! Main loop
      DO K = 1, N-1
          CALL DTRSM('R', 'L', 'T', 'N', NB, NB, 1D0, 
     &                D(1,(K-1)*NB+1), LDD, B(1,(K-1)*NB+1), LDB)
          CALL DSYRK('L', 'N', NB, NB, -1D0, 
     &               B(1,(K-1)*NB+1), LDB, 1D0, D(1,K*NB+1), LDD)
          CALL DPOTRF('L', NB, D(1,K*NB+1), LDD, INFO)
          IF(INFO .NE. 0) THEN
              INFO = INFO + K*NB
! INFO is equal to not local but global number of the row              
              RETURN
          END IF
      END DO
      RETURN
      END
      
      

我使用以下命令将代码链接到 MKL 并使用 Intel Visual Fortran 64 位编译器进行编译:

python -m numpy.f2py -c --fcompiler=intelvem -L"C:\Program Files (x86)\Intel\oneAPI\compiler\2021.2.0\windows\compiler\lib\intel64_win" -lifconsol -L"C:\Program Files (x86)\Intel\oneAPI\mkl\2021.2.0\lib\intel64" -lmkl_intel_ilp64 -L"C:\Program Files (x86)\Intel\oneAPI\mkl\2021.2.0\lib\intel64" -lmkl_sequential -L"C:\Program Files (x86)\Intel\oneAPI\mkl\2021.2.0\lib\intel64" -lmkl_core -I"C:\Program Files (x86)\Intel\oneAPI\mkl\2021.2.0\include" dpbltrf.f -m SBTCF

结果函数的签名看起来没问题:

d,b,info = dpbltrf(n,nb,d,b,[ldd,ldb,overwrite_d,overwrite_b])

Wrapper for ``dpbltrf``.

Parameters
----------
n : input int
nb : input int
d : input rank-2 array('d') with bounds (ldd,*)
b : input rank-2 array('d') with bounds (ldb,*)

Other Parameters
----------------
overwrite_d : input int, optional
    Default: 0
ldd : input int, optional
    Default: shape(d,0)
overwrite_b : input int, optional
    Default: 0
ldb : input int, optional
    Default: shape(b,0)

Returns
-------
d : rank-2 array('d') with bounds (ldd,*)
b : rank-2 array('d') with bounds (ldb,*)
info : int

然后我可以在 Python 中导入创建的模块并调用它。这是一个最小的例子:

import numpy as np
import matplotlib.pyplot as plt
from SBTCF import dpbltrf

print(dpbltrf.__doc__)

def mx(x, a):
    x = np.ravel(x)
    npts = len(x)
    distances = np.abs(x[:, None] - x[None, :])
    mat = np.exp(-distances / a)
    return mat

# Assemble arrays
Nx = 5
Nt = 10
t = np.linspace(0, 1, Nt)
mat_t = mx(t, 1.0)

# Stack the arrays horizontally into the shape required by dpbltrf. D is diagonal
# and B is off-diagonal
D = mat_t
for i in range(Nx-1):
    D = np.hstack([D, mat_t])

B = mat_t
for i in range(Nx-2):
    B = np.hstack([B, mat_t])

#plt.imshow(D)
#plt.show()

#plt.imshow(B)
#plt.show()

d, b, info = dpbltrf(Nx, Nt, D, B)

当我尝试运行脚本时出现错误:

<块引用>

英特尔 MKL 错误:进入 DPOTRF 时参数 4 不正确

我确信我的参数是正确的,我怀疑问题是由错误的编译选项引起的。有人能告诉我编译命令中是否有可能导致这种情况的错误吗?或者会不会有其他问题导致了这个错误?

1 个答案:

答案 0 :(得分:0)

您正在尝试在 Python 模块中包含 Fortran 模块。如果你想要,名字必须不同,参考这个问题

Compile Fortran module with f2py