Valgrind memcheck reports errors when debugging an MPI Fortran program

Asked: 2017-07-05 08:36:11

Tags: fortran mpi valgrind openmpi memcheck

I am trying to use valgrind memcheck to debug an MPI program I wrote in Fortran. However, even for the smallest MPI program, memcheck reports errors at the very first MPI statement, MPI_INIT(IERR). If I comment out the MPI-related statements, no errors occur.

Below is my minimal working example. I hope you can help me resolve this problem.

Main file: testMPI.for

  PROGRAM testMPI
  IMPLICIT NONE
  CALL myMPI_ENV_INIT
  CALL myMPI_ENV_FINISH      
  STOP
  END PROGRAM 

mympi_env_init.for

  SUBROUTINE myMPI_ENV_INIT
  USE MPI     
  IMPLICIT NONE
  INTEGER IERR
  CALL MPI_INIT(IERR)
  RETURN
  END 

mympi_env_finish.for

  SUBROUTINE myMPI_ENV_FINISH
  USE MPI
  IMPLICIT NONE
  INTEGER IERR
  CALL MPI_FINALIZE(IERR)
  RETURN
  END

Compile command:

$mpif90 *.for 
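For reference, a variant of the compile command with debugging symbols and optimization disabled, which should make valgrind's backtraces map cleanly onto source lines (these are standard gfortran options, not something specific to this problem):

$ mpif90 -g -O0 *.for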

$mpif90 --version
GNU Fortran (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

After obtaining a.out, I checked it with valgrind, and it reported many errors:

$ valgrind ./a.out 
==5294== Memcheck, a memory error detector
==5294== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5294== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==5294== Command: ./a.out
==5294== 
==5294== Conditional jump or move depends on uninitialised value(s)
==5294==    at 0x6B8FF82: opal_value_unload (dss_load_unload.c:291)
==5294==    by 0x82A649A: rte_init (ess_singleton_module.c:274)
==5294==    by 0x6895D1A: orte_init (orte_init.c:226)
==5294==    by 0x5517CF9: ompi_mpi_init (ompi_mpi_init.c:505)
==5294==    by 0x55502DB: PMPI_Init (pinit.c:66)
==5294==    by 0x52AD135: MPI_INIT (pinit_f.c:84)
==5294==    by 0x400AC0: mpi_env_init_ (mpi_env_init.for:8)
==5294==    by 0x400AD1: MAIN__ (testMPI.for:7)
==5294==    by 0x400B1E: main (testMPI.for:13)
==5294== 
vex amd64->IR: unhandled instruction bytes: 0xF0 0x48 0xF 0xC7 0xE 0xF 0x94 0xC0 0x88 0x45
vex amd64->IR:   REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==5294== valgrind: Unrecognised instruction at address 0x6b84d96.
==5294==    at 0x6B84D96: opal_atomic_cmpset_128 (atomic.h:132)
==5294==    by 0x6B84EEE: opal_update_counted_pointer (opal_lifo.h:70)
==5294==    by 0x6B84F63: opal_lifo_pop_atomic (opal_lifo.h:150)
==5294==    by 0x6B85092: opal_free_list_get_st (opal_free_list.h:213)
==5294==    by 0x6B850B4: opal_free_list_get (opal_free_list.h:225)
==5294==    by 0x6B852F4: opal_rb_tree_init (opal_rb_tree.c:86)
==5294==    by 0xAF8FBC6: mca_mpool_hugepage_module_init (mpool_hugepage_module.c:107)
==5294==    by 0xAF90853: mca_mpool_hugepage_open (mpool_hugepage_component.c:166)
==5294==    by 0x6BBA5FF: open_components (mca_base_components_open.c:117)
==5294==    by 0x6BBA51C: mca_base_framework_components_open (mca_base_components_open.c:65)
==5294==    by 0x6C2BCD3: mca_mpool_base_open (mpool_base_frame.c:89)
==5294==    by 0x6BC8C36: mca_base_framework_open (mca_base_framework.c:174)
==5294== Your program just tried to execute an instruction that Valgrind
==5294== did not recognise.  There are two possible reasons for this.
==5294== 1. Your program has a bug and erroneously jumped to a non-code
==5294==    location.  If you are running Memcheck and you just saw a
==5294==    warning about a bad jump, it's probably your program's fault.
==5294== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5294==    i.e. it's Valgrind's fault.  If you think this is the case or
==5294==    you are not sure, please let us know and we'll try to fix it.
==5294== Either way, Valgrind will now raise a SIGILL signal which will
==5294== probably kill your program.

Program received signal SIGILL: Illegal instruction.

Backtrace for this error:
#0  0x585F407
#1  0x585FA1E
#2  0x650A0DF
#3  0x6B84D96
#4  0x6B84EEE
#5  0x6B84F63
#6  0x6B85092
#7  0x6B850B4
#8  0x6B852F4
#9  0xAF8FBC6
#10  0xAF90853
#11  0x6BBA5FF
#12  0x6BBA51C
#13  0x6C2BCD3
#14  0x6BC8C36
#15  0x5517EF5
#16  0x55502DB
#17  0x52AD135
#18  0x400AC0 in mpi_env_init_ at mpi_env_init.for:8
#19  0x400AD1 in testmpi at testMPI.for:7
==5294== 
==5294== Process terminating with default action of signal 4 (SIGILL)
==5294==    at 0x62C775B: raise (pt-raise.c:37)
==5294==    by 0x650A0DF: ??? (in /lib/x86_64-linux-gnu/libc-2.19.so)
==5294==    by 0x6B84D95: opal_atomic_cmpset_128 (atomic.h:132)
==5294==    by 0x6B84EEE: opal_update_counted_pointer (opal_lifo.h:70)
==5294==    by 0x6B84F63: opal_lifo_pop_atomic (opal_lifo.h:150)
==5294==    by 0x6B85092: opal_free_list_get_st (opal_free_list.h:213)
==5294==    by 0x6B850B4: opal_free_list_get (opal_free_list.h:225)
==5294==    by 0x6B852F4: opal_rb_tree_init (opal_rb_tree.c:86)
==5294==    by 0xAF8FBC6: mca_mpool_hugepage_module_init (mpool_hugepage_module.c:107)
==5294==    by 0xAF90853: mca_mpool_hugepage_open (mpool_hugepage_component.c:166)
==5294==    by 0x6BBA5FF: open_components (mca_base_components_open.c:117)
==5294==    by 0x6BBA51C: mca_base_framework_components_open (mca_base_components_open.c:65)
==5294== 
==5294== HEAP SUMMARY:
==5294==     in use at exit: 793,505 bytes in 5,291 blocks
==5294==   total heap usage: 11,507 allocs, 6,216 frees, 1,418,543 bytes allocated
==5294== 
==5294== LEAK SUMMARY:
==5294==    definitely lost: 170 bytes in 3 blocks
==5294==    indirectly lost: 583 bytes in 13 blocks
==5294==      possibly lost: 544 bytes in 2 blocks
==5294==    still reachable: 792,208 bytes in 5,273 blocks
==5294==         suppressed: 0 bytes in 0 blocks
==5294== Rerun with --leak-check=full to see details of leaked memory
==5294== 
==5294== For counts of detected and suppressed errors, rerun with: -v
==5294== Use --track-origins=yes to see where uninitialised values come from
==5294== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Illegal instruction

Although it reports errors at MPI_INIT, I think the fault must lie in my configuration or my program rather than in OpenMPI itself.

Could you tell me how to get rid of this problem?
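For reference, Open MPI ships a valgrind suppression file under its install prefix. A minimal sketch of a run that uses it, assuming the prefix /home/hpcc/openmpi reported by ompi_info below (I have not verified whether it silences these particular reports):

$ mpirun -np 2 valgrind --suppressions=/home/hpcc/openmpi/share/openmpi/openmpi-valgrind.supp ./a.out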

=== My valgrind version

$ valgrind --version
valgrind-3.13.0

== My OpenMPI information

$ ompi_info
                 Package: Open MPI hpcc@nkt93 Distribution
                Open MPI: 2.1.1
  Open MPI repo revision: v2.1.0-100-ga2fdb5b
   Open MPI release date: May 10, 2017
                Open RTE: 2.1.1
  Open RTE repo revision: v2.1.0-100-ga2fdb5b
   Open RTE release date: May 10, 2017
                    OPAL: 2.1.1
      OPAL repo revision: v2.1.0-100-ga2fdb5b
       OPAL release date: May 10, 2017
                 MPI API: 3.1.0
            Ident string: 2.1.1
                  Prefix: /home/hpcc/openmpi
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: nkt93
           Configured by: hpcc
           Configured on: Tue Jul  4 11:48:05 JST 2017
          Configure host: nkt93
                Built by: hpcc
                Built on: 2017年  7月  4日 火曜日 12:02:19 JST
              Built host: nkt93
              C bindings: yes
            C++ bindings: no
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
                          limitations in the gfortran compiler, does not
                          support the following: array subsections, direct
                          passthru (where possible) to underlying Open MPI's
                          C functionality
  Fort mpi_f08 subarrays: no
           Java bindings: no
  Wrapper compiler rpath: runpath
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
  C compiler family name: GNU
      C compiler version: 4.9.2
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
           Fort compiler: gfortran
       Fort compiler abs: /usr/bin/gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
          Fort PROTECTED: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
         Fort USE...ONLY: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
           C++ profiling: no
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          C++ exceptions: no
          Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes,
                          OMPI progress: no, ORTE progress: yes, Event lib:
                          yes)
           Sparse Groups: no
  Internal debug support: yes
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
              dl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: native
     Symbol vis. support: yes
   Host topology support: yes
          MPI extensions: affinity, cuda
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
           MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v2.1.1)
           MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: self (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: sm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                  MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA event: libevent2022 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA hwloc: hwloc1112 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
         MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v2.1.1)
         MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v2.1.1)
          MCA memchecker: valgrind (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v2.1.1)
             MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
                MCA pmix: pmix112 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v2.1.1)
                 MCA sec: basic (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA dfs: app (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA dfs: test (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA dfs: orted (MCA v2.1.0, API v1.0.0, Component v2.1.1)
              MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
                 MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
                 MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: env (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v2.1.1)
               MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v2.1.1)
             MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA iof: mr_orted (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: mr_hnp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA notifier: syslog (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                MCA odls: default (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: usock (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA ras: loadleveler (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: staged (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA rml: oob (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: debruijn (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: direct (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: binomial (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: radix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA rtc: freq (MCA v2.1.0, API v1.0.0, Component v2.1.1)
              MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: novm (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: orted (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: staged_hnp (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
               MCA state: app (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: dvm (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: tool (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: staged_orted (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
                 MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: self (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA fcoll: static (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA io: romio314 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA pml: v (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
            MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
            MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v2.1.1)
           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
