How to debug the error "Program received signal SIGSEGV: Segmentation fault"

Asked: 2018-08-21 14:20:37

Tags: linux fortran gfortran netcdf

I am running a Fortran executable, but I get the following error:

 set_nml_output Echo NML values to log file only
 Trying to open namelist log dart_log.nml
 PE 0: initialize_mpi_utilities:  Running with            8  MPI processes.

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

I then tried using gdb to find out more, and it reports:

[New LWP 9883]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Failed to read a valid object file image from memory.
Core was generated by `./filter'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00002af8e021390c in netcdf::nf90_open (
    path=<error reading variable: value requires 57959040 bytes, which is more than max-value-size>, mode=0, 
    ncid=<error reading variable: Cannot access memory at address 0x7ffe439346b0>, 
    chunksize=<error reading variable: Cannot access memory at address 0x0>, 
    cache_size=<error reading variable: Cannot access memory at address 0x7ffe43934530>, 
    cache_nelems=<error reading variable: Cannot access memory at address 0x7ffe43934528>, 
    cache_preemption=<error reading variable: Cannot access memory at address 0x7ffe439345a0>, 
---Type <return> to continue, or q <return> to quit---
    comm=<error reading variable: Cannot access memory at address 0x7ffe439345a8>, 
    info=<error reading variable: Cannot access memory at address 0x7ffe439345b0>, 
    _path=<error reading variable: Cannot access memory at address 0x7ffe439345b8>) at netcdf4_file.f90:39
39  netcdf4_file.f90: No such file or directory.
(gdb) bt
#0  0x00002af8e021390c in netcdf::nf90_open (
    path=<error reading variable: value requires 57959040 bytes, which is more than max-value-size>, mode=0, 
    ncid=<error reading variable: Cannot access memory at address 0x7ffe439346b0>, 
    chunksize=<error reading variable: Cannot access memory at address 0x0>, 
    cache_size=<error reading variable: Cannot access memory at address 0x7ffe43934530>, 
    cache_nelems=<error reading variable: Cannot access memory at address 0x7ffe43934528>, 
    cache_preemption=<error reading variable: Cannot access memory at address 0x7ffe439345a0>, 
    comm=<error reading variable: Cannot access memory at address 0x7ffe439345a8>, 
    info=<error reading variable: Cannot access memory at address 0x7ffe439345b0>, 
    _path=<error reading variable: Cannot access memory at address 0x7ffe439345b8>) at netcdf4_file.f90:39
Backtrace stopped: Cannot access memory at address 0x7ffe43934598

The code around netcdf4_file.f90:39 looks like this:

if (present(cache_size) .or. present(cache_nelems) .or. &
       present(cache_preemption)) then
     ret = nf_get_chunk_cache(size_in, nelems_in, preemption_in)
     if (ret .ne. nf90_noerr) then
        nf90_open = ret
        return
     end if
     if (present(cache_size)) then
        size_out = cache_size     ! <-- line 39
     else
        size_out = size_in
     end if
     if (present(cache_nelems)) then
        nelems_out = cache_nelems
     else
        nelems_out = nelems_in
     end if

Is the netCDF version related to the problem, or should I change some setting?

Can anyone give me some suggestions for solving this? I am new to all of this. Thanks in advance.

1 Answer:

Answer 0 (score: 0)

Segmentation faults are genuinely hard to debug, but there are a few things you can do:

Compile with debugging symbols and run-time checks. The exact flags depend on the compiler, but here are the ones for gfortran and Intel Fortran:

gfortran     ifort         effect
------------------------------------------------------
-g           -g            Stores debugging information in the binary
-O0          -O0           Disables optimisation
-fbacktrace  -traceback    More informative stack trace
-Wall        -warn all     Enables compile-time warnings
-fcheck=all  -check all    Enables run-time checks

With luck, when a program compiled this way crashes, it will be much easier to work out what went wrong.
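
As a rough sketch, the gfortran flags from the table could be collected and passed on the command line like this. The file names filter.f90 and filter_debug are hypothetical placeholders. A real MPI and netCDF program such as the one in the question is probably built through a compiler wrapper and its own build configuration, so the flags would need to go there instead.

```shell
# gfortran debug flags taken from the table above.
FFLAGS_DEBUG="-g -O0 -fbacktrace -Wall -fcheck=all"

# Hypothetical source and output names; substitute your own.
SRC="filter.f90"
OUT="filter_debug"

if command -v gfortran >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Word splitting of $FFLAGS_DEBUG is intentional: each flag is a separate argument.
    gfortran $FFLAGS_DEBUG -o "$OUT" "$SRC"
else
    # Compiler or source file not found: only show the command that would run.
    echo "would run: gfortran $FFLAGS_DEBUG -o $OUT $SRC"
fi
```

After rebuilding, rerun the program the same way as before. The run-time checks and the backtrace should then point closer to the source line that triggers the fault.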