Question

I recently converted some slow Python code into a C extension. It runs beautifully, except that on the 162nd call it segfaults, right at the return statement.

Here is how it works. Once, before any calls to the function I want to compute, I load the data into memory (remembering to INCREF the parent object):

static PyObject *grm_loadDosage(PyObject *self, PyObject *args) {
 /***
 Load the dosage matrix into global memory
 Global variables: DOSAGES_PYMAT - will be a pointer to the PyArrayObject of the dosage array (must incref this)
                   DOSAGES - will be a pointer to the double array underlying DOSAGES_PYMAT
 ***/
 PyArrayObject *dosageMatrixArr;
 if ( ! PyArg_ParseTuple(args,"O",&DOSAGES_PYMAT_OBJ) ) return NULL;
 if ( NULL == DOSAGES_PYMAT_OBJ ) return NULL;

 Py_INCREF(DOSAGES_PYMAT_OBJ);

 /* from PyObject to Python Array */
 dosageMatrixArr = pymatrix(DOSAGES_PYMAT_OBJ);
 /* get the row and col sizes */
 N_VARIANTS = dosageMatrixArr->dimensions[0];
 N_SAMPLES = dosageMatrixArr->dimensions[1];
 DOSAGES = pymatrix_to_Carray(dosageMatrixArr);
 Py_RETURN_TRUE;
}

(For the C array methods, see http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays.) I then reference the loaded double[][], DOSAGES, in the function that I call from Python:

static PyObject *grm_calcdistance(PyObject *self, PyObject *args) {
 /** Given column indices (samples) of DOSAGES, and an array of row indices (the variants missing for one or both),
     calculate the distance **/
 int samI,samJ,nMissing, *missing;
 PyObject *missingObj;
 PyArrayObject *missingArr;
 printf("debug1\n");
 if ( ! PyArg_ParseTuple(args,"iiOi",&samI,&samJ,&missingObj,&nMissing) ) return NULL;
 if ( NULL == missingObj ) return NULL;
 missingArr = pyvector(missingObj);
 missing = pyvector_to_Carray(missingArr);
 double replaced1[nMissing];
 double replaced2[nMissing];
 printf("debug2\n");

 int missingVectorIdx;
 int missingVariantIdx;
 // for each sample, store the dosage at the missing site (as it could be missing
 // in the OTHER sample), and replace it with 0.0 in the dosage matrix
 for ( missingVectorIdx = 0; missingVectorIdx < nMissing; missingVectorIdx++ ) {
  printf("debugA: %d < %d\n",missingVectorIdx,nMissing);
  missingVariantIdx = missing[missingVectorIdx];
  replaced1[missingVariantIdx] = DOSAGES[missingVariantIdx][samI];
  replaced2[missingVariantIdx] = DOSAGES[missingVariantIdx][samJ];
  printf("debugB\n");
  DOSAGES[missingVariantIdx][samI]=0.0;
  DOSAGES[missingVariantIdx][samJ]=0.0;
 }

 // calculate the distance (this uses DOSAGES which we just modified)
 double distance = _calcDistance(samI,samJ);

 printf("debug3\n");
 // put the values that we replaced with 0.0 back into the matrix
 for ( missingVectorIdx = 0; missingVectorIdx < nMissing; missingVectorIdx++ ) {
  missingVariantIdx = missing[missingVectorIdx];
  DOSAGES[missingVariantIdx][samI] = replaced1[missingVariantIdx];
  DOSAGES[missingVariantIdx][samJ] = replaced2[missingVariantIdx];
 }
 printf("debug4: %f\n",distance);
 // grab the python object wrapper and return it
 PyObject * distPy = PyFloat_FromDouble((double)distance);
 printf("debug5\n");
 if ( NULL == distPy )
  printf("and is NULL\n");
 return distPy;

}

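With a large number of debug statements (as you can see), I have localized the segfault to the return statement. That is, it happens after the Python float object is instantiated, but between the return from C and the next executed Python line (which is, you guessed it, print("debugReturned")). What I see on stdout is: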

debug4: -0.025160
debug5
Segmentation fault

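So the double is not a strange value, and the Python object is created correctly and is not NULL, but something goes wrong between returning from C and continuing in Python. Online sources suggest this could be an INCREF/DECREF problem, but they also say that PyFloat_FromDouble() and Py_BuildValue("f", double) generate new references, so no INCREF is needed. Both choices produce the same result. Although I am reasonably sure that I need to INCREF the PyObject holding my matrix in grm_loadDosage, I have tried both with and without the INCREF and see the same behavior.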

Any ideas what is going on?

Thanks

Also, here is a stack trace:

#0  0x0000000000000000 in ?? ()
#1  0x000000000045aa5c in PyEval_EvalFrameEx (f=0x2aaae1ae3f60, throwflag=<value optimized out>) at Python/ceval.c:2515
#2  0x000000000045ecb4 in call_function (f=0x3fb7494970227c55, throwflag=<value optimized out>) at Python/ceval.c:4009
#3  PyEval_EvalFrameEx (f=0x3fb7494970227c55, throwflag=<value optimized out>) at Python/ceval.c:2692
#4  0x000000000045ecb4 in call_function (f=0x95c880, throwflag=<value optimized out>) at Python/ceval.c:4009
#5  PyEval_EvalFrameEx (f=0x95c880, throwflag=<value optimized out>) at Python/ceval.c:2692
#6  0x000000000045f626 in PyEval_EvalCodeEx (_co=0x98abe0, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0,
    closure=0x0) at Python/ceval.c:3350
#7  0x000000000045f74b in PyEval_EvalCode (co=0x146b098, globals=0x71, locals=0xc7) at Python/ceval.c:767
#8  0x0000000000482fab in run_mod (fp=0x881b80, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", start=<value optimized out>,
    globals=0x81e340, locals=0x81e340, closeit=1, flags=0x7fffffffbfd0) at Python/pythonrun.c:1783
#9  PyRun_FileExFlags (fp=0x881b80, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", start=<value optimized out>, globals=0x81e340,
    locals=0x81e340, closeit=1, flags=0x7fffffffbfd0) at Python/pythonrun.c:1740
#10 0x0000000000483268 in PyRun_SimpleFileExFlags (fp=<value optimized out>, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", closeit=1,
    flags=0x7fffffffbfd0) at Python/pythonrun.c:1265
#11 0x00000000004964d7 in run_file (argc=<value optimized out>, argv=0x7df010) at Modules/main.c:297
#12 Py_Main (argc=<value optimized out>, argv=0x7df010) at Modules/main.c:692
#13 0x000000000041563e in main (argc=11, argv=0x7fffffffc148) at ./Modules/python.c:59

Answer 1

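I suggest running valgrind against your code; see How can I use valgrind with Python C++ extensions? for how to do that with Python. I am not sure how useful the exception list is for Python 3. In any case, ignore all output from parts that do not involve any of your own files.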

If you are on Windows, I would recommend one of the tools listed in Is there a good Valgrind substitute for Windows?

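The debugging you have done tells me that the error occurs after the function returns. I see no reason why the return statement itself should be the problem. According to your stack trace, the error originates in code inside Python 3 itself. I assume Python 3 itself is not buggy; you could try installing a different version of Python 3 to rule that out further. That is why I assume you have corrupted the stack or the heap, and that is where valgrind comes in handy.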
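One place worth checking for that kind of corruption: the question's loops declare replaced1 and replaced2 with nMissing elements but index them with missingVariantIdx, which is a row index into DOSAGES. If that index can reach or exceed nMissing, the writes land outside the stack arrays. I have not confirmed that this is the cause. The following is only a minimal sketch in plain C, where zero_missing and restore_missing are hypothetical helper names and a single column stands in for DOSAGES, showing the scratch buffer indexed by the loop counter instead:

```c
/* Hypothetical stand-ins for the save/zero/restore loops in the question.
 * column  : one column of the dosage matrix, indexed by variant
 * missing : variant indices to blank out (n_missing entries)
 * saved   : scratch buffer with room for n_missing doubles
 */
static void zero_missing(double *column, const int *missing,
                         int n_missing, double *saved)
{
    for (int k = 0; k < n_missing; k++) {
        int variant = missing[k];
        /* The scratch buffer has n_missing slots, so it is indexed by the
         * loop counter k. Writing saved[variant] instead would go out of
         * bounds whenever variant >= n_missing. */
        saved[k] = column[variant];
        column[variant] = 0.0;
    }
}

static void restore_missing(double *column, const int *missing,
                            int n_missing, const double *saved)
{
    /* Put the saved values back, again pairing saved[k] with missing[k]. */
    for (int k = 0; k < n_missing; k++) {
        column[missing[k]] = saved[k];
    }
}
```

With this pairing, the scratch buffer never needs more than n_missing entries regardless of how large the variant indices are. Valgrind or a build with an address sanitizer would still be the way to confirm where the invalid write actually happens.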

Python C extension: returning PyFloat_FromDouble(double) segfaults
