Question

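Edit 2017/06/13: I tried using boost as suggested, but after spending more than 3 days trying to get it to compile and link, and failing, I decided that the stupid painful way was probably the fastest and least painful... so now my code just saves a pile of gigantic text files (splitting the arrays and the real/imaginary parts of the numbers across files) that C++ then reads. Elegant... no... effective... yes.

I have some scientific code, currently written in Python, that is slowed down by a numerical 3D integration step inside a loop. To overcome this I am rewriting this particular step in C++. (Cython etc. is not an option.)

Long story short: I want to transfer several very large arrays of complex numbers from the Python code to the C++ integrator as conveniently and painlessly as possible. I could do it manually and painfully using text or binary files, but before I embark on that, I was wondering whether I have any better options?

I'm using Visual Studio for C++ and Anaconda for Python (not my choice!)

Is there any file format or method that would make it quick and convenient to save an array of complex numbers from Python and then recreate it in C++?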
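For reference, the binary-file route mentioned above could look roughly like the sketch below. It uses only the standard library; the helper names save_complex and load_complex and the file name are hypothetical. Each value is written as two consecutive native doubles (real, then imaginary), which is the layout a std::complex<double> buffer has on the C++ side. With NumPy, arr.astype(np.complex128).tofile(path) writes the same layout.

```python
import array
import os
import tempfile


def save_complex(path, values):
    """Write complex numbers as interleaved (real, imag) native doubles."""
    flat = array.array("d")
    for z in values:
        flat.append(z.real)
        flat.append(z.imag)
    with open(path, "wb") as f:
        flat.tofile(f)


def load_complex(path):
    """Read the interleaved doubles back and rebuild Python complex values."""
    flat = array.array("d")
    count = os.path.getsize(path) // flat.itemsize
    with open(path, "rb") as f:
        flat.fromfile(f, count)
    return [complex(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


data = [1 + 0j, 0 + 1j, 2.5 - 2j]
path = os.path.join(tempfile.gettempdir(), "complex_demo.bin")  # hypothetical file name
save_complex(path, data)
restored = load_complex(path)
os.remove(path)
print(restored)
```

The file carries no header, so the array shape and the byte order have to be agreed on separately by the writer and the reader.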

Many thanks, Ben

Answer 1

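An easy solution that I have used many times is to build your "C++ side" as a DLL (= shared object on Linux/OS X), provide a simple, C-like entry point (plain integers, pointers & co., no STL stuff) and pass the data through ctypes.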

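This avoids boost/SIP/Swig/... build nightmares, can be kept zero-copy (with ctypes you can pass a straight pointer to your numpy data) and lets you do whatever you want on the C++ side, especially at build time: no friggin' distutils, no boost, no nothing; build it with whatever can produce a C-like DLL. It also has the nice side effect of making your C++ algorithm callable from other languages (virtually any language has some way to interface with C libraries).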
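To make "zero-copy" concrete, here is a minimal sketch that runs without NumPy or a compiled DLL. The array.array object is only a stand-in for a contiguous float64 NumPy array (an assumption of this sketch); the point is that ctypes can wrap existing memory rather than duplicating it.

```python
import array
import ctypes

# Stand-in for a contiguous float64 NumPy array (assumption for this sketch).
buf = array.array("d", [1.0, 2.0, 3.0, 4.5])

# Build a ctypes array that shares buf's memory instead of copying it.
view = (ctypes.c_double * len(buf)).from_buffer(buf)

# Both objects start at the same address; this is the pointer a C function would receive.
same_memory = ctypes.addressof(view) == buf.buffer_info()[0]

# A write through the ctypes view is visible in the original buffer.
view[0] = 10.0
print(same_memory, buf[0])
```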

Here is a quick contrived example. The C++ side is just:

extern "C" {
double sum_it(double *array, int size) {
    double ret = 0.;
    for(int i=0; i<size; ++i) {
        ret += array[i];
    }
    return ret;
}
}

This has to be compiled to a DLL (on Windows) or a .so (on Linux), making sure to export the sum_it function (automatic with gcc, requires a .def file with VC++).

On the Python side, we can have a wrapper like

import ctypes
import os
import sys
import numpy as np

path = os.path.dirname(os.path.abspath(__file__))
cdll = ctypes.CDLL(os.path.join(path, "summer.dll" if sys.platform.startswith("win") else "summer.so"))
_sum_it = cdll.sum_it
# declare the signature so that pointers are not truncated to a C int on 64-bit builds
_sum_it.argtypes = [ctypes.c_void_p, ctypes.c_int]
_sum_it.restype = ctypes.c_double

def sum_it(l):
    if isinstance(l, np.ndarray) and l.dtype == np.float64 and l.ndim == 1 and l.flags.c_contiguous:
        # it's already a contiguous numpy array with the right features - go zero-copy
        a = l.ctypes.data
    else:
        # it's a list or something else - try to create a copy
        arr_t = ctypes.c_double * len(l)
        a = arr_t(*l)
    return _sum_it(a, len(l))

which makes sure that the data is marshaled correctly; then invoking the function is as trivial as

import summer
import numpy as np
# from a list (with copy)
print(summer.sum_it([1, 2, 3, 4.5]))
# from a numpy array of the right type - zero-copy
print(summer.sum_it(np.array([3., 4., 5.])))

See the ctypes documentation for more information on how to use it. See also the relevant documentation in numpy.

For complex numbers the situation is slightly more complicated, as there is no builtin for them in ctypes; if we want to use std::complex<double> on the C++ side (which is pretty much guaranteed to work fine with the numpy complex layout, namely a sequence of two doubles), we can write the C++ side as:

#include <complex>

extern "C" {
std::complex<double> sum_it_cplx(std::complex<double> *array, int size) {
    std::complex<double> ret(0., 0.);
    for(int i=0; i<size; ++i) {
        ret += array[i];
    }
    return ret;
}
}

Then, on the Python side, we have to replicate that layout in a c_complex structure to retrieve the return value (or to be able to build complex arrays without numpy):

class c_complex(ctypes.Structure):
    # Complex number, compatible with std::complex layout
    _fields_ = [("real", ctypes.c_double), ("imag", ctypes.c_double)]

    def __init__(self, pycomplex):
        # Init from Python complex
        self.real = pycomplex.real
        self.imag = pycomplex.imag

    def to_complex(self):
        # Convert to Python complex
        return self.real + (1.j) * self.imag

Inheriting from ctypes.Structure enables the ctypes marshalling magic, which is performed according to the _fields_ member; the constructor and the extra method are just for ease of use on the Python side.

Then, we have to tell ctypes the argument and return types

_sum_it_cplx = cdll.sum_it_cplx
_sum_it_cplx.argtypes = [ctypes.c_void_p, ctypes.c_int]
_sum_it_cplx.restype = c_complex

and finally write our wrapper, in a similar fashion to the previous one:

def sum_it_cplx(l):
    if isinstance(l, np.ndarray) and l.dtype == np.complex128 and l.ndim == 1 and l.flags.c_contiguous:
        # the numpy array layout for complexes (sequence of two double) is already
        # compatible with std::complex (see https://stackoverflow.com/a/5020268/214671)
        a = l.ctypes.data
    else:
        # otherwise, try to build our c_complex
        arr_t = c_complex * len(l)
        a = arr_t(*(c_complex(r) for r in l))
    ret = _sum_it_cplx(a, len(l))
    return ret.to_complex()

Testing it as above

# from a complex list (with copy)
print(summer.sum_it_cplx([1. + 0.j, 0 + 1.j, 2 + 2.j]))
# from a numpy array of the right type - zero-copy
print(summer.sum_it_cplx(np.array([1. + 0.j, 0 + 1.j, 2 + 2.j])))

yields the expected result, (3+3j), in both cases.
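The layout assumption behind this approach (one complex value is two consecutive doubles) can be checked without NumPy or a compiled library. In the minimal sketch below, CComplex is a hypothetical stand-in for the c_complex structure, defined without the custom constructor, and array.array stands in for the NumPy buffer.

```python
import array
import ctypes


class CComplex(ctypes.Structure):
    # Two consecutive doubles: real part first, then imaginary part.
    _fields_ = [("real", ctypes.c_double), ("imag", ctypes.c_double)]


# Interleaved (real, imag) pairs for 1+0j, 0+1j and 2+2j.
interleaved = array.array("d", [1.0, 0.0, 0.0, 1.0, 2.0, 2.0])
count = len(interleaved) // 2

# Reinterpret the same memory as an array of CComplex, without copying.
view = (CComplex * count).from_buffer(interleaved)

values = [complex(c.real, c.imag) for c in view]
total = sum(values, 0j)
print(ctypes.sizeof(CComplex), values, total)
```

Because the structure is exactly two doubles wide, the same buffer can be read either as doubles or as complex values, which is what lets the C++ function take the NumPy data pointer directly.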

Answer 2

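Note added in edit. As mentioned in the comments, Python itself, being an interpreted language, has little potential for computational efficiency. So in order to make Python scripts efficient, one must use modules that are not fully interpreted but call compiled (and optimized) code written in, say, C/C++ under the hood. This is exactly what numpy does for you, in particular for operations on whole arrays.

Therefore, the first step towards efficient Python scripts is to use numpy. Only the second step is to try to use your own compiled (and optimized) code. For that reason, I assume in the example below that you are using numpy to store the array of complex numbers. Everything else would be ill-advised.

There are various ways in which you can access Python's original data from within a C/C++ program. I have personally done this with boost.Python, but must warn you that the documentation and support are lousy at best: you're pretty much on your own (and Stack Overflow, of course).

For example, your C++ file may look like this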

// file.cc
#include <boost/python.hpp>
#include <boost/python/numpy.hpp>

namespace p = boost::python;
namespace n = p::numpy;

n::ndarray func(const n::ndarray&input, double control_variable)
{
  /* 
     your code here, see documentation for boost python
     you pass almost any python variable, doesn't have to be numpy stuff
  */
}

BOOST_PYTHON_MODULE(module_name)
{
  Py_Initialize();
  n::initialize();   // only needed if you use numpy in the interface
  p::def("function", func, "doc-string");
}

To compile this, you may use a Python script such as

# setup.py

# distutils was removed in Python 3.12; setuptools provides the same names
from setuptools import setup, Extension

module_name = Extension(
    'module_name',
    extra_compile_args=['-std=c++11','-stdlib=libc++','-I/some/path/','-march=native'],
    extra_link_args=['-stdlib=libc++'],
    sources=['file.cc'],
    libraries=['boost_python','boost_numpy'])

setup(
    name='module_name',
    version='0.1',
    ext_modules=[module_name])

and run it as python setup.py build, which will create an appropriate .so file in a sub-directory of build, which you can import from Python.

Answer 3

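I see the OP is over a year old now, but I recently addressed a similar problem using the native Python-C/C++ API and its Numpy-C/C++ extension. Since I personally don't enjoy using ctypes for various reasons (e.g. complex number workarounds, messy code), nor Boost, I wanted to post my answer for future searchers.

The documentation for the Python-C API and the Numpy-C API is quite extensive (albeit a little overwhelming at first). But after one or two successes, writing native C/C++ extensions becomes very easy.

Here is an example C++ function that can be called from Python. It integrates a 3D numpy array of either real or complex (numpy.double or numpy.cdouble) type. The function will be imported through a DLL (.so) via the module cintegrate.so.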

#include "Python.h"
#include "numpy/arrayobject.h"
#include <math.h>

static PyObject * integrate3(PyObject * module, PyObject * args)
{
    PyObject * argy=NULL;        // Regular Python/C API
    PyArrayObject * yarr=NULL;   // Extended Numpy/C API
    double dx,dy,dz;

    // "O" format -> read argument as a PyObject type into argy (Python/C API)
    if (!PyArg_ParseTuple(args, "Oddd", &argy,&dx,&dy,&dz))
    {
        PyErr_SetString(PyExc_ValueError, "Error parsing arguments.");
        return NULL;
    }

    // Determine if it's a complex number array (Numpy/C API)
    int iscomplex = PyTypeNum_ISCOMPLEX(PyArray_ObjectType(argy, NPY_DOUBLE));
    // Always convert to double precision, because the loops below read doubles
    int DTYPE = iscomplex ? NPY_CDOUBLE : NPY_DOUBLE;

    // parse python object into numpy array (Numpy/C API)
    yarr = (PyArrayObject *)PyArray_FROM_OTF(argy, DTYPE, NPY_ARRAY_IN_ARRAY);
    if (yarr==NULL) {
        // conversion failed; the exception has already been set
        return NULL;
    }

    //just assume this for 3 dimensional array...you can generalize to N dims
    if (PyArray_NDIM(yarr) != 3) {
        Py_CLEAR(yarr);
        PyErr_SetString(PyExc_ValueError, "Expected 3 dimensional integrand");
        return NULL;
    }

    npy_intp * dims = PyArray_DIMS(yarr);
    npy_intp i,j,k;
    double * p;

    //initialize variable to hold result
    Py_complex result = {0.0, 0.0};

    if (iscomplex) {
        for (i=0;i<dims[0];i++)
            for (j=0;j<dims[1];j++)
                for (k=0;k<dims[2];k++) {
                    // a complex element is stored as two consecutive doubles
                    p = (double*)PyArray_GETPTR3(yarr, i,j,k);
                    result.real += *p;
                    result.imag += *(p+1);
                }
    } else {
        for (i=0;i<dims[0];i++)
            for (j=0;j<dims[1];j++)
                for (k=0;k<dims[2];k++) {
                    p = (double*)PyArray_GETPTR3(yarr, i,j,k);
                    result.real += *p;
                }
    }

    //multiply by step size
    result.real *= (dx*dy*dz);
    result.imag *= (dx*dy*dz);

    Py_CLEAR(yarr);

    //copy result into returnable type with new reference
    if (iscomplex) {
        return Py_BuildValue("D", &result);
    } else {
        return Py_BuildValue("d", result.real);
    }

}

Simply put that into a source file (we'll call it cintegrate.cxx) along with the module definition stuff at the bottom:

static PyMethodDef cintegrate_Methods[] = {
    {"integrate3",  integrate3, METH_VARARGS,
     "Pass 3D numpy array (double or complex) and dx,dy,dz step size. Returns Riemann integral"},
    {NULL, NULL, 0, NULL}        /* Sentinel */
};


static struct PyModuleDef module = {
   PyModuleDef_HEAD_INIT,
   "cintegrate",   /* name of module */
   NULL, /* module documentation, may be NULL */
   -1,       /* size of per-interpreter state of the module,
                or -1 if the module keeps state in global variables. */
   cintegrate_Methods
};

/* Module entry point: Python looks up PyInit_<module name> on import */
PyMODINIT_FUNC PyInit_cintegrate(void)
{
    import_array();   /* must run before any other Numpy/C API call */
    return PyModule_Create(&module);
}

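Then compile that via setup.py much like Walter's boost example, with just a couple of obvious changes: replace file.cc there with our file cintegrate.cxx, remove the boost dependencies, and make sure the path to "numpy/arrayobject.h" is included.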
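A minimal sketch of what such a setup.py might look like, assuming setuptools and NumPy are installed. numpy.get_include() returns the directory that contains numpy/arrayobject.h; the module and file names follow the ones used in this answer, and compiler flags are left at their defaults.

```python
# setup.py (sketch; adjust compiler flags for your toolchain)
import numpy
from setuptools import setup, Extension

cintegrate = Extension(
    "cintegrate",
    sources=["cintegrate.cxx"],
    include_dirs=[numpy.get_include()],  # makes "numpy/arrayobject.h" findable
)

setup(
    name="cintegrate",
    version="0.1",
    ext_modules=[cintegrate],
)
```

Running python setup.py build_ext --inplace places the compiled module next to the source file, so that import cintegrate works from that directory.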

In Python you can then use it like this:

import cintegrate
import numpy as np

arr = np.random.randn(4,8,16) + 1j*np.random.randn(4,8,16)

# arbitrary step size dx = 1., dy = 0.5, dz = 0.25
ans = cintegrate.integrate3(arr, 1.0, 0.5, .25)

This specific code hasn't been tested, but is mostly copied from working code.
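Since the extension code above is untested, a slow pure-Python reference can help check its output on small inputs. The sketch below is not part of the extension; integrate3_reference is a hypothetical helper that computes the same quantity (the sum of all elements times dx*dy*dz) over a nested list, which also accepts the result of arr.tolist() for a NumPy array.

```python
def integrate3_reference(values, dx, dy, dz):
    """Riemann sum over a 3-level nested sequence of real or complex numbers."""
    total = 0
    for plane in values:
        for row in plane:
            for v in row:
                total += v
    return total * (dx * dy * dz)


# 2 x 2 x 2 test grids with known sums
real_grid = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
cplx_grid = [[[1 + 2j, 1 + 2j], [1 + 2j, 1 + 2j]], [[1 + 2j, 1 + 2j], [1 + 2j, 1 + 2j]]]

real_result = integrate3_reference(real_grid, 1.0, 0.5, 0.25)
cplx_result = integrate3_reference(cplx_grid, 1.0, 0.5, 0.25)
print(real_result, cplx_result)
```

Comparing cintegrate.integrate3(arr, dx, dy, dz) against this reference on a small array, up to floating-point rounding, is a quick way to catch indexing mistakes in the C loops.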

Passing large complex arrays from Python to C++ - what's my best option?
