Creating a list of numpy.ndarray of unequal length in Cython

Time: 2014-05-24 03:49:09

Tags: python arrays object numpy cython

I have Python code that creates a list of ndarrays whose lengths are not equal. The snippet looks like this:

import numpy as np
from mymodule import list_size, array_length # list_size is an int and array_length is a list of ints, with len(array_length) == list_size

ndarray_list = []

for i in range(list_size):
    ndarray_list.append(np.zeros(array_length[i]))

Now I need to convert this to Cython, but I don't know how. I tried to create a dynamically allocated 2-D array, like this:

import numpy as np
cimport numpy as np
from libc.stdlib cimport malloc
from mymodule import list_size, array_length

cdef int i
cdef double **ndarray_list
ndarray_list = <double **>malloc(list_size * sizeof(double*))
for i in range(list_size):
    ndarray_list[i] = <double *>malloc(array_length[i] * sizeof(double))

However, this approach only gives me a raw double* pointer in ndarray_list[i]. I cannot pass it to other functions that need ndarray methods.

What should I do?

2 Answers:

Answer 0: (score: 4)

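To pass a C double* buffer to a function that expects a numpy.ndarray, you can create a temporary buffer (a Cython view.array) and point it at the memory address of the double* array, so no data is copied.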

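This malloc()-based solution is orders of magnitude faster than the other answer, which is based on NumPy buffers. Note how the inner arrays are free()'d one by one before the outer array, to avoid memory leaks.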

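import numpy as np
cimport numpy as np
from cython cimport view
from libc.stdlib cimport malloc, free

cdef int i, j
cdef double test
cdef int list_size = 10
cdef double **ndarray_list
cdef int *array_length
cdef view.array buff

# Allocate the outer pointer table and the table of row lengths.
ndarray_list = <double **>malloc(list_size * sizeof(double*))
array_length = <int *>malloc(list_size * sizeof(int))  # sizeof(int), not sizeof(int*)

# Allocate and fill each inner array; row i holds i + 1 doubles.
for i in range(list_size):
    array_length[i] = i+1
    ndarray_list[i] = <double *>malloc(array_length[i] * sizeof(double))
    for j in range(array_length[i]):
        ndarray_list[i][j] = j

# Element access through the raw pointers.
for i in range(list_size):
    for j in range(array_length[i]):
        test = ndarray_list[i][j]

# Wrap each double* in a temporary Cython array so NumPy functions accept it.
for i in range(list_size):
    buff = <double[:array_length[i]]>ndarray_list[i]
    print(np.sum(buff))

#...

# Free every inner array first, then the outer tables.
for i in range(list_size):
    free(ndarray_list[i])
free(ndarray_list)
free(array_length)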
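The zero-copy idea behind this answer can also be tried in plain Python, without compiling anything. The following is only a minimal sketch under stated assumptions: ctypes arrays stand in for the malloc()'d double* blocks, and np.frombuffer plays the role of the temporary buffer. The names lengths, raw_rows, views and sums are hypothetical and do not appear in the answer.

```python
import ctypes
import numpy as np

# Hypothetical row lengths, mirroring array_length[i] = i + 1 in the answer.
lengths = [i + 1 for i in range(4)]

# One ctypes double array per row stands in for each malloc()'d double* block.
raw_rows = [(ctypes.c_double * n)(*range(n)) for n in lengths]

# np.frombuffer wraps the existing memory of each row; no data is copied.
views = [np.frombuffer(row, dtype=np.float64) for row in raw_rows]

# A write through the raw buffer is visible through the ndarray view,
# which shows that both refer to the same memory.
raw_rows[3][0] = 42.0
sums = [float(v.sum()) for v in views]
print(sums)
```

One difference from the Cython version is worth keeping in mind: here each NumPy view keeps its ctypes array alive, whereas a buffer cast from a malloc()'d pointer does not own the memory by default, so the free() calls must happen only after the last use of the buffer.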

Answer 1: (score: 2)

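You can use the object type together with NumPy-based buffers. To fill ndarray_list efficiently you only need one object buffer, but note that the many calls to np.zeros() may cause some slowdown: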

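import numpy as np
cimport numpy as np

# Note: typed ndarray buffers like these must be declared inside a function.
cdef int i, list_size
cdef np.ndarray[np.int_t, ndim=1] array_length
cdef np.ndarray[object, ndim=1] ndarray_list

list_size = 10000
# np.int was removed in NumPy 1.24; np.int_ is used here instead.
array_length = np.arange(list_size).astype(np.int_)+1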

ndarray_list = np.empty(list_size, dtype=object)
for i in range(list_size):
    ndarray_list[i] = np.zeros(array_length[i], dtype=np.float64)
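The container itself can be checked from plain Python before adding the Cython type declarations. This is a minimal sketch, assuming a small list_size of 5 for illustration; the shapes variable is hypothetical and only there to inspect the result.

```python
import numpy as np

list_size = 5  # small illustrative size; the answer uses 10000
array_length = np.arange(list_size) + 1

# A 1-D object array whose elements are independent float64 arrays.
ndarray_list = np.empty(list_size, dtype=object)
for i in range(list_size):
    ndarray_list[i] = np.zeros(array_length[i], dtype=np.float64)

# Each element keeps its own length.
shapes = [a.shape[0] for a in ndarray_list]
print(shapes)
```

Allocating with np.empty(..., dtype=object) and filling element by element is deliberate: passing a list of arrays directly to np.array(..., dtype=object) produces a 2-D object array when all the inner arrays happen to have the same length.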

To access the inner arrays efficiently, another 1-D buffer is needed:

cdef np.ndarray[np.float64_t, ndim=1] inner_array
cdef double test
cdef int j

for i in range(list_size):
    inner_array = ndarray_list[i]
    for j in range(inner_array.shape[0]):
        test = inner_array[j]