Identify contiguous segments of a non-contiguous numpy array

Time: 2015-07-28 15:50:31

Tags: python numpy pycuda

In the example below, we have a contiguous array, and a view of the same array that is non-contiguous:

import numpy as np

shape = (5, 100)
A = np.arange(np.prod(shape)).reshape(shape)

# Everything is contiguous at this point
assert A.flags.c_contiguous

# Slicing the last (fastest-varying) axis
# leaves 5 segments of 50 contiguous
# elements, so the view as a whole is
# not contiguous
assert not A[:, 20:70].flags.c_contiguous

# Each segment is contiguous
assert A[0, 20:70].flags.c_contiguous
assert A[1, 20:70].flags.c_contiguous
assert A[2, 20:70].flags.c_contiguous
assert A[3, 20:70].flags.c_contiguous
assert A[4, 20:70].flags.c_contiguous

The view references 5 segments of 50 contiguous elements.
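
The segment layout can be read directly off the view's strides. A minimal sketch, assuming the same `A` as above (the names `view`, `offsets` and `segment_nbytes` are introduced here only for illustration):

```python
import numpy as np

A = np.arange(5 * 100).reshape(5, 100)
view = A[:, 20:70]

# The stride along axis 1 equals the item size, so each row of the
# view is one contiguous run of 50 elements.
assert view.strides[1] == view.itemsize

# The stride along axis 0 is a full row of A (100 elements), so
# consecutive runs are separated by a gap of 50 elements.
assert view.strides[0] == 100 * view.itemsize

# Byte offset of each segment, relative to the view's first element.
offsets = [row * view.strides[0] for row in range(view.shape[0])]
segment_nbytes = view.shape[1] * view.itemsize
```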

If we look at a more general case:

shape = (30, 20, 70, 50)
B = np.arange(np.prod(shape)).reshape(shape)

then the following holds:

assert B[0,:,:,:].flags.c_contiguous
assert B[0,0,:,:].flags.c_contiguous
assert B[0,0,0,:].flags.c_contiguous

A partial solution, based on _UpdateContiguousFlags in NumPy's flagsobject.c, might be:

def is_contiguous(ary):
    is_c_contig = True
    sd = ary.itemsize
    for i in reversed(range(ary.ndim)):
        dim = ary.shape[i]

        # Contiguous by default
        if dim == 0:
            return True

        if dim != 1:
            if ary.strides[i] != sd:
                is_c_contig = False

            sd *= dim

    return is_c_contig

However, I don't think this handles cases where the array is transposed, e.g.:

B.transpose(3,0,1,2)
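
To illustrate why the transposed case is awkward, the check below (a small sketch using the same `B`; `Bt` is a name introduced here) shows that the contiguity flags alone say nothing useful: the view is reported as neither C- nor F-contiguous, even though it covers the same single block of memory as `B`.

```python
import numpy as np

shape = (30, 20, 70, 50)
B = np.arange(np.prod(shape)).reshape(shape)
Bt = B.transpose(3, 0, 1, 2)

# The transposed view is neither C- nor F-contiguous ...
assert not Bt.flags.c_contiguous
assert not Bt.flags.f_contiguous

# ... yet it starts at the same address as B and uses the same set of
# strides (only permuted), so it spans exactly the same memory block.
assert Bt.ctypes.data == B.ctypes.data
assert sorted(Bt.strides) == sorted(B.strides)
assert Bt.nbytes == B.nbytes
```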

Question: Is there a method of identifying the contiguous segments in a general NumPy view/array?
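
One possible approach, offered as a sketch rather than a definitive answer, is to ignore the axis order entirely: sort the axes by absolute stride, merge them from the innermost outwards for as long as each stride equals the size of the run built so far, and enumerate the remaining axes to get the segment start offsets. The function name `contiguous_segments` is hypothetical (it is not a NumPy API), and the sketch assumes the view has no overlapping elements (for example, none created with `as_strided`).

```python
import itertools
import numpy as np

def contiguous_segments(ary):
    """Return (segment_nbytes, offsets) for a NumPy array or view.

    Offsets are in bytes, relative to the view's first element.
    Assumes elements do not overlap in memory.
    """
    if ary.size == 0:
        return 0, []

    start = 0
    axes = []
    for dim, stride in zip(ary.shape, ary.strides):
        # Length-1 axes and broadcast (stride 0) axes add no new memory.
        if dim == 1 or stride == 0:
            continue
        # Normalise negative strides: move the start to the lowest address.
        if stride < 0:
            start += (dim - 1) * stride
            stride = -stride
        axes.append((stride, dim))
    axes.sort()  # innermost (smallest stride) first

    run = ary.itemsize
    outer = []
    for stride, dim in axes:
        if not outer and stride == run:
            run *= dim                   # axis extends the contiguous run
        else:
            outer.append((stride, dim))  # axis separates segments

    offsets = sorted(
        start + sum(i * s for i, (s, _) in zip(idx, outer))
        for idx in itertools.product(*(range(d) for _, d in outer))
    )
    return run, offsets

A = np.arange(5 * 100).reshape(5, 100)
nbytes_A, offsets_A = contiguous_segments(A[:, 20:70])      # 5 segments

B = np.arange(3 * 4 * 5 * 6).reshape(3, 4, 5, 6)
nbytes_Bt, offsets_Bt = contiguous_segments(B.transpose(3, 0, 1, 2))  # 1 segment
```

Adding each offset to `ary.ctypes.data` gives the host address of the corresponding segment, which is the pair (address, size) that a pinning call would need.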

I need this functionality in order to transfer data from a NumPy array to a GPU without first copying it into pinned memory. Instead, I'd like to pin the contiguous segments in place using cudaHostRegister and then transfer them to the GPU. I'm aware of CUDA's support for pitched memory via cudaMemcpy3D, but I'd like to handle more general cases.

Edit 1: Added a basic solution to the question, and asked about the transpose case.

Edit 2: Clarified removing the need to copy data into pinned memory.

0 answers:

No answers