MemoryError when performing operations on numpy arrays

Time: 2019-10-05 21:54:54

Tags: python arrays numpy bigdata

I am running a computation that involves several operations (subtraction, squaring, broadcasting) on higher-dimensional numpy arrays. My code raises a MemoryError while performing these operations.

My code is below:

import numpy as np
from skimage.segmentation import find_boundaries

w0 = 10
sigma = 5

def make_weight_map(masks):
    """
    Generate the weight maps as specified in the UNet paper
    for a set of binary masks.

    Parameters
    ----------
    masks: array-like
        A 3D array of shape (n_masks, image_height, image_width),
        where each slice of the matrix along the 0th axis represents one binary mask.

    Returns
    -------
    array-like
        A 2D array of shape (image_height, image_width)

    """
    masks = masks.numpy()
    nrows, ncols = masks.shape[1:]
    masks = (masks > 0).astype(int)
    distMap = np.zeros((nrows * ncols, masks.shape[0]))
    X1, Y1 = np.meshgrid(np.arange(nrows), np.arange(ncols))
    X1, Y1 = np.c_[X1.ravel(), Y1.ravel()].T

    # In the for loop below, I am getting the MemoryError
    for i, mask in enumerate(masks):
        # find the boundary of each mask,
        # compute the distance of each pixel from this boundary
        bounds = find_boundaries(mask, mode='inner')
        X2, Y2 = np.nonzero(bounds)
        xSum = (X2.reshape(-1, 1) - X1.reshape(1, -1)) ** 2
        ySum = (Y2.reshape(-1, 1) - Y1.reshape(1, -1)) ** 2
        distMap[:, i] = np.sqrt(xSum + ySum).min(axis=0)
    ix = np.arange(distMap.shape[0])
    if distMap.shape[1] == 1:
        d1 = distMap.ravel()
        border_loss_map = w0 * np.exp((-1 * (d1) ** 2) / (2 * (sigma ** 2)))
    else:
        if distMap.shape[1] == 2:
            d1_ix, d2_ix = np.argpartition(distMap, 1, axis=1)[:, :2].T
        else:
            d1_ix, d2_ix = np.argpartition(distMap, 2, axis=1)[:, :2].T
        d1 = distMap[ix, d1_ix]
        d2 = distMap[ix, d2_ix]
        border_loss_map = w0 * np.exp((-1 * (d1 + d2) ** 2) / (2 * (sigma ** 2)))
    xBLoss = np.zeros((nrows, ncols))
    xBLoss[X1, Y1] = border_loss_map
    # class weight map
    loss = np.zeros((nrows, ncols))
    w_1 = 1 - masks.sum() / loss.size
    w_0 = 1 - w_1
    loss[masks.sum(0) == 1] = w_1
    loss[masks.sum(0) == 0] = w_0
    ZZ = xBLoss + loss
    return ZZ

The problem can be reproduced with a numpy array of shape (4, 584, 565).

Traceback of the error:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-32-0f30ef7dc24d> in <module>
----> 1 img = make_weight_map(img)

<ipython-input-31-e75a6281476f> in make_weight_map(masks)
     34         xSum = (X2.reshape(-1, 1) - X1.reshape(1, -1)) ** 2
     35         ySum = (Y2.reshape(-1, 1) - Y1.reshape(1, -1)) ** 2
---> 36         distMap[:, i] = np.sqrt(xSum + ySum).min(axis=0)
     37     ix = np.arange(distMap.shape[0])
     38     if distMap.shape[1] == 1:

MemoryError:

With the (4, 584, 565) input, the shapes are:

X1.shape
(329960,)
X2.shape, Y2.shape  # for the first iteration
(15239,) (15239,)

The main problem occurs in the line (X2.reshape(-1, 1) - X1.reshape(1, -1)): X2 is reshaped to (15239, 1) and X1 to (1, 329960), so the subtraction must first broadcast both operands up to a huge (15239, 329960) array.
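For scale, a quick back-of-envelope check (shapes taken from the first iteration above, 8 bytes per int64/float64 element) shows why even 64 GB is not enough, since xSum, ySum, their sum, and the sqrt each materialize an array of this size:

```python
# Size of one broadcast intermediate such as
# X2.reshape(-1, 1) - X1.reshape(1, -1): shape (15239, 329960), int64.
rows, cols = 15239, 329960
gib = rows * cols * 8 / 1024**3
print(f"{gib:.1f} GiB per intermediate")  # ≈ 37.5 GiB
```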

I cannot inspect distMap because my code dies before it is filled in.
Also, just performing the subtraction below with these shapes hangs in the same way:

X2.reshape(-1, 1) - X1.reshape(1, -1)
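One generic way around this (a sketch, assuming only the per-pixel minimum over boundary pixels is ultimately needed, which is what distMap[:, i] stores) is to process the boundary pixels in chunks and keep a running minimum, so only a (chunk, 329960) intermediate exists at any time:

```python
import numpy as np

def min_dist_chunked(X2, Y2, X1, Y1, chunk=1024):
    """Minimum Euclidean distance from each (X1, Y1) pixel to any
    (X2, Y2) boundary pixel.  A running minimum keeps the broadcast
    intermediate at (chunk, len(X1)) instead of (len(X2), len(X1))."""
    best = np.full(X1.shape, np.inf)
    for s in range(0, len(X2), chunk):
        dx = X2[s:s + chunk, None] - X1[None, :]
        dy = Y2[s:s + chunk, None] - Y1[None, :]
        np.minimum(best, (dx * dx + dy * dy).min(axis=0), out=best)
    return np.sqrt(best)
```

The loop body could then use `distMap[:, i] = min_dist_chunked(X2, Y2, X1, Y1)`, reducing peak memory by roughly a factor of len(X2) / chunk.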

I am using a system with 32 GB of RAM, and I have also tried running it on a cloud instance with 64 GB of RAM. I have checked the questions below; either they do not offer a solution to my problem, or I could not apply it to my use case:

Python/Numpy MemoryError
Working with big data in python and numpy, not enough ram, how to save partial results on disc?
Memory growth with broadcast operations in NumPy
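The broadcast can also be avoided entirely. Assuming scipy is available (it is not used in the question), scipy.ndimage.distance_transform_edt computes every pixel's exact Euclidean distance to the nearest boundary pixel in O(n_pixels) memory:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def boundary_distance_map(bounds):
    """Distance from every pixel to the nearest boundary pixel.

    distance_transform_edt assigns each nonzero pixel its Euclidean
    distance to the nearest zero pixel, so invert the boundary mask:
    boundary pixels become 0, everything else nonzero.
    """
    return distance_transform_edt(~bounds.astype(bool))
```

Inside the loop this would become `distMap[:, i] = boundary_distance_map(bounds)[X1, Y1]`, since fancy-indexing with the flattened coordinate arrays reproduces the original pixel ordering.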

0 Answers:

There are no answers.