Question

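Here is my code snippet:

data / imat is a 100000 x 500 data matrix, and the matrix S that I construct has dimensions 50000 x 100000. S is extremely sparse, though: it has only one entry per column.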

def getSparseCovErr(imat, sketch):
   ATA = np.dot(imat.transpose(), imat)
   BTB = sketch.transpose().dot(sketch)
   fn = np.linalg.norm(imat, 'fro') ** 2
   val = np.linalg.norm(ATA - BTB , 2)/fn
   del ATA
   del BTB
   return val

nrows, ncols = data.shape
samples = noOfSamples(ncols, eps, delta)

cols = np.arange(nrows)
rows = np.random.random_integers(samples - 1, size = nrows)
diag = []
for i in range(len(cols)):
    if np.random.random() < 0.5:
        diag.append(1)
    else:
        diag.append(-1)
S = sparse.csc_matrix((diag, (rows, cols)), shape = (samples, nrows))/np.sqrt(samples)
Q = S.dot(data)

Q = sparse.bsr_matrix(Q)

print getSparseCovErr(data, Q)

The first time I run the code above, it produces the output of the print statement. If I then run it again, I get the following error:

python: malloc.c:2369: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.

Then, if I run it once more, I get something like this:

    Q = sparse.bsr_matrix(Q)
  File "/usr/lib64/python2.7/site-packages/scipy/sparse/bsr.py", line 170, in __init__
    arg1 = coo_matrix(arg1, dtype=dtype).tobsr(blocksize=blocksize)
  File "/usr/lib64/python2.7/site-packages/scipy/sparse/coo.py", line 186, in __init__
    self.data  = M[self.row, self.col]
IndexError: index -1517041769959067988 is out of bounds for axis 0 with size 178133
None

It looks to me as if the first run is creating a memory problem. How can I debug this, and what are the likely causes and solutions?

Answer 1

Would this help?

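import numpy as np
from scipy import sparse

def getSparseCovErr(imat, sketch):
    # Spectral norm (ord=2), as in the question; the temporaries are released on return.
    return np.linalg.norm(np.dot(imat.transpose(), imat) - sketch.transpose().dot(sketch), 2) / (np.linalg.norm(imat, 'fro') ** 2)

def getQ(data, rows, cols, diag, samples, nrows):
    # S is built and discarded inside this call; only the sketch Q is returned.
    return sparse.bsr_matrix((sparse.csc_matrix((diag, (rows, cols)), shape = (samples, nrows))/np.sqrt(samples)).dot(data))

print(getSparseCovErr(data, getQ(data, rows, cols, diag, samples, nrows)))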

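That is, let things go out of scope as soon as possible. There may be some bracket errors, since this is hard to test without having the functions.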
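To illustrate why scoping helps, here is a minimal sketch using only the standard library. The `Buffer` class and `compute` function are hypothetical stand-ins for a large intermediate array and for a function like `getQ`; they are not part of the original code. It relies on CPython's reference counting, which frees a local object as soon as the function returns, provided nothing else refers to it.

```python
import gc
import weakref


class Buffer(object):
    """Hypothetical stand-in for a large intermediate array."""

    def __init__(self, n):
        self.payload = bytearray(n)


tracker = []  # holds weak references only, so it keeps nothing alive


def compute(n):
    tmp = Buffer(n)                    # large temporary, local to this call
    tracker.append(weakref.ref(tmp))   # observe tmp without owning it
    return len(tmp.payload)            # only a small result escapes


result = compute(10 * 1024 * 1024)
gc.collect()                           # not needed in CPython, harmless elsewhere
temp_freed = tracker[0]() is None      # True once the temporary is gone
print(result, temp_freed)
```

If `tmp` were instead assigned to a module-level name, the weak reference would stay alive until that name was deleted or rebound.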

If that does not help, I would assume that one of the functions you are calling is actually changing state or holding on to data.

With the original code, and assuming you are using ipython, you can do the following:

In [5]: %%bash
ps -e -orss=,args= | sort -b -k1,1n | pr -TW$COLUMNS | tail -n 10

This lets you monitor the memory allocation at each step of your code and pin down where the problem occurs.
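If you would rather record memory from inside the script than shell out to `ps`, a rough alternative is the standard-library `resource` module (Unix only). This is a sketch, not a replacement for a real profiler: `peak_rss` and `log_step` are illustrative helper names, the `bytearray` stands in for building `S` or `Q`, and `ru_maxrss` reports the peak so far (kilobytes on Linux, bytes on macOS), so it never decreases.

```python
import resource


def peak_rss():
    """Peak resident set size of this process so far (platform-dependent units)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def log_step(label, log):
    """Append (label, peak RSS) so that steps can be compared afterwards."""
    log.append((label, peak_rss()))


log = []
log_step("start", log)
block = bytearray(50 * 1024 * 1024)   # stand-in for a large array allocation
log_step("after allocation", log)
del block
log_step("after del", log)

for label, rss in log:
    print("%-18s %d" % (label, rss))
```

Calling `log_step` after each line of the original script (building `S`, computing `Q`, converting to BSR, computing the error) shows which step raises the peak.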

Memory issues with large numpy / scipy arrays
