Torch code produces a CUDA runtime error

Time: 2017-06-20 19:01:44

Tags: runtime pytorch

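A friend of mine implemented a working sparse version of torch.bmm, but when I try to test it I get a runtime error (unrelated to that implementation) that I don't understand. I have seen a few threads about this error but could not find a solution. Here are the code and the error: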

if __name__ == "__main__":
     tmp = torch.zeros(1).cuda()
     batch_csr = BatchCSR()
     sparse_bmm = SparseBMM()

     i=torch.LongTensor([[0,5,8], [1,5,8], [2,5,8]])
     v=torch.FloatTensor([4,3,8])
     s=torch.Size([3,500,500])

     indices, values, size = i,v,s

     a_ = torch.sparse.FloatTensor(indices, values, size).cuda().transpose(2, 1)
     batch_size, num_nodes, num_faces = a_.size()

     a = a_.to_dense()

     for _ in range(10):
        b = torch.randn(batch_size, num_faces, 16).cuda()
        torch.cuda.synchronize()
        time1 = time.time()
        result = torch.bmm(a, b)
        torch.cuda.synchronize()
        time2 = time.time()
        print("{} CuBlas dense bmm".format(time2 - time1))

        torch.cuda.synchronize()
        time1 = time.time()
        col_ind, col_ptr = batch_csr(a_.indices(), a_.size())
        my_result = sparse_bmm(a_.values(), col_ind, col_ptr, a_.size(), b)
        torch.cuda.synchronize()
        time2 = time.time()
        print("{} My sparse bmm".format(time2 - time1))

        print("{} Diff".format((result-my_result).abs().max()))

Error:

Traceback (most recent call last):
  File "sparse_bmm.py", line 72, in <module>
    b = torch.randn(3, 500, 16).cuda()
  File "/home/bizeul/virtual_env/lib/python2.7/site-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorCopy.c:18

When I run it with CUDA_LAUNCH_BLOCKING=1, I get this error instead:

/b/wheel/pytorch-src/torch/lib/THC/THCTensorIndex.cu:121: void indexAddSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2]: block: [0,0,0], thread: [0,0,0] Assertion `dstIndex < dstAddDimSize` failed.
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THCS/generic/THCSTensorMath.cu line=292 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "sparse_bmm.py", line 69, in <module>
    a = a_.to_dense()
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THCS/generic/THCSTensorMath.cu:292

1 answer:

Answer 0 (score: 1)

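The indices you are passing to create the sparse tensor are incorrect. Each column of the index tensor (not each row) is the coordinate of one nonzero value, so your index tensor places values at (0, 1, 2), (5, 5, 5) and (8, 8, 8). The first coordinates 5 and 8 fall outside the first dimension, which has size 3.

This is how it should be: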

i = torch.LongTensor([[0, 1, 2], [5, 5, 5], [8, 8, 8]])
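The corrected index tensor is the transpose of a list of per-value coordinates. As a minimal sketch in plain Python (nested lists rather than torch tensors, so it runs without a GPU), the conversion could look like this; `coords_to_indices` is a hypothetical helper name, not part of PyTorch:

```python
def coords_to_indices(coords):
    """Turn a list of per-value coordinates into the one-row-per-dimension
    layout expected for a sparse index tensor."""
    # zip(*coords) transposes: row d collects coordinate d of every value.
    return [list(dim) for dim in zip(*coords)]


# The three nonzero positions intended in the question: (batch, row, col).
coords = [(0, 5, 8), (1, 5, 8), (2, 5, 8)]
indices = coords_to_indices(coords)
print(indices)
```

The resulting nested list can then be wrapped in torch.LongTensor, as in the line above.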

How to create a sparse tensor:

Let's take a simpler example. Say we want the following tensor:

  0   0   0   2   0
  0   0   0   0   0
  0   0   0   0  20
[torch.cuda.FloatTensor of size 3x5 (GPU 0)]

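As you can see, the number 2 needs to be at position (0, 3) of the sparse tensor, and the number 20 needs to be at position (2, 4).

To create it, our index tensor should look like this (the first row holds the row indices, the second row the column indices):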

[[0 , 2],
 [3 , 4]]

And now, the code to create the sparse tensor above:

i=torch.LongTensor([[0, 2], [3, 4]])
v=torch.FloatTensor([2, 20])
s=torch.Size([3, 5])
a_ = torch.sparse.FloatTensor(i, v, s).cuda()
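
One way to confirm that an index tensor describes the tensor you intend is to expand it by hand. The following is a rough sketch in plain Python, using nested lists instead of torch tensors so that it needs no GPU. `coo_to_dense` is a hypothetical helper, not a PyTorch function, and it only handles the 2-D case:

```python
def coo_to_dense(indices, values, size):
    """Expand a 2-D (indices, values, size) description into nested lists."""
    rows, cols = size
    dense = [[0.0] * cols for _ in range(rows)]
    # Column k of `indices` is the (row, col) coordinate of values[k].
    for k, value in enumerate(values):
        r, c = indices[0][k], indices[1][k]
        dense[r][c] += value  # repeated coordinates accumulate
    return dense


i = [[0, 2], [3, 4]]
v = [2.0, 20.0]
s = (3, 5)
dense = coo_to_dense(i, v, s)
for row in dense:
    print(row)
```

The printed rows should match the 3x5 tensor shown earlier, with 2 at (0, 3) and 20 at (2, 4).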

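A few more comments about the CUDA assertion error:

Assertion 'dstIndex < dstAddDimSize' failed. tells us that, very likely, one of your indices is out of bounds. So whenever you see it, look for places where you might be passing a wrong index to any tensor. Because CUDA kernels run asynchronously, the Python traceback can point at a later, unrelated line (here the .cuda() call on b); running with CUDA_LAUNCH_BLOCKING=1, as in the question, reports the call that actually failed (a_.to_dense()).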
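
An out-of-range index like this can be caught on the CPU before anything is sent to the GPU. Below is a minimal sketch, again in plain Python with nested lists; `find_out_of_range` is a hypothetical helper introduced only for illustration:

```python
def find_out_of_range(indices, size):
    """Return (dim, column, index, limit) for every out-of-range entry."""
    problems = []
    # Row `dim` of the index tensor must stay within size[dim].
    for dim, (row, limit) in enumerate(zip(indices, size)):
        for col, idx in enumerate(row):
            if not 0 <= idx < limit:
                problems.append((dim, col, idx, limit))
    return problems


size = (3, 500, 500)
bad = [[0, 5, 8], [1, 5, 8], [2, 5, 8]]   # indices from the question
good = [[0, 1, 2], [5, 5, 5], [8, 8, 8]]  # corrected indices

bad_report = find_out_of_range(bad, size)
good_report = find_out_of_range(good, size)
print(bad_report)
print(good_report)
```

For the question's indices the report lists the entries 5 and 8 in dimension 0, whose size is 3; the corrected indices produce an empty report.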