I'm using Ax for BO, which ultimately runs on PyTorch through BoTorch and GPyTorch. At some point, calling loss.backward() raises the following error:
File ".../model.py", line 496, in train_loop
loss.backward()
File ".../torch/tensor.py", line 166, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File ".../torch/autograd/__init__.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: at::cuda::blas::gemm<float> argument ldb must be positive and less than 2147483647 but got 0
Looking at the source file CUDABlas.cpp, the error corresponds to this positive-integer check:
#define CUDABLAS_POSINT_CHECK(FD, X)         \
  TORCH_CHECK(                               \
      (X > 0 && X <= INT_MAX),               \
      "at::cuda::blas::" #FD " argument " #X \
      " must be positive and less than ",    \
      INT_MAX,                               \
      " but got ",                           \
      X)
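
Since the check only fires when the argument is not positive, I assume one of the matrices that ends up in a cuBLAS gemm call has a zero-sized dimension somewhere. Below is a minimal sketch of the kind of shape check I could drop into the training loop right before loss.backward(); the names (model, train_x, train_y) are placeholders, not my actual code:

import torch

def report_zero_sized_tensors(named_tensors):
    # Flag any tensor with a zero-sized dimension; a 0 here could end up
    # as the ldb argument of a cuBLAS gemm call and trip the check above.
    for name, t in named_tensors:
        if isinstance(t, torch.Tensor) and 0 in t.shape:
            print(f"{name}: shape {tuple(t.shape)} has a zero-sized dimension")

# Hypothetical usage inside train_loop, just before loss.backward():
# report_zero_sized_tensors(model.named_parameters())
# report_zero_sized_tensors([("train_x", train_x), ("train_y", train_y)])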
Still, I don't know where the error comes from. I suspect some of the search-space variables, but I can't see anything wrong with them (or, what amounts to the same thing, the problem doesn't seem to be in Ax, BoTorch, or GPyTorch themselves). Has anyone run into this error? How did you solve it?