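I am a beginner and this is my first time building a model in PyTorch. I am trying to create a convolutional autoencoder, and I get an error when I run the model. The code I am using is: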
    import torch
    import torch.nn as nn
    import torch.optim as optim
    import torchvision.transforms.functional as TF
    from PIL import Image
    from torch.utils.data import DataLoader, Dataset
    from torchvision import transforms

    class MyDataset(Dataset):
        def __init__(self, image_paths, target_paths, train=True):
            self.image_paths = image_paths
            self.target_paths = target_paths

        def transform(self, image, target):
            # Resize, convert to single-channel grayscale, then to tensor
            resize = transforms.Resize(size=(2350, 1650))
            image = resize(image)
            target = resize(target)
            grayscale = transforms.Grayscale(1)
            image = grayscale(image)
            target = grayscale(target)
            image = TF.to_tensor(image)
            target = TF.to_tensor(target)
            return image, target

        def __getitem__(self, index):
            image = Image.open(self.image_paths[index])
            target = Image.open(self.target_paths[index])
            x, y = self.transform(image, target)
            return x, y

        def __len__(self):
            return len(self.image_paths)

    traindata = MyDataset(image_paths=train_data, target_paths=target_data, train=True)
    testdata = MyDataset(image_paths=test_data, target_paths=None, train=False)
    train_loader = DataLoader(traindata, batch_size=1, shuffle=True, num_workers=4)
    test_loader = DataLoader(testdata, batch_size=1, shuffle=False, num_workers=4)

    class ConvolutionalAutoEncoder(nn.Module):
        def __init__(self):
            super(ConvolutionalAutoEncoder, self).__init__()
            self.encoder_block1 = nn.Sequential(
                nn.Conv2d(1, 64, 3, stride=1, padding=1),
                nn.ReLU(True),
                nn.Conv2d(64, 64, 3, stride=1, padding=1),
                nn.ReLU(True)
            )
            self.decoder_block1 = nn.Sequential(
                nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
                nn.ReLU(True),
                nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
                nn.ReLU(True)
            )
            self.decoder_block0 = nn.Sequential(
                nn.ConvTranspose2d(64, 1, 3, stride=1, padding=1),
                nn.Sigmoid()
            )

        def forward(self, x):
            x1 = self.encoder_block1(x)
            y1 = self.decoder_block1(x1)
            y0 = self.decoder_block0(y1)
            return x

    device = torch.device("cuda:2" if torch.cuda.is_available() else "cpu")
    print(device)
    model = ConvolutionalAutoEncoder().to(device)

    # Loss and optimizer
    learning_rate = 0.001
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    params = list(model.parameters())
    print(len(params))
    print(params[0].size())  # conv1's .weight

    num_epochs = 30
    total_step = len(train_loader)
    for epoch in range(num_epochs):
        running_loss = 0.0  # accumulated loss for this epoch
        for batch_idx, data in enumerate(train_loader):
            inp, targ = data
            inp = inp.to(device)
            targ = targ.to(device)

            output = model(inp)
            loss = criterion(output, targ)

            model.zero_grad()
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()
            if (batch_idx+1) % 10 == 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                      .format(epoch+1, num_epochs, batch_idx+1, total_step, loss.item()))

The full error I get is:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-17-28fa0c94d845> in <module>
13
14 model.zero_grad()
---> 15 loss.backward()
16 optimizer.step()
17
~/anaconda3/envs/gautam_new/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
91 products. Defaults to ``False``.
92 """
---> 93 torch.autograd.backward(self, gradient, retain_graph, create_graph)
94
95 def register_hook(self, hook):
~/anaconda3/envs/gautam_new/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
88 Variable._execution_engine.run_backward(
89 tensors, grad_tensors, retain_graph, create_graph,
---> 90 allow_unreachable=True) # allow_unreachable flag
91
92
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Please help. Also, if possible, I would appreciate suggestions on how to make my model deeper. I keep getting CUDA out-of-memory errors.
Thanks.
Answer 0 (score: 1)
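I cannot test your model, but judging from the error message, the cause of your problem is the return value of your forward function.
Currently you return x, which is your actual input and not the output: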

    def forward(self, x):
        x1 = self.encoder_block1(x)
        y1 = self.decoder_block1(x1)
        y0 = self.decoder_block0(y1)
        return x

So, to return the output, you probably want to change the return value from x to y0:

    def forward(self, x):
        x1 = self.encoder_block1(x)
        y1 = self.decoder_block1(x1)
        y0 = self.decoder_block0(y1)
        return y0

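To illustrate why returning the input produces exactly this error, here is a minimal sketch. It does not use your model: the two tiny modules ReturnsInput and ReturnsOutput and the 8x8 random tensors are made up for the demonstration. The point is that a tensor coming out of a DataLoader does not require gradients, so a loss computed from it alone is not connected to any model parameter.

```python
import torch
import torch.nn as nn

class ReturnsInput(nn.Module):
    """Hypothetical toy module that repeats the bug: it returns its input."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 3, padding=1)

    def forward(self, x):
        y = self.conv(x)  # computed but never used
        return x          # the input has no link to the parameters

class ReturnsOutput(ReturnsInput):
    """Same toy module, but forward returns the layer output."""
    def forward(self, x):
        return self.conv(x)

# Stand-ins for a batch from a DataLoader (requires_grad is False by default)
inp = torch.rand(1, 1, 8, 8)
targ = torch.rand(1, 1, 8, 8)
criterion = nn.MSELoss()

bad_loss = criterion(ReturnsInput()(inp), targ)
good_loss = criterion(ReturnsOutput()(inp), targ)

print(bad_loss.requires_grad, bad_loss.grad_fn)  # no graph was recorded
print(good_loss.requires_grad)                   # graph reaches conv.weight

try:
    bad_loss.backward()
except RuntimeError as err:
    print("backward failed:", err)

good_loss.backward()  # works, gradients flow into the conv parameters
```

Checking loss.requires_grad right before loss.backward() is a quick way to spot this kind of mistake in your own training loop.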
Regarding memory:
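Please do not put too many questions into one post. Suppose you ask three completely different questions in one post and three people could each solve one of them: you may end up with no answer at all, because none of them can give a complete answer that covers all three.
If you split the post into three separate questions, you may well get three answers that together solve everything. In many cases it also improves the question, because each problem can be described more specifically without writing a whole novel.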
Of course, if your questions are closely related you can put them in one post, but that does not seem to be the case here.
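I guess your forward function still has some side effect that leads to the memory problem (that would be odd, and I am not sure about this). So with some luck the fix above also solves your memory problem; if it does not, you should definitely open a new question for it.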
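As a rough sanity check on the memory side, the arithmetic below estimates how large the intermediate feature maps are for the image size used in the question. This is only a back-of-the-envelope sketch under assumptions of mine: float32 activations, batch size 1, and one stored 64-channel feature map for each of the four 64-channel layers. It ignores weights, gradients, optimizer state and any workspace the convolution kernels allocate, so real usage will differ.

```python
# Back-of-the-envelope estimate, not a measurement.
# Assumptions: float32 activations, batch size 1, and one stored output
# per 64-channel layer (2 in encoder_block1, 2 in decoder_block1).
BYTES_PER_FLOAT32 = 4
height, width = 2350, 1650   # size passed to transforms.Resize in the question
channels = 64
batch_size = 1
num_64ch_layers = 4

bytes_per_map = batch_size * channels * height * width * BYTES_PER_FLOAT32
gib_per_map = bytes_per_map / 2**30
total_gib = num_64ch_layers * gib_per_map

print(f"{gib_per_map:.2f} GiB per 64-channel feature map")
print(f"{total_gib:.2f} GiB for {num_64ch_layers} such feature maps")
```

Under these assumptions a single 64-channel feature map at 2350x1650 is already close to 1 GiB, so the image size alone may explain out-of-memory errors when layers are added, independently of the forward bug.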