Epoch: 1 Training Loss: 0.816370 Validation Loss: 0.696534
Validation loss decreased (inf --> 0.696534). Saving model ...
Epoch: 2 Training Loss: 0.507756 Validation Loss: 0.594713
Validation loss decreased (0.696534 --> 0.594713). Saving model ...
Epoch: 3 Training Loss: 0.216438 Validation Loss: 1.119294
Epoch: 4 Training Loss: 0.191799 Validation Loss: 0.801231
Epoch: 5 Training Loss: 0.111334 Validation Loss: 1.753786
Epoch: 6 Training Loss: 0.064309 Validation Loss: 1.348847
Epoch: 7 Training Loss: 0.058158 Validation Loss: 1.839139
Epoch: 8 Training Loss: 0.015489 Validation Loss: 1.370469
Epoch: 9 Training Loss: 0.082856 Validation Loss: 1.701200
Epoch: 10 Training Loss: 0.003859 Validation Loss: 2.657933
Epoch: 11 Training Loss: 0.018133 Validation Loss: 0.593986
Validation loss decreased (0.594713 --> 0.593986). Saving model ...
Epoch: 12 Training Loss: 0.160197 Validation Loss: 1.499911
Epoch: 13 Training Loss: 0.012942 Validation Loss: 1.879732
Epoch: 14 Training Loss: 0.002037 Validation Loss: 2.399405
Epoch: 15 Training Loss: 0.035908 Validation Loss: 1.960887
Epoch: 16 Training Loss: 0.051137 Validation Loss: 2.226335
Epoch: 17 Training Loss: 0.003953 Validation Loss: 2.619108
Epoch: 18 Training Loss: 0.000381 Validation Loss: 2.746541
Epoch: 19 Training Loss: 0.094646 Validation Loss: 3.555713
Epoch: 20 Training Loss: 0.022620 Validation Loss: 2.833098
Epoch: 21 Training Loss: 0.004800 Validation Loss: 4.181845
Epoch: 22 Training Loss: 0.014128 Validation Loss: 1.933705
Epoch: 23 Training Loss: 0.026109 Validation Loss: 2.888344
Epoch: 24 Training Loss: 0.000768 Validation Loss: 3.029443
Epoch: 25 Training Loss: 0.000327 Validation Loss: 3.079959
Epoch: 26 Training Loss: 0.000121 Validation Loss: 3.578420
Epoch: 27 Training Loss: 0.148478 Validation Loss: 3.297387
Epoch: 28 Training Loss: 0.030328 Validation Loss: 2.218535
Epoch: 29 Training Loss: 0.001673 Validation Loss: 2.934132
Epoch: 30 Training Loss: 0.000253 Validation Loss: 3.215722
My loss is not converging. I am working on the Horses vs Humans dataset. There is an official TensorFlow notebook for it that works like a charm, but when I try to replicate the same thing in PyTorch, the loss does not converge. Could you take a look?
I am using criterion = nn.BCEWithLogitsLoss() and optimizer = optim.RMSprop(model.parameters(), lr=0.001). While that seems to have some effect on the training loss, the validation loss looks like random numbers that form no pattern. What are the possible reasons the loss is not converging?
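For context, here is a minimal sketch of one training step with this loss/optimizer pairing (model, images, and labels are assumed to come from the notebook and are not shown here). One thing worth checking: BCEWithLogitsLoss expects float targets with the same shape as the logits, so integer labels of shape (batch,) need to be converted.

import torch
import torch.nn as nn
import torch.optim as optim

criterion = nn.BCEWithLogitsLoss()
optimizer = optim.RMSprop(model.parameters(), lr=0.001)

optimizer.zero_grad()
logits = model(images)  # shape (batch, 1), raw logits (no sigmoid in forward)
# BCEWithLogitsLoss wants float targets shaped like the logits,
# so (batch,) integer labels become (batch, 1) floats here
loss = criterion(logits, labels.float().unsqueeze(1))
loss.backward()
optimizer.step()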
Here is my CNN architecture:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # convolutional layer (sees the 300x300x3 image tensor)
        self.conv1 = nn.Conv2d(3, 16, 3)
        # convolutional layer (sees 149x149x16 tensor)
        self.conv2 = nn.Conv2d(16, 32, 3)
        # convolutional layer (sees 73x73x32 tensor)
        self.conv3 = nn.Conv2d(32, 64, 3)
        # convolutional layer (sees 35x35x64 tensor)
        self.conv4 = nn.Conv2d(64, 64, 3)
        # convolutional layer (sees 16x16x64 tensor)
        self.conv5 = nn.Conv2d(64, 64, 3)
        # max pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # linear layer (64 * 7 * 7 -> 512)
        self.fc1 = nn.Linear(3136, 512)
        # linear layer (512 -> 1)
        self.fc2 = nn.Linear(512, 1)
        # dropout layer (p=0.25)
        self.dropout = nn.Dropout(0.25)

    def forward(self, x):
        # sequence of convolutional and max pooling layers
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = self.pool(F.relu(self.conv4(x)))
        x = self.pool(F.relu(self.conv5(x)))
        # flatten the feature maps
        x = x.view(-1, 64 * 7 * 7)
        x = self.dropout(x)
        # 1st fully connected layer, with relu activation
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        # 2nd fully connected layer (raw logit output)
        x = self.fc2(x)
        return x
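As a quick sanity check (not part of the original notebook), a dummy 300x300 RGB batch can be pushed through the conv/pool stack to confirm the flatten size of 3136 that fc1 expects:

import torch
import torch.nn.functional as F

# Trace the spatial size through the stack, assuming 300x300 input images:
# 300 -> 298 -> 149 -> 147 -> 73 -> 71 -> 35 -> 33 -> 16 -> 14 -> 7
net = Net()
x = torch.zeros(1, 3, 300, 300)
with torch.no_grad():
    for conv in [net.conv1, net.conv2, net.conv3, net.conv4, net.conv5]:
        x = net.pool(F.relu(conv(x)))
print(x.shape)  # torch.Size([1, 64, 7, 7]) -> 64 * 7 * 7 = 3136, matches fc1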
This is the complete Jupyter notebook. Sorry, I could not create a minimal reproducible example.
Answer 0 (score: 0)
I think the problem is in the dataloaders: I noticed that you are not passing the samplers to the loaders here:
# define samplers for obtaining training and validation batches
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=16,
    num_workers=0,
    shuffle=True
)
test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=16,
    num_workers=0,
    shuffle=True
)
I have never used samplers myself, so I do not know exactly how to use them correctly, but I guess you want to do something like this:
# define samplers for obtaining training and validation batches
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    sampler=train_sampler,
    batch_size=16,
    num_workers=0
    # no shuffle here: the sampler already controls the sampling order
)
test_loader = torch.utils.data.DataLoader(
    test_dataset,
    sampler=valid_sampler,
    batch_size=16,
    num_workers=0
)
According to the documentation:
sampler (Sampler, optional) – defines the strategy to draw samples from the dataset. If specified, shuffle must be False.
So if you use a sampler, you should turn shuffle off, which is why it is omitted in the snippet above.
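For completeness, a sketch of the usual idiom for building the index lists that feed SubsetRandomSampler. The 80/20 split and the reuse of train_dataset for both loaders are assumptions for illustration, not taken from the question's notebook:

import torch
from torch.utils.data import DataLoader
from torch.utils.data.sampler import SubsetRandomSampler

# split the dataset indices into disjoint train/validation subsets
# (train_dataset is assumed to be the dataset object from the notebook)
num_train = len(train_dataset)
indices = torch.randperm(num_train).tolist()
split = int(0.2 * num_train)  # assumed 20% validation fraction
valid_idx, train_idx = indices[:split], indices[split:]

train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

# shuffle is left at its default (False); the samplers already randomize order
train_loader = DataLoader(train_dataset, sampler=train_sampler, batch_size=16)
valid_loader = DataLoader(train_dataset, sampler=valid_sampler, batch_size=16)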