I have a model:
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(128, 128, (3,3))
        self.conv2 = nn.Conv2d(128, 256, (3,3))
        self.conv3 = nn.Conv2d(256, 256, (3,3))

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        return x

model = MyModel()
I want to train the model so that, in every training step, DATA_X1 trains the ['conv1', 'conv2', 'conv3'] layers while DATA_X2 trains only the ['conv3'] layer.
I tried creating two optimizers:
import torch.optim as optim

# Full parameters train
all_params = model.parameters()
all_optimizer = optim.Adam(all_params, lr=0.01)
# Partial parameters train
partial_params = list(model.parameters())  # a list, so it can be iterated more than once
for p, (name, param) in zip(partial_params, model.named_parameters()):
    # parameter names look like 'conv3.weight', so match on the prefix
    if name.startswith('conv3'):
        p.requires_grad = True
    else:
        p.requires_grad = False
partial_optimizer = optim.Adam(partial_params, lr=0.01)
But this way, requires_grad = False affects both optimizers.
Is there any way to achieve this?
Answer 0 (score: 2)
Why not build this functionality into the model?
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(128, 128, (3,3))
        self.conv2 = nn.Conv2d(128, 256, (3,3))
        self.conv3 = nn.Conv2d(256, 256, (3,3))
        self.partial_grad = False  # a flag

    def forward(self, x):
        if self.partial_grad:
            with torch.no_grad():
                x = F.relu(self.conv1(x))
                x = F.relu(self.conv2(x))
        else:
            x = F.relu(self.conv1(x))
            x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        return x
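One way to sanity-check the flag is to run a backward pass with partial_grad = True and see which layers received gradients. The sketch below repeats the model so that it runs on its own; the random input of shape (1, 128, 8, 8) and the use of out.sum() as a stand-in loss are assumptions for illustration, not part of the original code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(128, 128, (3, 3))
        self.conv2 = nn.Conv2d(128, 256, (3, 3))
        self.conv3 = nn.Conv2d(256, 256, (3, 3))
        self.partial_grad = False  # when True, conv1/conv2 run under no_grad

    def forward(self, x):
        if self.partial_grad:
            with torch.no_grad():
                x = F.relu(self.conv1(x))
                x = F.relu(self.conv2(x))
        else:
            x = F.relu(self.conv1(x))
            x = F.relu(self.conv2(x))
        return F.relu(self.conv3(x))


model = MyModel()
model.partial_grad = True
x = torch.randn(1, 128, 8, 8)  # assumed input shape, for illustration only
model(x).sum().backward()      # stand-in loss

# Layers that ran under no_grad are cut out of the autograd graph,
# so their parameters never receive a gradient.
grads = {name: p.grad is not None for name, p in model.named_parameters()}
print(grads)
```

With the flag set, only the conv3 entries should come back True.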
Now you can use a single optimizer over all the parameters and switch model.partial_grad on and off depending on the training data:
optimizer.zero_grad()
model.partial_grad = False # prep for DATA_X1 training
x1, y1 = DATA_X1.item() # this is not really a code, but you get the point
out = model(x1)
loss = criterion(out, y1)
loss.backward()
optimizer.step()
# do a partial opt for DATA_X2
optimizer.zero_grad()
model.partial_grad = True # prep for DATA_X2 training
x2, y2 = DATA_X2.item() # this is not really a code, but you get the point
out = model(x2)
loss = criterion(out, y2)
loss.backward()
optimizer.step()
Using a single optimizer is preferable, because it keeps tracking the momentum and the parameter changes across both datasets.
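The two steps can also be sketched as a runnable check. Random tensors stand in for one batch of DATA_X1 and DATA_X2, nn.MSELoss stands in for criterion, and a hypothetical TinyModel with fewer channels keeps it fast; all of these are assumptions for illustration. The sketch passes set_to_none=True to zero_grad so that the frozen layers have no gradient at all and Adam skips them. As far as I can tell, if their gradients were zero tensors instead, Adam's running averages from earlier steps could still move conv1 and conv2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


class TinyModel(nn.Module):
    """Hypothetical small version of MyModel with the same partial_grad flag."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(4, 4, (3, 3))
        self.conv2 = nn.Conv2d(4, 8, (3, 3))
        self.conv3 = nn.Conv2d(8, 8, (3, 3))
        self.partial_grad = False

    def forward(self, x):
        if self.partial_grad:
            with torch.no_grad():
                x = F.relu(self.conv2(F.relu(self.conv1(x))))
        else:
            x = F.relu(self.conv2(F.relu(self.conv1(x))))
        return F.relu(self.conv3(x))


def train_step(model, optimizer, criterion, x, y, partial):
    """Run one optimizer step; partial=True trains conv3 only."""
    optimizer.zero_grad(set_to_none=True)  # frozen layers end up with grad=None
    model.partial_grad = partial
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = TinyModel()
optimizer = optim.Adam(model.parameters(), lr=0.01)
criterion = nn.MSELoss()  # assumed loss, for illustration

# Random stand-ins for one batch of DATA_X1 and DATA_X2.
x1, y1 = torch.randn(2, 4, 8, 8), torch.randn(2, 8, 2, 2)
x2, y2 = torch.randn(2, 4, 8, 8), torch.randn(2, 8, 2, 2)

train_step(model, optimizer, criterion, x1, y1, partial=False)  # all layers

conv1_before = model.conv1.weight.detach().clone()
conv3_before = model.conv3.weight.detach().clone()
train_step(model, optimizer, criterion, x2, y2, partial=True)   # conv3 only

conv1_frozen = torch.equal(conv1_before, model.conv1.weight)
conv3_moved = not torch.equal(conv3_before, model.conv3.weight)
print(conv1_frozen, conv3_moved)
```

If the flag works as intended, the check should report that conv1 stayed put during the DATA_X2 step while conv3 changed.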