In this GAN tutorial, if you scroll down to the training loop you can see that they combine the gradients as errD = errD_real + errD_fake, where errD_real = criterion(output, label), errD_fake = criterion(output, label), and criterion = nn.BCELoss(). I want to do the same thing, but I want to normalize the two gradients to the lower Euclidean norm of the two before they are applied in the update step. How can I do this?
I know I can access the gradient of each weight in netD individually, e.g. by printing netD.weight.grad, but is there a way to normalize them all at once to the lower Euclidean norm of the two?
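For example, after a backward pass I can inspect each parameter's gradient like this (a sketch; for a multi-layer netD I would iterate named_parameters()):

for name, p in netD.named_parameters():
    if p.grad is not None:
        # per-parameter Euclidean norm of the accumulated gradient
        print(name, p.grad.norm())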
Here is the part of the training loop I'm talking about:
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, data in enumerate(dataloader, 0):
        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        ## Train with all-real batch
        netD.zero_grad()
        # Format batch
        real_cpu = data[0].to(device)
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, device=device)
        # Forward pass real batch through D
        output = netD(real_cpu).view(-1)
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)
        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G
        fake = netG(noise)
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fake.detach()).view(-1)
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Add the gradients from the all-real and all-fake batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()
        ...
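Is something like the following the right idea? This is a sketch of what I have in mind, not the tutorial's code: skip the two backward() calls, get each loss's gradients separately with torch.autograd.grad, rescale both gradient sets to the smaller of the two global Euclidean norms, and write the sum into each parameter's .grad before optimizerD.step(). The eps guard is my own addition.

import torch

params = [p for p in netD.parameters() if p.requires_grad]

# errD_real and errD_fake come from separate forward passes through D,
# so their graphs are independent and can be differentiated separately
grads_real = torch.autograd.grad(errD_real, params)
grads_fake = torch.autograd.grad(errD_fake, params)

# global Euclidean norm of each gradient set, taken over all of D's parameters
norm_real = torch.sqrt(sum(g.pow(2).sum() for g in grads_real))
norm_fake = torch.sqrt(sum(g.pow(2).sum() for g in grads_fake))
target = torch.min(norm_real, norm_fake)

eps = 1e-12  # my addition: avoid dividing by a zero norm
for p, g_real, g_fake in zip(params, grads_real, grads_fake):
    # scale both gradients down to the lower norm, then sum them;
    # writing .grad directly, so no zero_grad()/accumulation is involved
    p.grad = g_real * (target / (norm_real + eps)) + g_fake * (target / (norm_fake + eps))

optimizerD.step()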