I'm having some trouble understanding part of the CS231n HW1 code for the SVM loss. The loss term is scores[j] - correct_class_score + 1.
When we take the gradient with respect to the correct class score, do we only update the column for the correct class? So the gradient is

dW[:, y[i]] = dW[:, y[i]] - X[i, :]*num_classes_greater_margin  # and not
dW[:, y[i]] = dW[:, y[i]] - X[i, :]

and similarly for the incorrect class scores. Why does he multiply by num_classes_greater_margin?
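For reference, the per-example loss the code below implements is the standard multiclass hinge loss with delta = 1 (here scores[j] corresponds to $w_j^\top x_i$), and its analytic gradients are:

$$L_i = \sum_{j \neq y_i} \max\!\left(0,\; w_j^\top x_i - w_{y_i}^\top x_i + 1\right)$$

$$\frac{\partial L_i}{\partial w_j} = \mathbf{1}\!\left[w_j^\top x_i - w_{y_i}^\top x_i + 1 > 0\right] x_i \quad (j \neq y_i), \qquad \frac{\partial L_i}{\partial w_{y_i}} = -\left(\sum_{j \neq y_i} \mathbf{1}\!\left[w_j^\top x_i - w_{y_i}^\top x_i + 1 > 0\right]\right) x_i$$

The sum of indicators in the correct-class gradient is exactly what num_classes_greater_margin counts: one -X[i, :] contribution per class that violates the margin.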
import numpy as np

def svm_loss_naive(W, X, y, reg):
    """
    Structured SVM loss function, naive implementation (with loops).

    Inputs have dimension D, there are C classes, and we operate on minibatches
    of N examples.

    Inputs:
    - W: A numpy array of shape (D, C) containing weights.
    - X: A numpy array of shape (N, D) containing a minibatch of data.
    - y: A numpy array of shape (N,) containing training labels; y[i] = c means
      that X[i] has label c, where 0 <= c < C.
    - reg: (float) regularization strength

    Returns a tuple of:
    - loss as a single float
    - gradient with respect to weights W; an array of the same shape as W
    """
    # Initialize the loss and the gradient of W to zero.
    dW = np.zeros(W.shape)
    loss = 0.0
    num_classes = W.shape[1]
    num_train = X.shape[0]

    # Compute the data loss and the gradient.
    for i in range(num_train):  # For each image in the training batch.
        scores = X[i].dot(W)
        correct_class_score = scores[y[i]]
        num_classes_greater_margin = 0

        for j in range(num_classes):  # For each class score of this image.
            # Skip the image's target class; no loss is computed for that case.
            if j == y[i]:
                continue
            # Calculate the margin, with delta = 1.
            margin = scores[j] - correct_class_score + 1
            # Only accumulate loss and gradient if the margin condition is violated.
            if margin > 0:
                num_classes_greater_margin += 1
                # Gradient for the incorrect class weights.
                dW[:, j] = dW[:, j] + X[i, :]
                loss += margin

        # Gradient for the correct class weights.
        dW[:, y[i]] = dW[:, y[i]] - X[i, :]*num_classes_greater_margin

    # Average the data loss across the batch.
    loss /= num_train
    # Add the regularization loss to the data loss.
    loss += reg * np.sum(W * W)
    # Average the gradient across the batch and add the gradient of the regularization term.
    dW = dW / num_train + 2*reg*W

    return loss, dW
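For what it's worth, a quick finite-difference gradient check (a minimal sketch with arbitrary toy shapes; numerical_gradient is a helper written here, not part of the assignment) shows that the analytic gradient above, including the num_classes_greater_margin factor on the correct-class column, matches the numerical one:

import numpy as np

def numerical_gradient(f, W, h=1e-5):
    # Central finite differences over every entry of W (slow but simple).
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old_value = W[idx]
        W[idx] = old_value + h
        f_plus = f(W)
        W[idx] = old_value - h
        f_minus = f(W)
        W[idx] = old_value          # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * h)
        it.iternext()
    return grad

np.random.seed(0)
D, C, N = 10, 4, 5                  # arbitrary toy sizes
W = np.random.randn(D, C) * 0.01
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)

loss, analytic_grad = svm_loss_naive(W, X, y, reg=0.1)
numeric_grad = numerical_gradient(lambda W_: svm_loss_naive(W_, X, y, reg=0.1)[0], W)

# A rough relative-error measure; it should be tiny (around 1e-7) when the
# analytic gradient is correct (kinks in the hinge can occasionally inflate it).
rel_error = np.max(np.abs(analytic_grad - numeric_grad)) / np.max(np.abs(analytic_grad) + np.abs(numeric_grad))
print(loss, rel_error)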