I am using the following regularized cost() and gradient() functions:
def cost(theta, x, y, lam):
    theta = theta.reshape(1, len(theta))
    predictions = sigmoid(np.dot(x, np.transpose(theta))).reshape(len(x), 1)
    # L2 penalty; the intercept theta_0 is excluded via np.delete
    regularization = (lam / (len(x) * 2)) * np.sum(np.square(np.delete(theta, 0, 1)))
    complete = -1 * np.dot(np.transpose(y), np.log(predictions)) \
               - np.dot(np.transpose(1 - y), np.log(1 - predictions))
    return np.sum(complete) / len(x) + regularization

def gradient(theta, x, y, lam):
    theta = theta.reshape(1, len(theta))
    predictions = sigmoid(np.dot(x, np.transpose(theta))).reshape(len(x), 1)
    # Zero out the intercept so it is not penalized
    theta_without_intercept = theta.copy()
    theta_without_intercept[0, 0] = 0
    assert(theta_without_intercept.shape == theta.shape)
    # Note: np.sum collapses the per-parameter penalty to a single scalar
    regularization = (lam / len(x)) * np.sum(theta_without_intercept)
    return np.sum(np.multiply((predictions - y), x), 0) / len(x) + regularization
Using these functions with scipy.optimize.fmin_bfgs() I get the following output (which is almost correct):
Starting loss value: 0.69314718056
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 0.208444
Iterations: 8
Function evaluations: 51
Gradient evaluations: 39
7.53668131651e-08
Trained loss value: 0.208443907192
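For context, a minimal driver along these lines would reproduce a log of that shape, assuming the question's cost() and gradient() are defined as above; the toy data, sigmoid helper, and lam value here are stand-ins for the question's actual setup, not the original script:

import numpy as np
from scipy.optimize import fmin_bfgs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical stand-in for the real data: m samples, n features,
# with a leading column of ones for the intercept term theta_0.
rng = np.random.RandomState(0)
m, n = 100, 3
x = np.hstack([np.ones((m, 1)), rng.randn(m, n - 1)])
y = (rng.rand(m, 1) > 0.5).astype(float)
lam = 1.0
theta0 = np.zeros(n)

print('Starting loss value:', cost(theta0, x, y, lam))
theta_opt = fmin_bfgs(cost, theta0, fprime=gradient, args=(x, y, lam))
print('Trained loss value:', cost(theta_opt, x, y, lam))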
The regularization formula is given below (it appeared as an image in the original post). If I comment the regularization terms out, the scipy.optimize.fmin_bfgs() call above works fine and correctly returns the local optimum. Any ideas why?
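The formula image did not survive; assuming it showed the standard L2-regularized logistic cost that the code implements, it would be:

J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m}\sum_{j=1}^{n} \theta_j^2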
Update:
Following the comments below, I updated the cost and gradient regularization (in the code above). But the warning still appears (new output above). The scipy check_grad function returns the following value: 7.53668131651e-08.
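For reference, a sketch of how that number would be obtained, continuing from the driver sketch above; check_grad compares the analytic gradient against a finite-difference estimate of the cost at a given point:

from scipy.optimize import check_grad

# Returns the 2-norm of the difference between gradient() and a
# finite-difference approximation of cost() at theta0; values near
# zero (here ~7.5e-08) mean the analytic gradient is consistent.
err = check_grad(cost, gradient, theta0, x, y, lam)
print(err)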
Update 2:
I am using the UCI Machine Learning Iris dataset, training a One-vs-All classification model for the first class, Iris-setosa.
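A sketch of that setup, using sklearn.datasets.load_iris as one convenient route to the UCI Iris data (the loading route is an assumption; the question does not say how the data is read):

import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
features = iris.data  # shape (150, 4)

# One-vs-All for the first class: label Iris-setosa (class 0) as 1,
# every other class as 0.
y = (iris.target == 0).astype(float).reshape(-1, 1)

# Prepend a column of ones for the intercept term theta_0.
x = np.hstack([np.ones((len(features), 1)), features])
theta0 = np.zeros(x.shape[1])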
Answer 0 (score: 2)
Since you are trying to do L2 regularization, you should change this value in your cost function:

regularization = (lam / len(x) * 2) * np.sum(np.square(np.delete(theta, 0, 1)))

to

regularization = (lam / (len(x) * 2)) * np.sum(np.square(np.delete(theta, 0, 1)))

Due to operator precedence, lam / len(x) * 2 evaluates as (lam / len(x)) * 2, multiplying by 2 where the formula calls for dividing by 2m.
Additionally, the regularization part of the gradient should be a vector with the same shape as the parameter vector theta. So I believe the correct value would be:

theta_without_intercept = theta.copy()
theta_without_intercept[0] = 0  # You are not penalizing the intercept in your cost function, i.e. theta_0
assert(theta_without_intercept.shape == theta.shape)
regularization = (lam / len(x)) * theta_without_intercept

Otherwise the gradient will not be correct. You can then verify the gradient with the scipy.optimize.check_grad() function.
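This matches the per-component derivative of the regularized cost above: each theta_j (for j >= 1) receives its own penalty term, which is why the regularization must be a vector rather than a summed scalar:

\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)} + \frac{\lambda}{m}\,\theta_j, \qquad j \ge 1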
Answer 1 (score: 0)
The problem was in my calculus: for some reason I was summing the regularization values, as in regularization = (lam / len(x)) * np.sum(theta_without_intercept). We do not need to np.sum the regularization values at this stage; doing so spreads an averaged regularization over every theta and corrupts the subsequent prediction loss. Anyway, thanks for your help.

Gradient method:

def gradient(theta, x, y, lam):
    theta_len = len(theta)
    theta = theta.reshape(1, theta_len)
    predictions = sigmoid(np.dot(x, np.transpose(theta))).reshape(len(x), 1)
    # Zero out the bias term so the intercept is not penalized
    theta_wo_bias = theta.copy()
    theta_wo_bias[0, 0] = 0
    assert (theta_wo_bias.shape == theta.shape)
    # Keep the penalty as a vector: one (lam / m) * theta_j entry per parameter
    regularization = np.squeeze(((lam / len(x)) *
                                 theta_wo_bias).reshape(theta_len, 1))
    return np.sum(np.multiply((predictions - y), x), 0) / len(x) + regularization

Output:

Starting loss value: 0.69314718056
Optimization terminated successfully.
Current function value: 0.201681
Iterations: 30
Function evaluations: 32
Gradient evaluations: 32
7.53668131651e-08
Trained loss value: 0.201680992316