Question

I am trying to implement a multiclass logistic regression classifier that distinguishes between k different classes.

This is my code.

import numpy as np
from scipy.special import expit


def cost(X,y,theta,regTerm):
    (m,n) = X.shape
    J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
    return J

def gradient(X,y,theta,regTerm):
    (m,n) = X.shape
    grad = np.dot(((expit(np.dot(X,theta))).reshape(m,1) - y).T,X)/m + (np.concatenate(([0],theta[1:].T),axis=0)).reshape(1,n)
    return np.asarray(grad)

def train(X,y,regTerm,learnRate,epsilon,k):
    (m,n) = X.shape
    theta = np.zeros((k,n))
    for i in range(0,k):
        previousCost = 0;
        currentCost = cost(X,y,theta[i,:],regTerm)
        while(np.abs(currentCost-previousCost) > epsilon):
            print(theta[i,:])
            theta[i,:] = theta[i,:] - learnRate*gradient(X,y,theta[i,:],regTerm)
            print(theta[i,:])
            previousCost = currentCost
            currentCost = cost(X,y,theta[i,:],regTerm)
    return theta

trX = np.load('trX.npy')
trY = np.load('trY.npy')
theta = train(trX,trY,2,0.1,0.1,4)

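I can verify that cost and gradient return values with the correct dimensions (cost returns a scalar, and gradient returns a 1-by-n row vector), but I get the error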

RuntimeWarning: divide by zero encountered in log
  J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])

Why is this happening, and how can I avoid it?

Answer 1

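You can clean up the formula by making proper use of broadcasting, the operator * for elementwise products of vectors, and the operator @ for matrix multiplication, and by breaking it up as suggested in the comments.

Here is what your cost function could look like: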

def cost(X, y, theta, regTerm):
    m = X.shape[0]  # number of training examples (same as y.shape[0], or p.shape[0] after the next line)
    p = expit(X @ theta)
    log_loss = -np.average(y*np.log(p) + (1-y)*np.log(1-p))
    J = log_loss + regTerm * np.linalg.norm(theta[1:]) / (2*m)
    return J

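You can clean up your gradient function along the same lines.

By the way, are you sure you want np.linalg.norm(theta[1:])? If you are trying to do L2 regularization, the term should be np.linalg.norm(theta[1:]) ** 2.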
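As an illustration of what "along the same lines" might mean for the gradient, here is a minimal sketch. It assumes `y` is a 1-D array of 0/1 labels with shape `(m,)`, `theta` is a 1-D array with shape `(n,)`, and the regularizer is the squared L2 term `regTerm * sum(theta[1:]**2) / (2*m)`; the name `gradient_l2` is made up for this example and is not from the question.

```python
import numpy as np
from scipy.special import expit


def gradient_l2(X, y, theta, regTerm):
    """Gradient of the average log loss plus regTerm * ||theta[1:]||^2 / (2*m)."""
    m = X.shape[0]               # number of training examples
    p = expit(X @ theta)         # predicted probabilities, shape (m,)
    grad = X.T @ (p - y) / m     # gradient of the average log loss, shape (n,)
    grad[1:] += regTerm * theta[1:] / m  # the bias term theta[0] is not regularized
    return grad
```

Note that the regularization part of the gradient is scaled by `regTerm / m`, which the gradient in the question leaves out.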

Answer 2

The correct solution here is to add a small epsilon to the argument of the log function. What worked for me was

epsilon = 1e-5    

def cost(X, y, theta):
    m = X.shape[0]
    yp = expit(X @ theta)
    cost = - np.average(y * np.log(yp + epsilon) + (1 - y) * np.log(1 - yp + epsilon))
    return cost

Answer 3

My guess is that your data contains negative values. You cannot take the log of a negative number.

import numpy as np
np.log(2)
> 0.69314718055994529
np.log(-2)
> nan

If that is the case, there are many different ways to transform your data that should help.

Answer 4

def cost(X, y, theta):
    m = X.shape[0]
    yp = expit(X @ theta)
    cost = - np.average(y * np.log(yp) + (1 - y) * np.log(1 - yp))
    return cost

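The warning comes from np.log(yp) when yp==0 and from np.log(1 - yp) when yp==1. One option is to filter out these values and not pass them into np.log. The other option is to add a small constant to prevent the value from being exactly 0 (as suggested in one of the comments above).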

Answer 5

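Reason:

This happens because in some cases, whenever y[i] equals 1, the value of the sigmoid function (written theta below, not to be confused with the parameter vector theta) also becomes exactly 1.

Cost function: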

J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])

Now consider the following part of the code snippet above:

np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1)))

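Here you are computing (1 - theta), where theta stands for the sigmoid output, in a case where that output is 1. The expression effectively becomes log(1 - 1) = log(0), which is undefined.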
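The saturation described here can be reproduced directly. This is a minimal sketch assuming double-precision floats; the input values in `z` are made up for illustration and are not taken from the question's data.

```python
import numpy as np
from scipy.special import expit

# Hypothetical values of np.dot(X, theta); the last one is large enough to saturate.
z = np.array([0.0, 10.0, 50.0])
p = expit(z)

# For a large enough input, the sigmoid rounds to exactly 1.0 in floating point.
print(p == 1.0)

# log(1 - p) is then log(0), which NumPy evaluates to -inf and warns about.
with np.errstate(divide="ignore"):  # silence the warning for this demonstration
    log_one_minus_p = np.log(1 - p)
print(log_one_minus_p)
```

The same thing happens in the other direction: a very negative input makes the sigmoid exactly 0, and then np.log of the sigmoid output itself is the term that fails.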

Answer 6

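Add an epsilon value [a tiny value] to the argument of the log so that it no longer causes a problem. I am not sure whether it will give accurate results, though.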
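If the accuracy of the epsilon trick is a concern, one alternative (a different technique from the one in this answer) is to compute the two log terms without ever forming the sigmoid, using the identities log(sigmoid(z)) = -log(1 + exp(-z)) and log(1 - sigmoid(z)) = -log(1 + exp(z)). The sketch below evaluates them with `np.logaddexp`; the name `cost_stable` is made up for this example, `y` is assumed to be a 1-D array of 0/1 labels, and regularization is left out.

```python
import numpy as np


def cost_stable(X, y, theta):
    """Average log loss computed from the logits, without calling log on a sigmoid."""
    z = X @ theta
    log_p = -np.logaddexp(0, -z)           # log(sigmoid(z))
    log_one_minus_p = -np.logaddexp(0, z)  # log(1 - sigmoid(z))
    return -np.average(y * log_p + (1 - y) * log_one_minus_p)
```

Because no epsilon is involved, this should agree with the plain formula wherever the plain formula is finite, and it stays finite for large positive or negative logits.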

Python: divide by zero encountered in log (logistic regression)
