Question

I am implementing logistic regression in Python using numpy. I generated the following dataset:

import numpy as np

# class 0:
# covariance matrix and mean
cov0 = np.array([[5,-4],[-4,4]])
mean0 = np.array([2.,3])
# number of data points
m0 = 1000

# class 1
# covariance matrix
cov1 = np.array([[5,-3],[-3,3]])
mean1 = np.array([1.,1])
# number of data points
m1 = 1000

# generate m gaussian distributed data points with
# mean and cov.
r0 = np.random.multivariate_normal(mean0, cov0, m0)
r1 = np.random.multivariate_normal(mean1, cov1, m1)

X = np.concatenate((r0,r1))

I then implemented the sigmoid function as follows:

def logistic_function(x):
    """ Applies the logistic function to x, element-wise. """
    return 1.0 / (1 + np.exp(-x))

def logistic_hypothesis(theta):
    return lambda x : logistic_function(np.dot(generateNewX(x), theta.T))

def generateNewX(x):
    x = np.insert(x, 0, 1, axis=1)
    return x

After fitting the logistic regression, I found that the best thetas are:

best_thetas = [-0.9673200946417307, -1.955812236119612, -5.060885703369424]

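However, when I apply the logistic function with these thetas, the output contains numbers that do not appear to be in the interval [0,1].

Example: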

data = logistic_hypothesis(np.asarray(best_thetas))(X)
print(data)

This produces the following output:

[2.67871968e-11 3.19858822e-09 3.77845881e-09 ... 5.61325410e-03
 2.19767618e-01 6.23288747e-01]

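Can someone help me understand what is wrong with my implementation? I don't understand why I am getting such large values. Shouldn't the sigmoid function only return results in the interval [0,1]?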

Answer 1

Yes, it does. The output is simply printed in scientific notation.

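This is 'e' exponent notation. When a number is printed in scientific notation, the letter "e" stands for "times ten to the power of", so 2.67871968e-11 means 2.67871968 × 10^-11, a very small positive number rather than a large one.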
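To make the notation concrete, here is a minimal sketch that takes two of the values printed in the question and shows them in fixed-point form. The variable names are only illustrative and do not come from the original code.

```python
import math

# Two values copied from the printed output in the question.
tiny = 2.67871968e-11
larger = 6.23288747e-01

# "e-11" means "times 10 to the power -11".
assert math.isclose(tiny, 2.67871968 * 10 ** -11)

# Fixed-point formatting shows the same numbers without the exponent.
tiny_fixed = f"{tiny:.19f}"
larger_fixed = f"{larger:.9f}"
print(tiny_fixed)    # 0.0000000000267871968
print(larger_fixed)  # 0.623288747
```

Both numbers are the same values as before; only the way they are displayed changes.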

>>> a = [2.67871968e-11, 3.19858822e-09, 3.77845881e-09, 5.61325410e-03]
>>> [0 <= i <= 1 for i in a]
[True, True, True, True]
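The same check can be run on a whole NumPy array at once, which may be more convenient for the 2000 predictions in the question. The sketch below uses a few made-up input values (the array `z` is an assumption standing in for `np.dot(generateNewX(X), theta.T)`), and it also shows one way to ask NumPy to print without scientific notation.

```python
import numpy as np

def logistic_function(x):
    """Applies the logistic function to x, element-wise."""
    return 1.0 / (1 + np.exp(-x))

# Hypothetical linear scores; large negative scores give tiny probabilities.
z = np.array([-24.3, -19.5, -5.2, -1.3, 0.5])
probs = logistic_function(z)

# Vectorized range check over every element.
all_in_range = bool(np.all((probs >= 0) & (probs <= 1)))
print(all_in_range)  # True

# suppress=True makes NumPy print small floats in fixed-point form
# inside this block, without changing the global print settings.
with np.printoptions(suppress=True, precision=6):
    print(probs)
```

With `suppress=True`, the very small probabilities are shown as rounded fixed-point numbers, which makes it easier to see at a glance that nothing falls outside [0,1].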
