I don't understand why my logistic regression plot has a vertical line?

Time: 2017-09-09 05:30:35

Tags: python matplotlib machine-learning visualization logistic-regression

%matplotlib notebook
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x_df=pd.DataFrame([0.5,0.75,1,1.25,1.5,1.75,1.75,2,2.25,2.5,2.75,3,3.25,3.5,4,4.25,4.5,4.75,5,5.5])
y_df=pd.DataFrame([0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,1,1,1,1,1])
print()

#adding the column one since there is an extra theta value
x_df['intercept']=1

#converting to matrix
X = np.matrix(x_df.values)
print(X)
#converting y to a matrix
y= np.matrix(y_df.values)
print(y)

#initialize theta
theta = np.matrix(np.array([0,0]))

def sigmoid(x):

    return 1/(1 + np.exp(-x))


def cost(x,y,theta):
    m = y.shape[0]
    h = sigmoid(x * theta.T)
    h1 = np.multiply(y,np.log(h))
    h2 = np.multiply(1- y,np.log(1-h))
    return -np.sum(h1+h2)/(1.0*m)

def gd(x,y,theta,alpha = 0.1,iter=10000):

    m = y.shape[0]

    for i in range(iter):
        h = sigmoid(x * theta.T)
        error = h-y
        update = np.dot(error.T,x)
        theta = theta - ( (alpha*update)/m )

    return theta,cost(x,y,theta),h

new_theta,new_cost,new_h=gd(X,y,theta)

print(np.ravel(new_h).T)

n=np.ravel(new_h).T

n=pd.DataFrame(n)
print(n)

plt.plot(x_df,y_df,'go',x_df,n,'bo')

I spent a lot of time trying to hand-code logistic regression in Python 3, and I believe the code is correct. After all that time, when I finally plotted the result, this is what I got!

[Image: weird logistic regression graph in blue circles]

Can someone help me make sense of the code? I'm having a hard time reading the plot of the hypothesis function versus X!

1 Answer:

Answer 0 (score: 0)

Change the plotting line to

plt.plot(x_df, y_df, 'go', X[:, 0], new_h, 'bo')

or

plt.plot(x_df, y_df, 'go', x_df.iloc[:, 0], n, 'bo')

Your problem is that x_df has two columns, one named 0 and the other named intercept (which is all 1s), and both of them are being plotted on the x-axis.
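
Below is a minimal sketch of what a cleaned-up plot could look like, assuming the X, y, new_theta and sigmoid names from the question's code; the helper names introduced here (x_vals, probs, grid) are just for illustration. It keeps only the feature column on the x-axis and also evaluates the fitted sigmoid on a dense grid so the S-shaped hypothesis curve is easy to see:

import numpy as np
import matplotlib.pyplot as plt

# keep only the feature column (column 0) for the x-axis; column 1 is the all-ones intercept
x_vals = np.ravel(X[:, 0])

# predicted probabilities at the training points, using the learned parameters
probs = np.ravel(sigmoid(X * new_theta.T))

# dense grid of x values so the fitted curve comes out smooth
grid = np.linspace(x_vals.min(), x_vals.max(), 200)
grid_X = np.matrix(np.column_stack([grid, np.ones_like(grid)]))
grid_probs = np.ravel(sigmoid(grid_X * new_theta.T))

plt.plot(x_vals, np.ravel(y), 'go', label='labels')            # observed 0/1 labels
plt.plot(x_vals, probs, 'bo', label='predicted probability')   # hypothesis at the data points
plt.plot(grid, grid_probs, 'r-', label='fitted sigmoid')       # smooth hypothesis curve
plt.xlabel('x')
plt.ylabel('probability')
plt.legend()
plt.show()

With only one x column per series, the vertical line of points at x = 1 (coming from the intercept column) disappears, and the hypothesis-versus-X relationship shows up as a single S-shaped curve.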