Python logistic regression / Hessian: getting a divide-by-zero error and a singular-matrix error

Date: 2019-10-26 01:25:54

Tags: python regression gradient logistic-regression hessian

Code:

def sigmoid(x):
    return 1.0/(1+np.exp(-x)) 

def cost(x,y,th):
    pro = sigmoid(np.dot(x,th))
    result = sum(-y * np.log(pro) - (1-y) * np.log(1-pro))   
    result = result/len(x) #len: number of feature rows
    return result

def gradient(x,y,th):
    xTrans = x.transpose()                                      
    sig = sigmoid(np.dot(x,th))                              
    grad = np.dot(xTrans, ( sig - y ))                          
    grad = grad / len(x) #len: number of feature rows  
    return grad
def hessian(x,y,th):
    xTrans = x.transpose()                                      
    sig = sigmoid(np.dot(x,th))                              
    result = (1.0/len(x) * np.dot(xTrans, x) * np.diag(sig) * np.diag(1 - sig) )   
    return result
def updateTh(x,y,th):
    hessianInv = np.linalg.inv(hessian(x,y,th))                         
    grad = gradient(x,y,th)                                  
    th = th - np.dot(hessianInv, grad)                     
    return th

m = 80 #number of x rows
x = np.ones([m,3])
y = np.empty([m,1], dtype = int)
th = np.zeros([3,1])
hessianResult = np.identity(3) #identity 3x3


with open('exam.csv','r') as csvfile:
            i = 0
            reader = csv.reader(csvfile)
            next(reader) #skip header            
            for line in reader:
                x[i][1] = line[0]
                x[i][2] = line[1]
                y[i][0] = line[2]
                i+=1

#m = x.shape[0] #number of x rows
for i in range(10):
    costResult = cost(x,y,th)
    hessianResult = hessian(x,y,th)
    grad = gradient(x,y,th)
    th = updateTh(x,y,th)  

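If I run the loop more than 28 times, I get a divide-by-zero in the "sum" part of the cost function, and I also get an error saying the matrix cannot be inverted because it is singular. I am following the exact algorithm my professor gave, so I am not sure what is wrong. The dataset is a list of 80 student entries, each with two exam scores and a binary 1 or 0 indicating whether that student was admitted to college.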
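For context, the divide-by-zero in the cost typically appears once the sigmoid output reaches exactly 0.0 or 1.0 in floating point, so that np.log receives 0. The sketch below shows that saturation and a clipped variant of the cost. The name safe_cost and the eps value are illustrative assumptions, not part of the professor's algorithm, and clipping only hides the symptom; it does not explain why the probabilities saturate in the first place.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def safe_cost(x, y, th, eps=1e-12):
    """Cross-entropy cost with probabilities clipped away from 0 and 1.

    eps is an illustrative choice (assumption), not from the original code.
    """
    pro = np.clip(sigmoid(np.dot(x, th)), eps, 1.0 - eps)
    return float(np.mean(-y * np.log(pro) - (1 - y) * np.log(1 - pro)))

# A large positive score saturates the sigmoid to exactly 1.0 in float64,
# so the unclipped cost would evaluate log(1 - 1.0) = log(0).
saturated = float(sigmoid(np.array([40.0]))[0])

x = np.array([[1.0, 40.0]])    # one made-up row: intercept and one feature
y = np.array([[0.0]])          # true label 0, predicted probability ~1
th = np.array([[0.0], [1.0]])
clipped_cost = safe_cost(x, y, th)  # large but finite
```

If the cost only stays finite because of the clipping, that is a sign the parameter updates themselves are diverging and deserve a closer look.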

2 answers:

Answer 0 (score: 0)

First, you can remove "i = 0" and "i += 1" from the lines that read the file. Just replace "for line in reader:" with "for i, line in enumerate(reader):".

enumerate makes i start at 0 and increase by one for each line.
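A minimal sketch of that change, using an in-memory stand-in for exam.csv so it runs on its own. The three data rows and the explicit float/int conversions are illustrative assumptions; the column layout (two scores, then the admit flag) follows the question.

```python
import csv
import io

import numpy as np

# Stand-in for exam.csv: a header row, then score 1, score 2, admitted.
sample = io.StringIO("Grade 1,Grade 2,Admit\n83,95,1\n87,93,1\n94,88,0\n")

m = 3                            # number of data rows in the stand-in file
x = np.ones([m, 3])              # column 0 stays 1.0 as the intercept term
y = np.empty([m, 1], dtype=int)

reader = csv.reader(sample)
next(reader)                     # skip header
for i, line in enumerate(reader):   # i counts 0, 1, 2, ... automatically
    x[i][1] = float(line[0])
    x[i][2] = float(line[1])
    y[i][0] = int(line[2])
```

With a real file, the same loop would sit inside the "with open(...)" block from the question.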

However, I can't spot any error in your code, so my guess is that there is some problem with how the variables are initialized.

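In particular: you ask for x as an array of shape (m, 3), but you only fill two of its columns from the file. You also start with th as a matrix of zeros. The first thing you then do is compute the cost, which means first computing np.dot(x, th). I'm afraid that result will be independent of the data your professor gave you, because th is all zeros. This could also explain why you always get the error at the same iteration.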
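The starting state described here can be inspected directly. The sketch below uses four illustrative rows as a stand-in for the 80 entries (the rows are an assumption; the names m, x, y, th follow the question). It shows that column 0 of x keeps the value 1.0 from np.ones, that np.dot(x, th) is all zeros when th is zero, and that the first cost therefore equals ln 2 whatever the scores are. The gradient at that point still depends on x and y.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m = 4                                   # small stand-in for the 80 rows
x = np.ones([m, 3])                     # column 0 stays 1.0 (intercept)
x[:, 1:] = [[83, 95], [87, 93], [94, 88], [81, 97]]
y = np.array([[1], [1], [0], [0]])
th = np.zeros([3, 1])

scores = np.dot(x, th)                  # all zeros, whatever x holds
pro = sigmoid(scores)                   # all 0.5
first_cost = float(np.mean(-y * np.log(pro) - (1 - y) * np.log(1 - pro)))

# The gradient uses x and y, so the data enters from the first update on.
first_grad = np.dot(x.T, pro - y) / m
```

Printing first_cost and first_grad before the loop is a quick way to confirm that the data was loaded and that only the very first cost value is data-independent.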

Answer 1 (score: 0)

Using this as sample data for the .csv (80 entries, not counting the header):

Grade 1,Grade 2,Admit
83,95,1
87,93,1
92,91,1
94,88,0
81,97,0
88.3,92.5,1
88.6,92.4,0
88.9,92.3,0
89.2,92.2,0
89.5,92.1,0
89.8,92,1
90.1,91.9,1
90.4,91.8,1
90.7,91.7,1
91,91.6,0
91.3,91.5,0
91.6,91.4,1
91.9,91.3,0
92.2,91.2,0
92.5,91.1,0
92.8,91,0
93.1,90.9,1
93.4,90.8,1
93.7,90.7,1
94,90.6,0
94.3,90.5,0
94.6,90.4,1
94.9,90.3,0
95.2,90.2,0
95.5,90.1,0
95.8,90,0
96.1,89.9,1
96.4,89.8,1
96.7,89.7,0
97,89.6,0
97.3,89.5,0
97.6,89.4,1
97.9,89.3,1
98.2,89.2,0
98.5,89.1,0
98.8,89,0
99.1,88.9,1
99.4,88.8,1
99.7,88.7,0
100,88.6,0
100.3,88.5,1
100.6,88.4,1
100.9,88.3,0
101.2,88.2,0
101.5,88.1,0
101.8,88,1
102.1,87.9,1
102.4,87.8,0
102.7,87.7,0
103,87.6,0
103.3,87.5,1
103.6,87.4,1
103.9,87.3,0
104.2,87.2,0
104.5,87.1,0
104.8,87,1
105.1,86.9,1
105.4,86.8,0
105.7,86.7,0
106,86.6,1
106.3,86.5,1
106.6,86.4,0
106.9,86.3,0
107.2,86.2,0
107.5,86.1,1
107.8,86,1
108.1,85.9,0
108.4,85.8,0
108.7,85.7,0
109,85.6,1
109.3,85.5,1
109.6,85.4,0
109.9,85.3,0
110.2,85.2,0
110.5,85.1,1

And editing the script only to add the required imports and print some output:

import numpy as np
import csv

def sigmoid(x):
    return 1.0/(1+np.exp(-x)) 

def cost(x,y,th):
    pro = sigmoid(np.dot(x,th))
    result = sum(-y * np.log(pro) - (1-y) * np.log(1-pro))   
    result = result/len(x) #len: number of feature rows
    return result

def gradient(x,y,th):
    xTrans = x.transpose()                                      
    sig = sigmoid(np.dot(x,th))                              
    grad = np.dot(xTrans, ( sig - y ))                          
    grad = grad / len(x) #len: number of feature rows  
    return grad
def hessian(x,y,th):
    xTrans = x.transpose()                                      
    sig = sigmoid(np.dot(x,th))                              
    result = (1.0/len(x) * np.dot(xTrans, x) * np.diag(sig) * np.diag(1 - sig) )   
    return result
def updateTh(x,y,th):
    hessianInv = np.linalg.inv(hessian(x,y,th))                         
    grad = gradient(x,y,th)                                  
    th = th - np.dot(hessianInv, grad)                     
    return th

m = 80 #number of x rows
x = np.ones([m,3])
y = np.empty([m,1], dtype = int)
th = np.zeros([3,1])
hessianResult = np.identity(3) #identity 3x3


with open('exam.csv','r') as csvfile:
            i = 0
            reader = csv.reader(csvfile)
            next(reader) #skip header            
            for line in reader:
                x[i][1] = line[0]
                x[i][2] = line[1]
                y[i][0] = line[2]
                i+=1

#m = x.shape[0] #number of x rows
for i in range(40):
    print("Entry #",i,": ",x[i][1],", ",x[i][2],", ",y[i][0], sep = '')
    costResult = cost(x,y,th)
    print(costResult)
    hessianResult = hessian(x,y,th)
    print(hessianResult)
    grad = gradient(x,y,th)
    print(grad)
    th = updateTh(x,y,th)
    print(th)

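If you copy my csv data and use it as the test input, the script should run from the command line (Windows) and print its output without problems. So the only difference I can see between what I did and what you did, and therefore the possible cause of the error, is the structure of the grade data file itself.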
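Separately from the data file, one detail in hessian may be worth checking: sig has shape (m, 1), and np.diag applied to a 2-D array extracts its diagonal instead of building a diagonal matrix, so np.diag(sig) holds a single element. The sketch below compares the expression as written with the usual Newton's-method form, H = X^T diag(s(1 - s)) X / m. That formula, the function names, and the small made-up x and th are assumptions for illustration; whether this accounts for the reported errors on the original data is not verified here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hessian_as_written(x, th):
    # Same expression as in the script: elementwise products with np.diag(sig).
    sig = sigmoid(np.dot(x, th))
    return 1.0 / len(x) * np.dot(x.T, x) * np.diag(sig) * np.diag(1 - sig)

def hessian_textbook(x, th):
    # Assumed standard form: H = X^T diag(s * (1 - s)) X / m
    sig = sigmoid(np.dot(x, th)).ravel()
    weights = sig * (1 - sig)            # one weight per row of x
    return np.dot(x.T * weights, x) / len(x)

# Made-up inputs, only to exercise the shapes.
x = np.array([[1.0, 0.5, -1.0],
              [1.0, -0.3, 0.8],
              [1.0, 1.2, 0.1],
              [1.0, -0.9, -0.4]])
th = np.array([[0.1], [0.5], [-0.2]])

sig = sigmoid(np.dot(x, th))
diag_shape = np.diag(sig).shape          # (1,): only sig[0, 0] is picked up
h_written = hessian_as_written(x, th)
h_textbook = hessian_textbook(x, th)
```

Under these assumptions, h_written is just X^T X / m scaled by the first row's s(1 - s), so it collapses toward the zero matrix whenever that single probability saturates, while h_textbook weights every row.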