Logistic回归的目标函数是
,渐变是
其中w
是一个scipy的csr稀疏矩阵,其中d-by为1。
我的问题是,当我有一个scipy的csr稀疏矩阵和一个numpy数组时,分别是X_train
和y_train
。 (X_train
的每一行都是x_i
,y_train
的每个元素都是y_i
)
有没有更好的方法来计算梯度而不使用manully for loop?
有关详细信息,请执行large scale logistic regression。因此,表现很重要。
感谢。
更新5/19(添加我当前的代码)
感谢@ Jaime的提醒,这是我的代码。我基本上想看看是否有更好的方法来实现gradient(X, y, w)
。
import numpy as np
import scipy as sp
from sklearn import datasets
from numpy.linalg import norm
from scipy import sparse
eta = 0.01
xi = 0.1
C = 1
X_train, y_train = datasets.load_svmlight_file('lr/datasets/a9a')
X_test, y_test = datasets.load_svmlight_file('lr/datasets/a9a.t', n_features=X_train.shape[1])
def gradient(X, y, w):
# w should be a col vector
summation = w
for i in range(X.shape[0]):
exp_i = np.exp( y[i] * X.getrow(i).dot(w)[0, 0] )
summation = summation - (y[i] / (1 + exp_i)) * X.getrow(i).T
return summation
def hes_mul(X, D, s):
# w and s should be a col vector
# should return a col vector
return s + C * X.T.dot( D.dot( X.dot(s) ) )
def cg(X, y, w):
# gradF is col vector, so all of these are col vectors
gradF = gradient(X, y, w)
s = sparse.csr_matrix( np.zeros(X_train.shape[1]) ).T
r = -1 * gradF
d = r
D = []
for i in range(X.shape[0]):
exp_i = np.exp( (-1) * y[i] * w.T.dot(X.getrow(i).T)[0, 0] )
D.append(exp_i / ((1 + exp_i) ** 2))
D = sparse.diags(D, 0)
while True:
r_norm = np.sqrt((r.data ** 2).sum())
print r_norm
print np.sqrt((gradF.data ** 2).sum())
if r_norm <= xi * np.sqrt((gradF.data ** 2).sum()):
return s
hes_mul_d = hes_mul(X, D, d)
alpha = (r_norm ** 2) / d.T.dot( hes_mul_d )[0, 0]
s = s + alpha * d
r = r - alpha * hes_mul_d
beta = (r.data ** 2).sum() / (r_norm ** 2)
d = r + beta * d
w = sparse.csr_matrix( np.zeros(X_train.shape[1]) ).T
s = cg(X_train, y_train, w)