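I am very new to machine learning and am trying to teach myself with a Coursera course. I am applying logistic regression to the "winequality-white" data set. A sample of the data is shown below: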
7 0.27 0.36 20.7 0.045 45 170 1.001 3 0.45 8.8 6
6.3 0.3 0.34 1.6 0.049 14 132 0.994 3.3 0.49 9.5 6
8.1 0.28 0.4 6.9 0.05 30 97 0.9951 3.26 0.44 10 6
7.2 0.23 0.32 8.5 0.058 47 186 0.9956 3.19 0.4 9.9 6
7.2 0.23 0.32 8.5 0.058 47 186 0.9956 3.19 0.4 9.9 6
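The last column, which is 6 in every row shown here, is the wine quality and takes values from 3 to 9 across the full data set. The remaining columns hold the measurements that determine the quality.
Following the course, I wrote a cost function that returns J and the gradient: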
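To make the column layout concrete, here is a minimal sketch of how the features and the label can be separated. It assumes the data has already been loaded into a numeric matrix; the variable name `data` is only illustrative, and the three rows are copied from the sample above.

```octave
% Sketch only: 'data' is a hypothetical name holding the first three sample rows.
data = [7   0.27 0.36 20.7 0.045 45 170 1.001  3    0.45  8.8 6;
        6.3 0.3  0.34  1.6 0.049 14 132 0.994  3.3  0.49  9.5 6;
        8.1 0.28 0.4   6.9 0.05  30  97 0.9951 3.26 0.44 10   6];

X = data(:, 1:end-1);   % every column except the last: the 11 features
y = data(:, end);       % last column: the quality label

disp(size(X));          % rows x feature columns
disp(y');               % labels as a row for easy reading
```

With this layout, `X` and `y` are the arguments passed to the training function further down.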
function [J, grad] = lrCostFunction(theta, X, y, lambda)
  m = length(y);   % number of training examples
  J = 0;
  grad = zeros(size(theta));
  h = sigmoid(X * theta);   % hypothesis for logistic regression
  % Regularized cost; the bias term theta(1) is not penalized
  J = (1/m) * sum((-y .* log(h)) - ((1-y) .* log(1-h))) + (lambda/(2*m)) * sum(theta(2:end).^2);
  % Gradient for the bias term (no regularization)
  grad(1) = (1/m) * X(:,1)' * (h - y);
  % Gradient for the remaining parameters (with regularization)
  grad(2:end) = (1/m) * (X(:,2:end)' * (h - y) + lambda * theta(2:end));
  grad = grad(:);
end
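For reference, this is the regularized cost and gradient that the function above is meant to compute, as I understand them from the course. Here $h_\theta(x) = \mathrm{sigmoid}(\theta^T x)$, and $\theta_0$ corresponds to `theta(1)` in the Octave code, which is why it is left out of the penalty.

```latex
J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\Big[-y^{(i)}\log h_\theta(x^{(i)}) - (1-y^{(i)})\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

\frac{\partial J}{\partial\theta_0} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)})-y^{(i)}\big)\,x_0^{(i)}

\frac{\partial J}{\partial\theta_j} = \frac{1}{m}\Big(\sum_{i=1}^{m}\big(h_\theta(x^{(i)})-y^{(i)}\big)\,x_j^{(i)} + \lambda\,\theta_j\Big), \quad j \ge 1
```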
I call this cost function from another function to find the theta values:
function [all_theta] = oneVsAll(X, y, num_labels, lambda)
  % Some useful variables
  m = size(X, 1);
  n = size(X, 2);
  % You need to return the following variables correctly
  all_theta = zeros(num_labels, n + 1);
  % Add ones to the X data matrix
  X = [ones(m, 1) X];
  for c = 1:num_labels
    % Set initial theta
    initial_theta = zeros(n + 1, 1);
    % Set options for fmincg
    options = optimset('GradObj', 'on', 'MaxIter', 1000);
    % Quality labels start at 3, so classifier c is trained on label c + 2
    adc = c + 2;
    % Run fmincg to obtain the optimal theta for this label
    [theta] = ...
      fmincg(@(t)(lrCostFunction(t, X, (y == adc), lambda)), ...
             initial_theta, options);
    all_theta(c,:) = theta';
  end
end
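Using all_theta, I then predict the values (I did use training and validation sets), but the predictions are badly wrong. Can someone help me see where I am making a mistake?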
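This is my first ML model, and the funny part is that it gives an accuracy of 2% :) Please help me move in the right direction. Thanks in advance.
Note: I am using Octave.
I also tried lambda values of 0.1, 0.01, 0.001 and so on, with no luck :(
Adding the prediction code: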
function p = predictOneVsAll(all_theta, X)
  m = size(X, 1);
  num_labels = size(all_theta, 1);
  % You need to return the following variables correctly
  p = zeros(size(X, 1), 1);
  % Add ones to the X data matrix
  X = [ones(m, 1) X];
  % Score every example against every one-vs-all classifier
  z = X * all_theta';
  A = sigmoid(z);
  % p is the column index (1..num_labels) of the highest score in each row
  [t, p] = max(A, [], 2);
end
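One thing I have not ruled out is a label offset. Row c of all_theta is trained against quality c + 2, but the prediction function returns the row index (1 to 7), not the quality (3 to 9). If accuracy is computed by comparing p directly with y, that alone could push it close to zero. Below is a small sketch of the check; the scores in `A` and the labels in `y` are made up purely for illustration and are not results from my model.

```octave
% Hypothetical sanity check, not part of the course code.
% A holds made-up sigmoid scores: one row per sample, one column per classifier.
A = [0.1 0.2 0.1 0.9 0.3 0.1 0.1;
     0.1 0.1 0.8 0.2 0.1 0.1 0.1];
y = [6; 5];                         % made-up true quality labels

[~, idx] = max(A, [], 2);           % class index in 1..7, as in predictOneVsAll
p = idx + 2;                        % undo the c + 2 offset used during training

acc_raw    = mean(idx == y) * 100;  % compares indices against quality labels
acc_mapped = mean(p == y) * 100;    % compares quality labels against quality labels
printf('raw: %.1f%%, mapped: %.1f%%\n', acc_raw, acc_mapped);
```

If this is the cause, adding 2 to the output of predictOneVsAll before comparing with y should be enough. It is separate from whether the features need scaling.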