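I am a beginner in machine learning and Python. I am trying to apply an SVM to my data and compare MATLAB with Python. I expected similar results, but they differ. I searched for similar questions but either there are none or I missed them.
I have a training set and a test set for a binary classification problem. First, I implemented it in MATLAB using the built-in functions svmtrain and svmclassify.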
load('sample.mat')
y_test = y_test';
y_train = y_train';
%% Training and Classification
svmStruct = svmtrain(X_train, y_train, 'boxconstraint', 0.1);
species = svmclassify(svmStruct, X_test);
%% Compute Accuracy
count = 0;
for i=1:length(X_test)
if species(i)==y_test(i)
count = count+1;
end
end
count/length(X_test)*100
%% Plotting
scatter(time, X_test, [], species)
hold on
y_test(y_test==1) = 0.03;
y_test(y_test==0) = -0.03;
plot(time, y_test, 'g')
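Here is the result on the test set (points classified as class A are red, points classified as class B are blue, and the green line is the actual label for class A).
Second, I did the same thing in Python using the scikit-learn package.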
import numpy as np
import matplotlib.pyplot as plt
import scipy.io as sio
from sklearn import svm
""" Load Data"""
LoadData = sio.loadmat('sample.mat')
X_test = LoadData['X_test'].reshape(-1, 1)
X_train = LoadData['X_train'].reshape(-1, 1)
y_test = LoadData['y_test'].T
y_train = LoadData['y_train'].T
time = LoadData['time']
""" Train Classifier """
clf = svm.LinearSVC(C=1.0)
clf.fit(X_train,y_train)
predict_X_test = clf.predict(X_test) # Predict test set
acc = clf.score(X_test, y_test) # Accuracy
""" Plotting """
plt.figure(1)
plt.figure(figsize=(15,8))
plt.scatter(time, X_test, c = predict_X_test)
y_test[y_test==1] = 0.03
y_test[y_test==0] = -0.03
plt.plot(time, y_test)
""" Plot Threshold """
w = clf.coef_
b = clf.intercept_
xx = np.linspace(300,610)
yy = -b/w*np.ones(xx.shape)
yy = yy.T
plt.plot(xx, yy, 'k-')
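Here is the Python result. The black line is its threshold. The red line is the actual label for class A.
With Python, the threshold appears to be too high, and adjusting the C parameter makes no difference at all.
I cannot figure out what is wrong with my code. Any suggestions are appreciated.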
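One way to narrow this down is to compute the 1-D threshold directly from the fitted model and compare `LinearSVC` with `SVC(kernel="linear")` on the same standardized data. The sketch below runs on synthetic data, not on the contents of `sample.mat`, and the helper name `threshold_1d` is hypothetical. It rests on documented differences worth checking: `LinearSVC` defaults to the squared hinge loss and regularizes the intercept (see `intercept_scaling`), while `SVC` uses the standard hinge loss with an unpenalized intercept. Note also that the MATLAB call above uses a box constraint of 0.1 while the Python call uses `C=1.0`, and that `svmtrain` standardizes its inputs by default (`'autoscale'`), as far as the MATLAB documentation describes it. Whether any of these explains the gap on the real data is an assumption to verify.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC


def threshold_1d(clf, scaler):
    """Return x* in original units where the 1-D linear decision function is zero."""
    w = clf.coef_.ravel()[0]
    b = clf.intercept_.ravel()[0]
    t = -b / w  # threshold in standardized units
    return scaler.mean_[0] + scaler.scale_[0] * t


# Synthetic stand-in for X_train / y_train (assumption, not sample.mat).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(3.0, 0.5, 200), rng.normal(7.0, 0.5, 200)]).reshape(-1, 1)
y = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])  # 1-D labels

# Standardize explicitly so both classifiers see the same scaled feature.
scaler = StandardScaler().fit(X)
Xs = scaler.transform(X)

thresholds = {}
for C in [0.1, 1.0, 10.0]:
    models = {
        "LinearSVC": LinearSVC(C=C, max_iter=10000),
        "SVC(linear)": SVC(kernel="linear", C=C),
    }
    for name, clf in models.items():
        clf.fit(Xs, y)
        thresholds[(name, C)] = threshold_1d(clf, scaler)
        print(f"C={C:<5} {name:<12} threshold={thresholds[(name, C)]:.3f}")
```

If the two classifiers disagree noticeably on the real data, the loss function, the intercept penalty, or the feature scaling would be the first things to examine.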
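Edit
The blue curve is from Python and the red curve is from MATLAB. The x-axis is the C parameter, taking the values [0.001 0.01 0.1 1 10 100] (plotted as log(x)), and the y-axis is accuracy.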
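A sweep of this kind can be sketched as follows. This is a minimal illustration on synthetic, overlapping 1-D classes (an assumption, not the data behind the curves above), using `SVC(kernel="linear")` as one possible classifier. It records test accuracy in percent for each listed value of C and prints it against log10(C).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

C_values = [0.001, 0.01, 0.1, 1, 10, 100]  # the values listed in the edit

# Synthetic, overlapping 1-D classes (assumption).
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(-1.0, 1.0, 300), rng.normal(1.0, 1.0, 300)]).reshape(-1, 1)
y = np.concatenate([np.zeros(300, dtype=int), np.ones(300, dtype=int)])
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

accuracies = []
for C in C_values:
    clf = SVC(kernel="linear", C=C).fit(X_tr, y_tr)
    accuracies.append(clf.score(X_te, y_te) * 100)  # percent, as in the MATLAB snippet

for C, acc in zip(C_values, accuracies):
    print(f"log10(C)={np.log10(C):+.0f}  accuracy={acc:.1f}%")
```

Running the same loop with `LinearSVC` in place of `SVC` gives a second curve to compare against, which may help separate the effect of C from the effect of the solver.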