Error when using a non-linear SVM in scikit-learn

Time: 2016-02-05 10:12:29

Tags: python machine-learning scikit-learn

I have some code that tries to use a non-linear SVM (RBF kernel).

However, when I try to fit the model as follows, I get an error:

import numpy as np
from sklearn import svm

# Load the features and the labels (np.loadtxt accepts a file path directly)
dataset1 = np.loadtxt("/Users/prateek/Desktop/Programs/ML/Dataset.csv", delimiter=",")
result1 = np.loadtxt("/Users/prateek/Desktop/Programs/ML/Result.csv", delimiter=",")

# NuSVC with an RBF kernel and the default nu (0.5)
clf = svm.NuSVC(kernel='rbf')
clf.fit(dataset1, result1)

Link for Results.csv

Link for dataset

What is causing this error?

1 answer:

Answer 0 (score: 1)

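As the documentation points out, the nu parameter is "an upper bound on the fraction of training errors and a lower bound of the fraction of support vectors".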

So whenever you try to fit your data and this bound cannot be satisfied, the optimization problem becomes infeasible. Hence your error.
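To make the feasibility condition more concrete, here is a minimal sketch of the kind of check that can reject a nu value before any optimization runs. It assumes (from my reading of libsvm, which scikit-learn wraps) that for every pair of classes with n1 and n2 samples, nu is rejected when nu * (n1 + n2) / 2 > min(n1, n2). The function name `max_feasible_nu` and the label vectors are hypothetical, not part of scikit-learn or of the asker's data.

```python
from collections import Counter
from itertools import combinations


def max_feasible_nu(labels):
    """Return the largest nu that passes the pairwise class-count check.

    Assumption: for every pair of classes with n1 and n2 samples, nu is
    rejected when nu * (n1 + n2) / 2 > min(n1, n2).
    """
    counts = list(Counter(labels).values())
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    # nu itself is capped at 1, so start from that bound.
    pair_bounds = [2.0 * min(n1, n2) / (n1 + n2)
                   for n1, n2 in combinations(counts, 2)]
    return min([1.0] + pair_bounds)


# Hypothetical label vectors, not the asker's data.
balanced = [0] * 50 + [1] * 50
skewed = [0] * 95 + [1] * 5
print(max_feasible_nu(balanced))  # 1.0
print(max_feasible_nu(skewed))    # 0.1
```

If this assumption holds, calling `max_feasible_nu(result1.tolist())` on the labels would show how imbalanced classes shrink the range of usable nu values.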

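In fact, I looped nu from 1.0 down to 0.1 (in steps of 0.1) and still got the error; then I tried 0.01 and no complaint was raised. Of course, you should still inspect the model fitted with that value and check whether its prediction accuracy is acceptable.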
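The search described above can be sketched roughly as follows. The helper `first_feasible_nu` and its `fit_with_nu` argument are hypothetical names; the sketch assumes that fitting with a rejected nu raises `ValueError`, which is how I understand scikit-learn reports an infeasible nu.

```python
def first_feasible_nu(fit_with_nu, candidates):
    """Return the first nu in candidates for which fit_with_nu(nu) succeeds.

    fit_with_nu is a hypothetical callable that fits a model with the
    given nu and raises ValueError when that value is rejected.
    """
    for nu in candidates:
        try:
            fit_with_nu(nu)
        except ValueError:
            continue  # this nu was rejected, try the next candidate
        return nu
    return None  # no candidate could be fitted


# Candidate grid mirroring the answer: 1.0 down to 0.1, then 0.01.
candidates = [round(1.0 - 0.1 * k, 1) for k in range(10)] + [0.01]
```

With the question's data this could be called as `first_feasible_nu(lambda nu: svm.NuSVC(kernel='rbf', nu=nu).fit(dataset1, result1), candidates)`. A value that fits without error is only a starting point; its accuracy still needs to be checked.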

  

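Update: I got curious and split your dataset into training and validation sets; the result was 69% accuracy (I think your training set may be rather small).

For reproducibility, here is the quick test I ran: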

from sklearn import svm
import numpy as np
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the features and the labels (np.loadtxt accepts a file path directly)
dataset1 = np.loadtxt("Dataset.csv", delimiter=",")
result1 = np.loadtxt("Result.csv", delimiter=",")

# nu=0.01 was the first value I tried that did not raise the error
clf = svm.NuSVC(kernel='rbf', nu=0.01)
# Hold out 25% of the samples for validation
X_train, X_test, y_train, y_test = train_test_split(dataset1, result1, test_size=0.25, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
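
One rough way to judge whether an accuracy figure like this is acceptable is to compare it with the accuracy of always predicting the most frequent training label. The sketch below is only an illustration: the function name `majority_baseline_accuracy` and the label lists are hypothetical, not the asker's data.

```python
from collections import Counter


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy obtained by always predicting the most common training label."""
    most_common_label, _ = Counter(y_train).most_common(1)[0]
    hits = sum(1 for label in y_test if label == most_common_label)
    return hits / len(y_test)


# Hypothetical labels, not the asker's data.
y_train = [0, 0, 0, 1, 1]
y_test = [0, 1, 0, 0]
print(majority_baseline_accuracy(y_train, y_test))  # 0.75
```

If the classifier does not clearly beat this baseline on the held-out split, the fitted nu value is probably not giving a useful model.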