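I am trying to build an SVM classifier with scikit-learn on a dataset that has 69 columns, where the last column is the output. Each of the other 68 columns holds a float that is a Euclidean distance, and the output is an integer. Here is the code I have written: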
import numpy as np
import os
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from numpy import float64

os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin/'

training = np.genfromtxt("Data/ck-train.csv", delimiter=",", skip_header=1)
model_svc = SVC(kernel='linear', probability=True, tol=1e-3)
score_list = []

full_size = len(training)
tr_size = int(round(full_size * 0.8, 0))
ts_size = int(round(full_size * 0.2))
print("Training Sample Size: %s samples " % tr_size)
print("Testing Sample Size: %s samples " % ts_size)
print("Total: ", full_size)

training_input = np.zeros((full_size, 68))
training_output = np.zeros((full_size))

def create_sets():
    np.random.shuffle(training)
    for i in range(0, full_size):
        training_input[i] = training[i][:-1]
        training_output[i] = training[i][-1]
    for i in range(0, full_size):
        normTraining = np.linalg.norm(training[i][:-1])
        training_input[i] = training[i][:-1] / normTraining
    X_train, X_test, y_train, y_test = train_test_split(training_input, training_output, test_size = 0.20)
    return X_train, X_test, y_train, y_test

### Training ###
def training_svm():
    for i in range(0, 10):
        print("Starting training: round ", (i + 1))
        tr_in, ts_in, tr_out, ts_out = create_sets()
        model_svc.fit(tr_in, tr_out)
        print("Getting score: round ", (i + 1))
        score = model_svc.score(ts_in, ts_out)
        print("Score: ", score)
        score_list.append(score)
    print("Mean score: ", np.mean(score_list))

training_svm()
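As an aside, the two loops in `create_sets` split each row into features and label and then divide the features by their Euclidean norm. A minimal sketch of the same steps in vectorized form follows. The array `data` is a random stand-in for the CSV contents, and the names `features`, `labels`, `row_norms` and `unit_features` are illustrative only, not part of the code above.

```python
import numpy as np

# Stand-in for the CSV contents (assumption): 5 rows, 68 features + 1 label column.
rng = np.random.default_rng(0)
data = np.hstack([
    rng.random((5, 68)) + 0.1,           # strictly positive "distances"
    rng.integers(0, 7, size=(5, 1)),     # hypothetical integer labels
])

# Split every row at once instead of looping over indices.
features = data[:, :-1]
labels = data[:, -1]

# Euclidean norm of each row, kept as a column so it broadcasts across the 68 features.
row_norms = np.linalg.norm(features, axis=1, keepdims=True)
unit_features = features / row_norms
```

After this, every row of `unit_features` has unit length, which is the same result the second loop produces row by row.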
I don't know why, but when I try to train the model I get this error:
File "sklearn\svm\libsvm.pyx", line 54, in sklearn.svm.libsvm.fit
ValueError: Buffer dtype mismatch, expected 'float64_t' but got 'float'
Can anyone explain what I am missing? Thanks in advance for your help.
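In Cython buffer errors of this kind, `float` refers to a single-precision C float, while `float64_t` is a double, so the message suggests that some array reaching the compiled libsvm code was not float64. Whether that is the actual cause here is not established by the post. One way to narrow it down, as a minimal sketch: print the dtype of each array before calling `fit`, and cast explicitly to a C-contiguous float64 array. The helper `as_float64` and the random stand-in data below are assumptions for illustration, not part of the original code.

```python
import numpy as np

def as_float64(array):
    """Hypothetical helper: return `array` as a C-contiguous float64 ndarray."""
    return np.ascontiguousarray(array, dtype=np.float64)

# Stand-in data (assumption): float32 on purpose, to mimic a possible mismatch.
rng = np.random.default_rng(0)
features = rng.random((10, 68)).astype(np.float32)
labels = rng.integers(0, 7, size=10).astype(np.float32)

# Inspect the dtypes that would be handed to the estimator.
print("before:", features.dtype, labels.dtype)

# Cast explicitly so both arrays are double precision.
features64 = as_float64(features)
labels64 = as_float64(labels)
print("after: ", features64.dtype, labels64.dtype)
```

In the code from the question, the same check would apply to `tr_in` and `tr_out` right before `model_svc.fit(tr_in, tr_out)`. If those already report float64, the mismatch is more likely to come from somewhere else, for example the installed NumPy and scikit-learn builds, and the cast alone would not be expected to help.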