Question

Hello, I have this code:

import pandas as pd
import numpy as np
import warnings
from sklearn import svm
warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")
from sklearn.model_selection import train_test_split

df = pd.read_csv("datatrain.csv" , sep="," ,encoding = 'windows-1250' )

df = df[['FEATURE1' ,  'FEATURE2' , 'FEATURE3' ,'LABEL']]

df.dropna(inplace=True)
print(df.head())

X = np.array(df.drop(columns=['LABEL']))
y = np.array(df['LABEL'])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

clf = svm.SVC(kernel="linear", C= 1.0)
clf.fit(X_train[:-500], y_train[:-500])    

accuracy = clf.score(X_test, y_test)

print("accuracy: ", accuracy)

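My dataset is quite large, more than 150K rows, but as you can see I am only using the first 500 rows. When I launch my code, the first print(df.head()) runs, but after that I only have a bouncing Python rocket in my dock and nothing happens.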

Can you tell me why this happens? Thank you!

Answer 1

You are using all rows except the last 500, not the first 500. It should be clf.fit(X_train[:500], y_train[:500]).
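To make the difference concrete, here is a minimal sketch that compares the two slices on a stand-in array. The row count of 105,000 is an assumption (roughly 70% of the 150K rows mentioned in the question), and the zero-filled array is only a placeholder for the real X_train.

```python
import numpy as np

# Placeholder for X_train: about 70% of ~150K rows, 3 feature columns (assumed sizes).
X_train = np.zeros((105_000, 3))

# A negative stop index counts from the end, so this keeps everything
# except the last 500 rows. This is what the question's code passes to fit().
all_but_last_500 = X_train[:-500]

# A positive stop index counts from the start, so this keeps only the first 500 rows.
first_500 = X_train[:500]

print(all_but_last_500.shape)  # (104500, 3)
print(first_500.shape)         # (500, 3)
```

With the negative index, the classifier is trained on over a hundred thousand rows. SVC training time grows at least quadratically with the number of samples, so the script is most likely still busy fitting rather than frozen.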

For a detailed explanation of how slice notation selects elements, see this answer.
