I am trying to implement KNN classification for some RMS_features that were extracted from sensor data. The labeled sensor data looks like this:
RMS_x RMS_y RMS_z RMS_euclidian labels
0.137221994086372451 0.141361458137922474 0.373367693426083891 0.422156809730974525 1
0.653967197231734354 0.523601431745291057 0.857427471986578205 1.19875494747598155 0
0.547301970096429224 0.510460963300706561 0.851980921284600901 1.13401116915058431 1
0.200317415034924756 0.137815296326320835 0.353579753893964288 0.429113930129869203 1
0.802069910360720617 0.752364652538367706 0.909861874144165417 1.42731122797950638 1
0.879041000013726426 0.746218766636731257 0.88728425792715937 1.45493260385191925 1
0.144637160351783728 0.117846411938445361 0.445677862167030925 0.483152607141023704 0
0.142457833655985133 0.0730350196404254831 0.287273765845172724 0.328868613593180703 0
0.0866202724953416131 0.0616184109162635982 0.266749047302988929 0.287149707309732383 1
0.839153663116914195 0.714433206853633651 0.785256227002287477 1.35322615235723642 0
0.112852384316477455 0.113895536346822021 0.298205076872631036 0.338576611298323393 1
1.03867993617356702 0.860906249377046295 0.826493656885982309 1.58212115367273398 1
1.08309298701834544 0.777872116663065438 0.107827834335941439 1.33783492638956725 0
0.269545256634713071 0.173020210546502379 0.396383770058648055 0.509618221610782407 0
2.82554170256769766 2.75559888003772846 2.72907654403846411 4.79842368740352843 0
0.956220220626555983 0.849082605233856036 1.16655931706066363 1.73094165732610805 0
0.393801166109265799 0.283932207763270439 0.591509176401210479 0.765231966661861884 0
0.809556622304495543 0.540659060535479075 0.909773758642383967 1.3324347775296399 0
My code for reading the data and running KNN on it is as follows:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.exceptions import NotFittedError


def file_read_in(filename):
    df = pd.read_csv(filename, sep='\t', low_memory=False, skiprows=0)  # use separator character e.g. '\t' ';'
    data = df.apply(lambda x: pd.to_numeric(x), axis=0)
    return data


def knn_alg(X_train, y_train, X_test, y_test, N):
    knn = KNeighborsClassifier(n_neighbors=N)
    knn.fit = (X_train, y_train)
    try:
        knn.predict(X_test)
    except NotFittedError as e:
        print(repr(e))
    # print(knn.predict(X_test))


def split_dataset(X, y):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1, stratify=y)
    return X_train, X_test, y_train, y_test


def main():
    filename_labeled = "labeled_session_data/out/labeled.csv"
    filename_unlabeled = "unlabeled_session_data/out/unlabeled.csv"
    # column_nr = 4
    data_labeled = file_read_in(filename_labeled)
    data_unlabeled = file_read_in(filename_unlabeled)
    X = data_labeled.drop(columns=['labels'])
    y = data_labeled['labels'].values
    X_train, X_test, y_train, y_test = split_dataset(X, y)
    n_neighbors = 3
    print("X_train " + "\n" + str(X_train))
    print("X_test " + "\n" + str(X_test))
    print("y_train " + "\n" + str(y_train))
    print("y_test " + "\n" + str(y_test))
    knn_alg(X_train, y_train, X_test, y_test, n_neighbors)


if __name__ == '__main__':
    main()
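First, I read the data from the CSV file into a pandas DataFrame. Then I separate the labels and split the dataset into training and test sets. In the last step, I want to check whether the fitted KNN model can predict my test set, but even though I fitted the data, the model raises an exception:
NotFittedError("This KNeighborsClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.",)
Am I fitting the data the wrong way? Thanks for your help.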
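Answer 0: (score: 1)
It looks like you never actually fit the KNeighborsClassifier. The line knn.fit = (X_train, y_train) assigns a tuple to the attribute fit instead of calling the fit method, so no training takes place (see, for example, the examples on the Scikit-learn website).
Try this: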
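To make the difference concrete, here is a minimal sketch that contrasts assigning to fit with calling it. The random data (X_demo, y_demo) is a hypothetical stand-in for the sensor data, and the variable names are chosen only for illustration:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in data: 20 samples, 4 features, alternating labels.
rng = np.random.default_rng(0)
X_demo = rng.random((20, 4))
y_demo = np.array([0, 1] * 10)

# Assignment: the instance attribute 'fit' now holds a tuple and
# shadows the method. The model is never trained.
broken = KNeighborsClassifier(n_neighbors=3)
broken.fit = (X_demo, y_demo)
try:
    broken.predict(X_demo)
    raised = False
except NotFittedError:
    raised = True

# Call: the method runs and the model is trained, so predict works.
fitted = KNeighborsClassifier(n_neighbors=3)
fitted.fit(X_demo, y_demo)
predictions = fitted.predict(X_demo)

print("assignment raised NotFittedError:", raised)
print("number of predictions after a real fit:", len(predictions))
```

Python does not complain about the assignment because estimator objects accept arbitrary attributes, which is why the mistake only shows up later in predict.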
def knn_alg(X_train, y_train, X_test, y_test, N):
    knn = KNeighborsClassifier(n_neighbors=N)
    knn.fit(X_train, y_train)
    try:
        knn.predict(X_test)
    except NotFittedError as e:
        print(repr(e))
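The function above discards the result of predict, so nothing is printed when it succeeds. As a possible follow-up, the sketch below returns the predictions together with the mean accuracy on the test set. The function name knn_predict_and_score and the random data are assumptions made for this example, not part of the original code:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def knn_predict_and_score(X_train, y_train, X_test, y_test, n_neighbors):
    """Fit a KNN classifier, then return test predictions and mean accuracy."""
    knn = KNeighborsClassifier(n_neighbors=n_neighbors)
    knn.fit(X_train, y_train)             # call the method, do not assign to it
    y_pred = knn.predict(X_test)
    accuracy = knn.score(X_test, y_test)  # fraction of correctly classified test samples
    return y_pred, accuracy


# Hypothetical stand-in data with the same shape as the question's features.
rng = np.random.default_rng(1)
X = rng.random((40, 4))
y = np.array([0, 1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)
y_pred, accuracy = knn_predict_and_score(X_train, y_train, X_test, y_test, 3)
print("predictions:", y_pred)
print("accuracy:", accuracy)
```

With the fit call in place, the try/except around predict is no longer needed, since NotFittedError is only raised for an untrained estimator.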