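I have been trying to use the knn function to start making predictions, but when I run my code it throws this error:
Error in knn(data.frame(tr5_train), data.frame(tr5_test), cl = pred_train_labels, : 'train' and 'class' have different lengths
I checked that all of the datasets are data.frames, and I tried passing the labels as a vector, but neither helped.
Here is the code I am using: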
test_tr5_no_target<- test_tr5[-2]
tr5_train<- test_tr5_no_target[1:74475, , drop = FALSE]
tr5_test<- test_tr5_no_target[74476:93094, , drop = FALSE]
pred_train_labels<- test_tr5[1:74475, 2]
pred_test_labels<- test_tr5[74476:93094, 2]
#install.packages("class")
library(class)
##ensure all data is a dataframe
as.data.frame(tr5_train)
as.data.frame(tr5_test)
as.data.frame(pred_train_labels)
pred1<- knn(data.frame(tr5_train), data.frame(tr5_test), cl = pred_train_labels, k = 5)
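Keep in mind that the label, column 2, is a numeric target feature. I have researched this thoroughly but could not find what triggers the error. Is there something I am doing wrong?
Any help is much appreciated! (Unfortunately, I cannot share the data itself because of restrictions.)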
-Jose C.
Answer 0 (score: 1)
To answer your question directly: you want the labels (the cl argument, label in the code here) to be a vector, not a data frame. We can use the mtcars dataset to recreate your error:

library('tidyverse')
library('class')
set.seed(1)
x <- mtcars
target <- x[-1]
size <- floor(0.75 * nrow(x))
train_ind <- sample(seq_len(nrow(x)), size = size)
train <- x[train_ind, ]
test <- x[-train_ind, ]
label <- as.data.frame(x[1][train_ind, ]) #problem is here
test <- knn(train,test,cl = label, k = 5)
test
Error in knn(train, test, cl = label, k = 5) :
  'train' and 'class' have different lengths
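
The length mismatch can be checked directly. The following is a minimal sketch on the same mtcars split; the names label_df and label_vec are illustrative and not part of the code above. length() of a data frame counts its columns, so a one-column data frame of labels has length 1 however many rows it holds, and as far as I can tell that is the value knn compares against the number of training rows.

```r
set.seed(1)
x <- mtcars
train_ind <- sample(seq_len(nrow(x)), size = floor(0.75 * nrow(x)))
train <- x[train_ind, ]

# Illustrative names (assumptions, not from the original code)
label_df  <- as.data.frame(x[1][train_ind, ])  # one-column data frame
label_vec <- x[1][train_ind, ]                 # plain numeric vector

nrow(train)        # number of training rows
length(label_df)   # number of columns in the data frame
length(label_vec)  # number of elements in the vector
```

Only the vector version has a length equal to nrow(train), which is the condition the error message is complaining about.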
By letting the label be a vector, and then calling attributes on the new knn object, we can get the output:

train_ind <- sample(seq_len(nrow(x)), size = size)
train <- x[train_ind, ]
test <- x[-train_ind, ]
label <- x[1][train_ind, ] #NOT a dataframe
test <- knn(train,test,cl = label, k = 5, prob = TRUE)
attributes(test)
$`levels`
 [1] "10.4" "14.3" "14.7" "15"   "15.2" "15.8" "16.4" "17.3"
 [9] "17.8" "18.7" "19.2" "19.7" "21"   "21.4" "22.8" "24.4"
[17] "26"   "30.4" "32.4"

Browsing the examples in ??knn also shows this.
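
Applied to the code in the question, the same idea might look like the sketch below. It runs on made-up stand-in data, since the real test_tr5 is not available: the data frame toy, its column names, and the 80/20 row split are all hypothetical. It also assumes the label column ended up as a one-column data frame (for example because test_tr5 is a tibble, or through drop = FALSE), which the question does not confirm.

```r
library(class)

# Hypothetical stand-in for test_tr5; column 2 is the target
set.seed(42)
toy <- data.frame(f1 = rnorm(100),
                  target = rep(c(0, 1), times = 50),
                  f2 = rnorm(100))

features <- toy[-2]
train <- features[1:80, , drop = FALSE]
test  <- features[81:100, , drop = FALSE]

# [[ ]] returns the column itself, so the labels are a vector
# whether toy is a data.frame or a tibble
train_labels <- toy[[2]][1:80]
test_labels  <- toy[[2]][81:100]

# Guard against the length mismatch before calling knn
stopifnot(is.atomic(train_labels), length(train_labels) == nrow(train))

pred <- knn(train, test, cl = train_labels, k = 5)
length(pred)  # one prediction per test row
```

Note that knn is a classifier, so a numeric cl is treated as a set of classes and the result is a factor. That is why the levels in the mtcars output above are the mpg values shown as strings.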