Running KNN in R with categorical and numeric predictors

Time: 2018-08-14 13:36:56

Tags: r knn

I get the following error:

Error in knn(train.x, test.x, train.y, k = 1) : 
NA/NaN/Inf in foreign function call (arg 6)

In addition: Warning messages:
1: In knn(train.x, test.x, train.y, k = 1) : NAs introduced by coercion
2: In knn(train.x, test.x, train.y, k = 1) : NAs introduced by coercion
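A likely cause, sketched under assumptions: as far as I understand `class::knn`, it converts `train` and `test` with `as.matrix()` and then passes them to compiled code as doubles. A data frame that contains factor columns becomes a character matrix, and labels that are not numbers (such as "Diseases") turn into NA when coerced. The toy data frame `df` below is hypothetical and only mimics two columns of the real data; it uses base R only.

```r
# Hypothetical two-row data frame: one factor column, one numeric column
df <- data.frame(
  Reason = factor(c("Diseases", "Other")),
  Age    = c(-0.53, 2.09)
)

# A data frame with a factor column becomes a character matrix
m <- as.matrix(df)
is.character(m)

# Coercing that matrix to double turns the text labels into NA
# (this is where "NAs introduced by coercion" would come from)
num <- suppressWarnings(as.double(m))
sum(is.na(num))
```

If this is what happens, the two warnings would correspond to the two matrices (`train.x` and `test.x`), and the NA values would then trigger the "NA/NaN/Inf in foreign function call" error.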

The code that produces this error is:

test = 1:200
train.x = absentknn[-test,]
test.x = absentknn[test,]
train.y <- Target[-test]
test.y = Target[test]
set.seed(1)
knn.pred = knn(train.x,test.x,train.y,k=1)

The data frame has 740 rows and 20 columns, with Target as the dependent variable.

Here test = 1:200. Also, in case it helps: when I run the code below and then call dim() on the results, I get NULL.

train.y <- Target[-test]
test.y = Target[test]
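The NULL from dim() is probably not a symptom of the problem: a factor or plain vector has no dim attribute, so `dim()` returns NULL for it, and `length()` is the usual way to check its size. A minimal sketch with a hypothetical four-element `Target`:

```r
# Hypothetical stand-in for the real Target vector
Target <- factor(c("Below", "Above", "Below", "Above"))
test <- 1:2

train.y <- Target[-test]
test.y  <- Target[test]

dim(train.y)     # NULL, because a factor is one-dimensional
length(train.y)  # number of training labels
length(test.y)   # number of test labels
```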

A sample of the data (two rows, as printed by dput()) is as follows:

structure(list(Reason.for.absence = structure(c(3L, 1L), .Label = c("Diseases", 
"Other", "W/o ICD"), class = "factor"), Month.of.absence = structure(c(8L, 
8L), .Label = c("0", "1", "2", "3", "4", "5", "6", "7", "8", 
"9", "10", "11", "12"), class = "factor"), Day.of.the.week = structure(c(2L, 
2L), .Label = c("2", "3", "4", "5", "6"), class = "factor"), 
    Seasons = structure(c(1L, 1L), .Label = c("1", "2", "3", 
    "4"), class = "factor"), Disciplinary.failure = structure(1:2, .Label = c("0", 
    "1"), class = "factor"), Education = structure(c(1L, 1L), .Label = c("1", 
    "2", "3", "4"), class = c("ordered", "factor")), Son = structure(c(3L, 
    2L), .Label = c("0", "1", "2", "3", "4"), class = "factor"), 
    Social.drinker = structure(c(2L, 2L), .Label = c("0", "1"
    ), class = "factor"), Social.smoker = structure(c(1L, 1L), .Label = c("0", 
    "1"), class = "factor"), Pet = structure(c(2L, 1L), .Label = c("0", 
    "1", "2", "4", "5", "8"), class = "factor"), Target = structure(c(1L, 
    1L), .Label = c("Below THreshold", "Above Threshold"), class = "factor"), 
    Transportation.expense = c(1.01072476745574, -1.54333530271458
    ), Distance.from.Residence.to.Work = c(0.429265332324082, 
    -1.12093537978198), Service.time = c(0.101700985294323, 1.24198475980643
    ), Age = c(-0.532508283408938, 2.09144557686699), Work.load.Average.day = c(-0.817659381760693, 
    -0.817659381760693), Hit.target = c(0.638254115594372, 0.638254115594372
    ), Weight = c(0.851097236884886, 1.4720604661625), Height = c(-0.019033134875066, 
    0.975168263304808), Body.mass.index = c(0.775407774938871, 
    1.00875537699449)), .Names = c("Reason.for.absence", "Month.of.absence", 
"Day.of.the.week", "Seasons", "Disciplinary.failure", "Education", 
"Son", "Social.drinker", "Social.smoker", "Pet", "Target", "Transportation.expense", 
"Distance.from.Residence.to.Work", "Service.time", "Age", "Work.load.Average.day", 
"Hit.target", "Weight", "Height", "Body.mass.index"), row.names = 1:2, class = "data.frame")
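One possible workaround, given as a sketch rather than a definitive fix: expand the factor predictors into 0/1 indicator columns with `model.matrix()` and leave Target out of the predictor matrix, so that `knn()` receives purely numeric input. The data frame `toy` and its values are made up for illustration; only the column names echo the sample above. Whether Euclidean distance on indicator columns is appropriate for this data is a separate modelling question.

```r
library(class)  # provides knn()

set.seed(1)
n <- 40

# Hypothetical data: one categorical predictor, one scaled numeric predictor
toy <- data.frame(
  Reason.for.absence = factor(sample(c("Diseases", "Other", "W/o ICD"),
                                     n, replace = TRUE)),
  Age    = rnorm(n),
  Target = factor(sample(c("Below Threshold", "Above Threshold"),
                         n, replace = TRUE))
)

# Keep only the predictors, then build a numeric design matrix
# ("- 1" drops the intercept column)
predictors <- toy[, names(toy) != "Target"]
x <- model.matrix(~ . - 1, data = predictors)

test    <- 1:10
train.x <- x[-test, ]
test.x  <- x[test, ]
train.y <- toy$Target[-test]

knn.pred <- knn(train.x, test.x, train.y, k = 1)
length(knn.pred)  # one prediction per test row
```

In the original code, `absentknn[-test,]` appears to still contain the Target column, so the response would also need to be removed from `train.x` and `test.x` before calling `knn()`.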

0 answers:

No answers