Question

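I want to use a neural network in R to predict car prices, with 144 independent variables. My code is below. Everything works except the last two lines: the AUC and the plot.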

This is the error I get:

Error in roc(predNN, yTEST) : Not enough distinct predictions to compute area under the ROC curve.

I have already converted the dependent variable to a factor, but the error persists. How can I solve this?

 allind <- sample(x=1:nrow(data_price2),size=nrow(data_price2))

 trainind <- allind[1:round(length(allind)/3)]
 valind <- allind[(round(length(allind)/3)+1):round(length(allind)*(2/3))]
 testind <- allind[round(length(allind)*(2/3)+1):length(allind)]

 BasetableTRAIN <- data_price2[trainind,]
 BasetableVAL <- data_price2[valind,]
 Basetablebig <-rbind(BasetableTRAIN,BasetableVAL)
 BasetableTEST <- data_price2[testind,]

 #Create a separate response variable
 yTRAIN <- BasetableTRAIN$Price
 BasetableTRAIN$Price <- NULL

 yVAL <- BasetableVAL$Price
 BasetableVAL$Price <- NULL

 yTEST <- BasetableTEST$Price
 BasetableTEST$Price <- NULL

 yBIG <- Basetablebig$Price
 Basetablebig$Price <- NULL

 yTRAIN <- as.factor(yTRAIN)
 yVAL <- as.factor(yVAL)
 yTEST <- as.factor(yTEST)
 yBIG <- as.factor(yBIG)

 if (require("nnet")==FALSE) install.packages("nnet") ; library(nnet)
 if (require("AUC")==FALSE) install.packages("AUC") ; library(AUC)

 size <- 5 #number of units in the hidden layer
 decay <- 0.1 #weight decay. Same as lambda in regularized LR. Controls for
              #overfitting.
 rang <- 0.5 #the range of the initial random weights parameter
 maxit <- 2000 #set high in order not to run into early stopping 

 NN <- nnet(yBIG ~ ., Basetablebig, size = size, 
       rang = rang, decay = decay, maxit = maxit,MaxNWts= Inf)

 predNN <- as.numeric(predict(NN,BasetableTEST,type="raw"))
 AUC::auc(roc(predNN,yTEST))
 plot(roc(predNN,yTEST))

Answer 1

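Most likely you are running into a problem caused by a poor model. Examine the model's predictions: with a probability threshold of 0.5, you may be getting all 0s or all 1s. Neural networks are very sensitive to differences in scale between columns, so standardizing the data [mean = 0, std = 1] is good practice. I suggest using the R function scale(). Please provide data so that your problem can be reproduced.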
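A minimal sketch of both suggestions, assuming numeric predictors only. The data frame `toy` and the helper `summarize_predictions()` are hypothetical stand-ins, since the asker's `data_price2` was not posted. It standardizes with the training set's mean and standard deviation, reuses those values on the test set, and then counts how many distinct predictions a model returned.

```r
# Hypothetical toy data standing in for the asker's data_price2 (not provided)
set.seed(42)
n <- 90
toy <- data.frame(
  mileage = runif(n, 0, 200000),  # column on a large scale
  age     = runif(n, 0, 15),      # column on a small scale
  power   = rnorm(n, 100, 20)
)

train <- toy[1:60, ]
test  <- toy[61:90, ]

# Standardize the training predictors to mean 0 and sd 1
train_scaled <- scale(train)

# Reuse the training centers and scales on the test set,
# so that both sets go through the same transformation
centers <- attr(train_scaled, "scaled:center")
scales  <- attr(train_scaled, "scaled:scale")
test_scaled <- scale(test, center = centers, scale = scales)

# Hypothetical helper: how many distinct values did the model predict?
summarize_predictions <- function(pred) {
  list(n_unique = length(unique(pred)), range = range(pred))
}

# A degenerate model that always predicts the same value
constant_pred <- rep(1, nrow(test))
summarize_predictions(constant_pred)$n_unique
```

With the objects from the question, the same check would be `summarize_predictions(predNN)`. A single distinct value would match the error message. It may also be worth checking `nlevels(yTEST)`: calling `as.factor()` on a continuous price yields one level per distinct price, whereas a ROC curve is defined for a two-class outcome.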
