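I want to use the neuralnet package to predict new data stored in a CSV file. What strikes me as odd is that the output gives the same constant value for every predicted row. I am not sure whether I am doing this correctly. The accuracy of my network is 95.3%.
My training data has the following structure: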
> head(dataset)
POS Freq MAF Coverage protdesc Rendu
1 212812097 40.5 1 3994 1 NO
...
3 212812095 50.5 0 4000 0 YES
My class variable Rendu has two values, NO or YES, and I want to predict Rendu for the rows in my CSV file, which has the following structure:
POS Freq MAF Coverage protdesc
1 25398280 24.4 0 1962 0
Here is my code:
library(neuralnet)
library(caret)
library(reshape2)
library(ggplot2)
library(stringr)
# define the filename
filename <- "/home/rico/Documents/TrainingMutNGS.csv"
# load the CSV file from the local directory
dataset <- read.csv(filename, header=FALSE)
# set the column names in the dataset
colnames(dataset) <- c("POS","Freq","MAF","Coverage","protdesc","Rendu")
str(dataset)
# Split into Train and Validation sets
# Training Set : Validation Set = 70 : 30 (random)
set.seed(100)
train <- sample(nrow(dataset), 0.7*nrow(dataset), replace = FALSE)
TrainSet <- dataset[train,]
ValidSet <- dataset[-train,]
summary(TrainSet)
summary(ValidSet)
dataset[,1:5] <- scale(dataset[,1:5])
dataset <- cbind(dataset, model.matrix(~ 0 + Rendu, dataset))
trainInds <- sample(1:dim(dataset)[1],8550)
#Get the indices for the test data
testInds <- setdiff(1:11400, trainInds)
#Get the variable names for the input and output
predictorVars <- names(dataset)[1:5]
outcomeVars <- names(dataset)[7:8]
#Paste together the formula
modFormula <- as.formula(paste(paste(outcomeVars, collapse = "+"), "~", paste(predictorVars, collapse = " + ")))
modFormula
MutNet <- neuralnet(formula = modFormula, data=dataset[trainInds,],hidden = c(4), act.fct = "logistic",linear.output = FALSE, lifesign = "minimal")
classes <- neuralnet::compute(MutNet, dataset[testInds,-c(6:8)])
classRes <- classes$net.result
nnClass <- apply(classRes, MARGIN = 1, which.max)
origClass <- apply(dataset[testInds, c(7:8)], MARGIN = 1, which.max)
paste("The classification accuracy of the network is", round(mean(nnClass == origClass) * 100, digits = 2), "%")
#Load csv file for prediction
library(readr)
Variants <- read.table(("/home/rico/Documents/test.csv"), sep=";", header=T, dec=".", fill=TRUE)
predi <- neuralnet::compute(MutNet,Variants[,1:5])
predi
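I know that all the values I included in the test file should be "YES", but the output of predi gives exactly the same values for every row, which seems strange to me. Any ideas?
$net.result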
[,1] [,2]
[1,] 0.953410406 0.04689899119
[2,] 0.953410406 0.04689899119
[3,] 0.953410406 0.04689899119
[4,] 0.953410406 0.04689899119
[5,] 0.953410406 0.04689899119
[6,] 0.953410406 0.04689899119
[7,] 0.953410406 0.04689899119
[8,] 0.953410406 0.04689899119
[9,] 0.953410406 0.04689899119
[10,] 0.953410406 0.04689899119
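
One thing that may be worth ruling out here: the network is trained on columns standardised with scale(), while Variants appears to be passed to compute() in raw units. The sketch below uses base R only and made-up numbers (the data frames train_raw and new_raw and the weight vector w are hypothetical, not taken from the real files or from MutNet). It only illustrates how a logistic unit can saturate to a constant on raw inputs, and how the centre and scale of the training data could be reused on new rows. It is an assumption that this is what happens in the code above, not a confirmed diagnosis.

```r
# Toy training predictors (made-up values; column names follow the post)
train_raw <- data.frame(
  POS      = c(212812097, 212812095, 25398280, 25398284),
  Freq     = c(40.5, 50.5, 24.4, 31.0),
  MAF      = c(1, 0, 0, 1),
  Coverage = c(3994, 4000, 1962, 2500),
  protdesc = c(1, 0, 0, 1)
)

# scale() keeps the column means and standard deviations as attributes
train_scaled <- scale(train_raw)
ctr <- attr(train_scaled, "scaled:center")
scl <- attr(train_scaled, "scaled:scale")

# New rows in raw units (hypothetical stand-in for the Variants data)
new_raw <- data.frame(
  POS      = c(25398280, 212812090),
  Freq     = c(24.4, 45.0),
  MAF      = c(0, 1),
  Coverage = c(1962, 3900),
  protdesc = c(0, 1)
)

# Standardise the new rows with the TRAINING centre and scale
new_scaled <- scale(new_raw[, names(ctr)], center = ctr, scale = scl)

# A single logistic unit with hypothetical weights, standing in for the network
sigmoid <- function(z) 1 / (1 + exp(-z))
w <- c(0.5, -0.3, 0.2, 0.1, -0.4)

raw_out    <- sigmoid(as.matrix(new_raw) %*% w)  # inputs in the millions: saturates
scaled_out <- sigmoid(new_scaled %*% w)          # inputs near zero: rows differ

print(raw_out)
print(scaled_out)
```

If this turns out to be the cause, the centre and scale would have to be captured from the training columns before dataset[,1:5] is overwritten, and then applied to Variants[,1:5] before calling compute().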