Extremely low model accuracy and incorrect statistics in a Keras neural network

Time: 2020-07-21 21:28:56

Tags: python r tensorflow machine-learning keras

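I wrote a neural network model in Keras/TensorFlow that computes the deviance of random binomial data sets, then takes the mean and variance of those deviances across data sets. The variance should be about twice the mean deviance (the deviance here being a log-likelihood ratio), but my model performs poorly. The same setup works well with neuralnet using RProp, but it does not work in Keras with rmsprop (or SGD). In addition, my model's accuracy never exceeds 50%.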
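For reference, the deviance that the loop in the code below stores for each data set can be written out as follows. The notation is introduced here for illustration and is not part of the original code: \hat{p}_j is the network's predicted probability for observation j, n_1 and n_0 are the numbers of ones and zeros in y, and n = n_1 + n_0. The second bracketed term is the log-likelihood of an intercept-only model. The "variance is about twice the mean" check presumably rests on the chi-squared distribution, whose variance is twice its mean.

```latex
D = -2\left[\sum_{j=1}^{n}\Bigl(y_j\log\hat{p}_j + (1-y_j)\log(1-\hat{p}_j)\Bigr)
      - \bigl(n_1\log n_1 + n_0\log n_0 - n\log n\bigr)\right]

X \sim \chi^2_k \;\Longrightarrow\; \mathbb{E}[X] = k,\quad \operatorname{Var}(X) = 2k,\quad
\frac{\operatorname{Var}(X)}{\mathbb{E}[X]} = 2
```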

Reproducible code:

library(reticulate)
library(keras)
library(tensorflow)
# library(caret)

use_python("C:/mini/envs/aiml3", required=T)
Sys.setenv(RETICULATE_MINICONDA_PATH = "C:/mini/envs/aiml3")

### DATA CREATION
input <- 2      #number of inputs into NN 
n <- 2000       #number of observations
ndata <- 500    #number of simulations

nvar <- input + 1 #number of inputs (x) plus one (y) e.g. 5+1=6 for formula y~sum(x_n)
row.names <- c(1:n)
column.names <- c(1:nvar)  
matrix.names <- c(1:ndata)

datas <- array(0,dim=c(n,nvar,ndata), dimnames = list(row.names, column.names, matrix.names))

for(i in 1:ndata){
  datas[,,i] <- rbinom(n*nvar,1,0.5)   #fills the n x nvar slice column by column; no dim() reset needed
}

result <- matrix(nrow = ndata, ncol = 1) 

colnames(result) <- c("D")


# source_python("C:\\mini\\envs\\aiml3\\Lib\\site-packages\\tensorflow_core\\python\\keras\\optimizer_v2\\rprop.py")
# myopt = RProp(name="rprop")             #attempt to use rprop optimizer

model <- keras_model_sequential() %>%
  layer_dense(units = 2, activation = "sigmoid", input_shape = c(2)) %>% #hidden layer: 2 inputs, 2 sigmoid units
  layer_dense(units = 1, activation = "sigmoid")                         #output: predicted probability

model %>% compile(
  optimizer = "rmsprop",            #is this the best choice?
  loss = "binary_crossentropy",     #^
  metrics = c("accuracy")
)


for(i in 1:20){                     #change to 1:ndata, 20 is just for testing
  
  y <- datas[,nvar,i]
  n1 <- sum(y)                      #number of ones in y (column nvar of the ith matrix)
  n0 <- n-n1
  
  seed <- 1020 + i
  set.seed(seed)
  
  currentmatrix <- datas[,,i]
  
  trainee <- currentmatrix[,-nvar]  #inputs only: drop the response column
  
  y <- as.matrix(y)
  
  history <- model %>% fit(
    trainee,
    y,
    epochs = 1,
    batch_size = 2000
  )
  
  predictions <- model %>% predict(trainee) #trainee is currentmatrix[,1:(nvar-1)]
  
  argument <- y*log(predictions[,1])+(1-y)*log(1-predictions[,1])

  result[i] <- -2*((sum(argument))-(n1*log(n1)+n0*log(n0)-n*log(n))) #calculate and store deviance
  
}

df <- mean(result[,1], na.rm = TRUE)          #mean deviance
print(var(result[,1], na.rm = TRUE))
print(df)
print(var(result[,1], na.rm = TRUE)/df)       #var/df should be ~2

####output####
print(n)
print(df)
print(var(result[,1], na.rm = T)) 
print(summary(result[,1]))
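
As a reference point for the var/mean check above, here is a minimal base R sketch that swaps the network for a plain logistic regression fitted with `glm`. This is a stand-in, not the Keras model: it only illustrates what the ratio looks like when each fit reaches its maximum likelihood. The names `nsims`, `lr`, `mean_lr` and `var_lr` are made up for this example, and the sizes are smaller than in the code above so that it runs quickly. Note the sign convention: `null.deviance - deviance` is twice the log-likelihood gain over the intercept-only model, which appears to be the negative of `result[i]` as written in the loop above.

```r
set.seed(1020)
n     <- 500   # observations per simulated data set (assumed, smaller than above)
nsims <- 300   # number of simulated data sets (assumed)

lr <- numeric(nsims)
for (i in seq_len(nsims)) {
  # Two binary inputs and a binary response, all independent Bernoulli(0.5)
  x1 <- rbinom(n, 1, 0.5)
  x2 <- rbinom(n, 1, 0.5)
  y  <- rbinom(n, 1, 0.5)

  # Logistic regression with two slopes plus an intercept
  fit <- glm(y ~ x1 + x2, family = binomial())

  # Likelihood-ratio statistic against the intercept-only model
  lr[i] <- fit$null.deviance - fit$deviance
}

mean_lr <- mean(lr)
var_lr  <- var(lr)
print(c(mean = mean_lr, var = var_lr, ratio = var_lr / mean_lr))
```

With two extra parameters over the null model, the statistic should be roughly chi-squared with 2 degrees of freedom, so the mean should land near 2 and the ratio near 2, up to simulation noise.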

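TL;DR: the neural network's accuracy is poor and the statistics come out wrong. This happens only in Keras, and I am not sure why. I suspect the optimizer, the activation, or something simple.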
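
One thing worth checking on the accuracy side: in the data creation step, the response column is drawn independently of the two inputs, so there may be little for any model to learn. The sketch below, in base R only, computes the best in-sample accuracy that any function of two binary inputs could reach on one such data set, by predicting the majority class within each of the four input patterns. The names `cell`, `cell_rate`, `pred` and `best_acc` are made up for this example.

```r
set.seed(1020)
n  <- 2000
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.5)
y  <- rbinom(n, 1, 0.5)   # response drawn independently of x1 and x2

# The four possible input patterns: (0,0), (1,0), (0,1), (1,1)
cell <- interaction(x1, x2)

# Best possible in-sample rule: predict the majority class within each pattern
cell_rate <- tapply(y, cell, mean)
pred <- as.integer(cell_rate[as.character(cell)] > 0.5)

best_acc <- mean(pred == y)
print(best_acc)
```

If this ceiling sits only slightly above 0.5, an accuracy near 50% would be expected from the data alone, independent of the optimizer or activation.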

0 answers:

No answers yet.