R neuralnet package: unable to train a neural network

Time: 2020-03-27 01:17:27

Tags: r machine-learning neural-network

I am trying to train a model on this data set using the neuralnet package. However, I get the following error, which I don't understand:

Error: the error derivative contains a NA; varify that the derivative function does not divide by 0 (e.g. cross entropy)

Here is my code:

library(neuralnet)
library(tidyverse)

framingham <- read_csv('https://courses.edx.org/assets/courseware/v1/7022cf016eefb6d3747447589423dab0/asset-v1:MITx+15.071x+3T2019+type@asset+block/framingham.csv',
                       col_types = cols(.default = 'i',sysBP = 'n', diaBP = 'n', BMI = 'n' ))
# Split data
set.seed(123); train_idx <- sample(nrow(framingham), 2/3 * nrow(framingham))
framingham_train <- framingham[train_idx, ]
framingham_test <- framingham[-train_idx, ]

# Binary classification
nn <- neuralnet(formula = TenYearCHD ~ ., data = framingham_train,
                hidden=c(3,2),
                act.fct = "tanh",
                stepmax = 1e8,
                err.fct = 'ce',
                linear.output = TRUE)

I tried changing the error function and other settings, but nothing seems to help.

1 Answer:

Answer 0 (score: 0)

Some of the columns contain NAs:

summary(framingham)
      male             age          education     currentSmoker   
 Min.   :0.0000   Min.   :32.00   Min.   :1.000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:42.00   1st Qu.:1.000   1st Qu.:0.0000  
 Median :0.0000   Median :49.00   Median :2.000   Median :0.0000  
 Mean   :0.4292   Mean   :49.58   Mean   :1.979   Mean   :0.4941  
 3rd Qu.:1.0000   3rd Qu.:56.00   3rd Qu.:3.000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :70.00   Max.   :4.000   Max.   :1.0000  
                                  NA's   :105                     
   cigsPerDay         BPMeds        prevalentStroke     prevalentHyp   
 Min.   : 0.000   Min.   :0.00000   Min.   :0.000000   Min.   :0.0000  
 1st Qu.: 0.000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.0000  
 Median : 0.000   Median :0.00000   Median :0.000000   Median :0.0000  
 Mean   : 9.006   Mean   :0.02962   Mean   :0.005896   Mean   :0.3106  
 3rd Qu.:20.000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:1.0000  
 Max.   :70.000   Max.   :1.00000   Max.   :1.000000   Max.   :1.0000  
 NA's   :29       NA's   :53 
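
A more compact way to see which columns have missing values is to count them directly. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical stand-ins for `framingham`), so the same call should carry over to the real data:

```r
# Toy data frame standing in for framingham (values are made up)
toy <- data.frame(
  age       = c(39, 46, NA, 61),
  education = c(4, NA, 1, NA),
  BMI       = c(26.97, 28.73, 25.34, 28.58)
)

# Number of NAs in each column
na_counts <- colSums(is.na(toy))

# Show only the columns that actually contain NAs
print(na_counts[na_counts > 0])
```

On the real data, `colSums(is.na(framingham))` would give the per-column counts that `summary()` reports in its `NA's` rows.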

If you subset to the rows that have no NAs, it should work:

set.seed(123)
framingham = framingham[complete.cases(framingham),]
train_idx <- sample(nrow(framingham), 2/3 * nrow(framingham))
framingham_train <- framingham[train_idx, ]
framingham_test <- framingham[-train_idx, ]
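
Since dropping incomplete rows discards data, it may be worth checking how many rows are lost and confirming that no NAs remain before training. A minimal sketch on a made-up data frame (`toy` is hypothetical, not the Framingham data):

```r
# Toy data with missing values in different rows (values are made up)
toy <- data.frame(x = c(1, NA, 3, 4),
                  y = c(0, 1, NA, 1))

# TRUE only for rows with no NA in any column
keep <- complete.cases(toy)
toy_clean <- toy[keep, ]

# How many rows were dropped, and is anything still missing?
n_dropped <- nrow(toy) - nrow(toy_clean)
still_missing <- anyNA(toy_clean)
print(c(dropped = n_dropped, still_missing = still_missing))
```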

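Also, I don't think you can use tanh as the activation function with cross-entropy, since its outputs can be negative, so the version below uses logistic as the activation function instead (and sets linear.output = FALSE):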

nn <- neuralnet(formula = TenYearCHD ~ ., data = framingham_train,
                hidden = c(3, 2), act.fct = "logistic", rep = 3,
                err.fct = 'ce', linear.output = FALSE)
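
To illustrate why tanh and cross-entropy don't mix, here is a small base-R sketch. It assumes the usual binary cross-entropy formula, and the `cross_entropy` helper and the values in `z` are made up for illustration, not taken from the package. Cross-entropy takes the log of the output, so any output outside (0, 1) produces NaN:

```r
# Binary cross-entropy for target y and network output p
# (illustrative helper, assumed to mirror the usual 'ce' definition)
cross_entropy <- function(y, p) -(y * log(p) + (1 - y) * log(1 - p))

z <- c(-2, 0.5, 3)                # hypothetical pre-activation values

p_tanh     <- tanh(z)             # range (-1, 1): can be negative
p_logistic <- 1 / (1 + exp(-z))   # range (0, 1): always a valid probability

# log() of a negative number gives NaN (with a warning, suppressed here)
ce_tanh     <- suppressWarnings(cross_entropy(1, p_tanh))
ce_logistic <- cross_entropy(1, p_logistic)

print(ce_tanh)
print(ce_logistic)
```

The same reasoning applies to linear.output = TRUE: an unbounded linear output can also fall outside (0, 1), so for a cross-entropy error the output should go through the logistic function as well.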