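I am trying to train a model on this data set using the neuralnet package. However, I get the following error, which I don't understand:
Error: the error derivative contains a NA; verify that the derivative function does not divide by 0 (e.g. cross entropy)
This is my code: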
library(neuralnet)
library(tidyverse)
framingham <- read_csv('https://courses.edx.org/assets/courseware/v1/7022cf016eefb6d3747447589423dab0/asset-v1:MITx+15.071x+3T2019+type@asset+block/framingham.csv',
col_types = cols(.default = 'i',sysBP = 'n', diaBP = 'n', BMI = 'n' ))
# Split data
set.seed(123); train_idx <- sample(nrow(framingham), 2/3 * nrow(framingham))
framingham_train <- framingham[train_idx, ]
framingham_test <- framingham[-train_idx, ]
# Binary classification
nn <- neuralnet(formula = TenYearCHD ~ ., data = framingham_train,
hidden=c(3,2),
act.fct = "tanh",
stepmax = 1e8,
err.fct = 'ce',
linear.output = TRUE)
I tried changing the error function and other details, but nothing seems to help.
Answer 0 (score: 0)
Some of the columns contain NAs:
summary(framingham)
      male             age          education     currentSmoker
 Min.   :0.0000   Min.   :32.00   Min.   :1.000   Min.   :0.0000
 1st Qu.:0.0000   1st Qu.:42.00   1st Qu.:1.000   1st Qu.:0.0000
 Median :0.0000   Median :49.00   Median :2.000   Median :0.0000
 Mean   :0.4292   Mean   :49.58   Mean   :1.979   Mean   :0.4941
 3rd Qu.:1.0000   3rd Qu.:56.00   3rd Qu.:3.000   3rd Qu.:1.0000
 Max.   :1.0000   Max.   :70.00   Max.   :4.000   Max.   :1.0000
                                  NA's   :105
   cigsPerDay         BPMeds        prevalentStroke     prevalentHyp
 Min.   : 0.000   Min.   :0.00000   Min.   :0.000000   Min.   :0.0000
 1st Qu.: 0.000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.0000
 Median : 0.000   Median :0.00000   Median :0.000000   Median :0.0000
 Mean   : 9.006   Mean   :0.02962   Mean   :0.005896   Mean   :0.3106
 3rd Qu.:20.000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:1.0000
 Max.   :70.000   Max.   :1.00000   Max.   :1.000000   Max.   :1.0000
 NA's   :29       NA's   :53
If you subset the data to the rows without any NAs, it should work:
set.seed(123)
framingham = framingham[complete.cases(framingham),]
train_idx <- sample(nrow(framingham), 2/3 * nrow(framingham))
framingham_train <- framingham[train_idx, ]
framingham_test <- framingham[-train_idx, ]
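To see which columns are responsible before dropping rows, a per-column NA count is more compact than summary(). Here is a minimal sketch on a small made-up data frame; toy and its values are hypothetical and only stand in for framingham:

```r
# Hypothetical toy data standing in for framingham
toy <- data.frame(
  age        = c(39, 46, 48, 61, 50),
  education  = c(4, NA, 1, 3, NA),
  cigsPerDay = c(0, 0, 20, NA, 0)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Keep only the rows with no missing values in any column
toy_complete <- toy[complete.cases(toy), ]
print(nrow(toy_complete))
```

The same two calls should work unchanged on framingham. Keep in mind that complete.cases() drops a row if any column is missing, so it is worth checking how many rows are left afterwards.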
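Also, I don't think you can use tanh as the activation with cross-entropy. Cross-entropy takes the log of the output and of one minus the output, so the output has to lie strictly between 0 and 1, whereas tanh ranges over (-1, 1) and linear.output = TRUE leaves the output unbounded. The call below therefore uses logistic as the activation function and sets linear.output = FALSE: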
nn <- neuralnet(formula = TenYearCHD ~ ., act.fct="logistic",rep = 3,
data = framingham_train,hidden=c(3,2), err.fct = 'ce',linear.output=FALSE)
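The point about the activation can be checked numerically without training anything. This is a minimal sketch, assuming the standard textbook form of binary cross-entropy; the helper cross_entropy is hypothetical and not part of neuralnet:

```r
# Binary cross-entropy for a single prediction (standard textbook form;
# hypothetical helper, not a neuralnet function)
cross_entropy <- function(y, p) -(y * log(p) + (1 - y) * log(1 - p))

p_logistic <- plogis(-2)  # logistic output, always strictly between 0 and 1
p_tanh     <- tanh(-2)    # tanh output, negative for negative inputs

ce_logistic <- cross_entropy(1, p_logistic)
# log() of a negative number gives NaN (with a warning, silenced here)
ce_tanh <- suppressWarnings(cross_entropy(1, p_tanh))

print(ce_logistic)  # a finite positive number
print(ce_tanh)      # NaN
```

With an output outside (0, 1) the error itself is already NaN, which is consistent with the complaint about NA values in the error derivative.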