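I am running a simulation-based power analysis in R. I use the functions plyr::rdply and lme4::glmer, running R through RStudio (0.98.932), to generate the data and fit the models, respectively (see the end of the reproducible example below for the R environment and package versions).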
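The procedure is to randomly generate a dataset for a given parameterization and fit a model to it. Every now and then, however, the model fails to converge. When that happens, the following warnings are issued
[1] "unable to evaluate scaled gradient"
[2] "Model failed to converge: degenerate Hessian with 1 negative eigenvalues"
and R enters Browse mode, so I have to intervene manually (e.g. press c) to return to the simulation loop. This is a real pain: I need to run thousands of iterations over several days, but every time this particular convergence failure occurs, everything stops until I press a key.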
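Is there any way to prevent R from entering Browse mode? I already store all the warnings that arise in each simulation, so the only problem is that I have to intervene manually whenever this particular convergence failure occurs. I have tried the purrr::quietly and purrr::safely functions, without success (see the code example below).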
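Here is an MWE that runs on my computer (I use set.seed for reproducibility, so I hope it gives the same result regardless of package versions etc.). The example follows the same logic as my actual simulations, but with a different and simpler parameterization: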
library(lme4)
library(plyr)
library(purrr)
# function to generate data that will lead to convergence failure
mini_simulator <- function() {
  nb_items <- 10  # observations per subject
  nb_subj <- 10   # subjects per group
  generate_data <- function() {
    A <- rbinom(nb_items * nb_subj, 1, .99)
    B <- rbinom(nb_items * nb_subj, 1, .8)
    simdata <- data.frame(
      Group = rep(c("A", "B"), each = nb_items * nb_subj),
      Subj = rep(1 : (nb_subj * 2), each = nb_items),
      Items = 1:nb_items,
      Response = c(A, B)
    )
  }
}
# Sanity check that the function is generating data appropriately.
# d should be a dataframe with 200 obs. of 4 variables
d <- mini_simulator()()
head(d, 3)
#   Group Subj Items Response
# 1     A    1     1        1
# 2     A    1     2        1
# 3     A    1     3        1
rm(d)
## Functions to fit model
# basic function to fit model on simulated data
fit_model <- function(data_sim) {
  fm <- glmer(
    formula = Response ~ Group + (1|Subj) + (1|Items),
    data = data_sim, family = "binomial")
  out <- data.frame(summary(fm)$coef)
  out
}
# similar but using purrr::quietly (also tried purrr::safely with no success)
# see http://r4ds.had.co.nz/lists.html section "Dealing with failure"
fit_model_quietly <- function(data_sim) {
  purrr_out <- purrr::quietly(glmer)(
    formula = Response ~ Group + (1|Subj) + (1|Items),
    data = data_sim, family = "binomial")
  fm <- purrr_out$result
  out <- data.frame(summary(fm)$coef)
  # keeps track of convergence failures and other warnings
  out$Warnings <- paste(unlist(purrr_out$warnings), collapse = "; ")
  out
}
# this seed creates the problematic convergence failure on the first evaluation
# of rdply
set.seed(2)
# When I run the next line R goes into Browse mode and I need to enter "c"
# in order to continue
simulations <- plyr::rdply(.n = 3, fit_model(mini_simulator()()))
simulations
# problem persists using the quietly adverb from purrr
set.seed(2)
simulations <- plyr::rdply(.n = 3, fit_model_quietly(mini_simulator()()))
simulations
# sessionInfo()
# R version 3.1.2 (2014-10-31)
# Platform: i386-w64-mingw32/i386 (32-bit)
#
# locale:
# [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 LC_MONETARY=Swedish_Sweden.1252
# [4] LC_NUMERIC=C LC_TIME=Swedish_Sweden.1252
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] purrr_0.2.1 plyr_1.8.1 lme4_1.1-8 Matrix_1.1-4
#
# loaded via a namespace (and not attached):
# [1] grid_3.1.2 lattice_0.20-29 magrittr_1.5 MASS_7.3-35 minqa_1.2.4 nlme_3.1-118 nloptr_1.0.4
# [8] Rcpp_0.11.3 splines_3.1.2 tools_3.1.2
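On both of my computers, options("error") returns
(function() { .rs.breakOnError(TRUE) })()
This appears to be some kind of RStudio default, and it does seem to make R enter Browse mode whenever a stop() call is encountered (I see that it can be changed in the GUI through the menu Debug > On Error > ...). In any case, the problem disappears when I set options(error = NULL). Here is the new (simplified) example, which works fine, both in this minimal example and in the actual simulations: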
library(lme4)
library(plyr)
library(purrr)
options(error=NULL)
## Function to generate data
# Generates data that will lead to convergence failure
mini_simulator <- function() {
  nb_items <- 10  # observations per subject
  nb_subj <- 10   # subjects per group
  generate_data <- function() {
    A <- rbinom(nb_items * nb_subj, 1, .99)
    B <- rbinom(nb_items * nb_subj, 1, .8)
    simdata <- data.frame(
      Group = rep(c("A", "B"), each = nb_items * nb_subj),
      Subj = rep(1 : (nb_subj * 2), each = nb_items),
      Items = 1:nb_items,
      Response = c(A, B)
    )
  }
}
## Function to fit model
# Fits model on simulated data with purrr::quietly to capture warnings
# (http://r4ds.had.co.nz/lists.html section "Dealing with failure")
fit_model_quietly <- function(data_sim) {
  purrr_out <- purrr::quietly(glmer)(
    formula = Response ~ Group + (1|Subj) + (1|Items),
    data = data_sim, family = "binomial")
  fm <- purrr_out$result
  out <- data.frame(summary(fm)$coef)
  # keeps track of convergence failures and other warnings
  out$Warnings <- paste(unlist(purrr_out$warnings), collapse = "; ")
  out
}
# this seed creates the problematic convergence failure on the first evaluation
# of rdply
set.seed(2)
simulations <- plyr::rdply(.n = 3, fit_model_quietly(mini_simulator()()))
simulations
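If switching the option off for the whole session feels too broad, one possible variation is to disable the error handler only while the simulation runs and to restore the previous value afterwards. The sketch below uses base R only; with_error_handler_off is a hypothetical helper name introduced for illustration, not part of any package.

```r
# Hypothetical helper: evaluate `expr` with options(error = NULL) and put
# back whatever handler was installed before (e.g. RStudio's) on exit.
with_error_handler_off <- function(expr) {
  old <- options(error = NULL)       # options() returns the previous values
  on.exit(options(old), add = TRUE)  # restore them even if `expr` fails
  expr  # lazily evaluated here, i.e. after the handler has been disabled
}
```

Under these assumptions, the simulation call would become with_error_handler_off(plyr::rdply(.n = 3, fit_model_quietly(mini_simulator()()))).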
Answer 0 (score: 2)
This sounds like a combination of options that makes warnings and errors drop straight into the browser debugger.
func <- function(type = "none") {
  if (type == "warning") {
    warning("impending doom")
  } else if (type == "error") {
    stop("doom")
  }
  type
}
func()
# [1] "none"
func("warning")
# Warning in func("warning") : impending doom
# [1] "warning"
func("error")
# Error in func("error") (from #5) : doom
Two options are relevant here: warn and error. See ?options for details.
options("warn")
# $warn
# [1] 1
This converts warnings into errors:
options(warn=2)
func("warning")
# Error in func("warning") (from #3) : (converted from warning) impending doom
Now, with the error option, we can act on that error:
options(warn=1, error=browser)
func("warning")
# Warning in func("warning") : impending doom
# [1] "warning"
func("error")
# Error in func("error") (from #5) : doom
# Browse[1]> c
So, converting warnings to errors and catching the errors:
options(warn=2, error=browser)
func("warning")
# Error in func("warning") (from #3) : (converted from warning) impending doom
# Browse[1]> c
I believe this is what is happening to you.
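Whatever the options are set to, a condition that is caught by a handler is no longer an uncaught top-level error, so the error option should not come into play. Here is a minimal base R sketch of that idea. run_once is a hypothetical helper name, and func is redefined so that the block is self-contained. Note that, unlike purrr::quietly, tryCatch abandons the computation when a warning is caught, so the result of that run is lost and only the message is kept.

```r
# Same toy function as above, repeated so the block runs on its own.
func <- function(type = "none") {
  if (type == "warning") warning("impending doom")
  if (type == "error") stop("doom")
  type
}

# Hypothetical helper: run f(...) once and record the first warning or
# error message instead of letting the condition reach the top level.
run_once <- function(f, ...) {
  tryCatch(
    list(result = f(...), problem = NA_character_),
    warning = function(w) list(result = NULL, problem = conditionMessage(w)),
    error   = function(e) list(result = NULL, problem = conditionMessage(e))
  )
}

str(run_once(func))             # normal run: result kept, no problem
str(run_once(func, "warning"))  # warning recorded, result dropped
str(run_once(func, "error"))    # error recorded, loop can continue
```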
Purrr
As for why purrr::quietly appears to do something, I can confirm that it bypasses or ignores the expected escalation from warning to error:
library(purrr)
quietfunc <- quietly(func)
str(quietfunc("warning"))
# List of 4
# $ result : chr "warning"
# $ output : chr ""
# $ warnings: chr "impending doom"
# $ messages: chr(0)
options(warn=2, error=browser)
str(quietfunc("warning")) # no browser!
# List of 4
# $ result : chr "warning"
# $ output : chr ""
# $ warnings: chr "impending doom"
# $ messages: chr(0)
str(quietfunc("error")) # yes browser
# Error in .f(...) (from #5) : doom
# Browse[1]> c
safely, on the other hand, does not capture a plain warning raised inside the function (nor is it meant to):
options(warn=1)
safefunc <- safely(func)
str(safefunc("warning")) # warning is not "captured" by purrr::safely
# Warning in .f(...) : impending doom
# List of 2
# $ result: chr "warning"
# $ error : NULL
Perhaps this is a bug?
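One possible explanation that does not require a bug: R consults calling handlers before its default warning handling, and the conversion triggered by options(warn = 2) is part of that default handling. A wrapper that records the warning and then invokes the "muffleWarning" restart therefore never lets the warning reach the conversion step. The base R sketch below reproduces the effect, on the assumption that purrr::quietly works along these lines; capture_warnings is a hypothetical stand-in written for illustration, not purrr's actual code.

```r
# Same toy function as above, repeated so the block runs on its own.
func <- function(type = "none") {
  if (type == "warning") warning("impending doom")
  if (type == "error") stop("doom")
  type
}

# Hypothetical stand-in for a quietly-style wrapper: collect warning
# messages with a calling handler, then muffle each warning so that the
# default handling (including the warn = 2 conversion) is skipped.
capture_warnings <- function(f) {
  function(...) {
    warnings <- character()
    result <- withCallingHandlers(
      f(...),
      warning = function(w) {
        warnings <<- c(warnings, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    )
    list(result = result, warnings = warnings)
  }
}

old <- options(warn = 2)              # warnings would normally become errors
str(capture_warnings(func)("warning"))
options(old)                          # restore the previous setting
```

Because the warning is muffled rather than converted, the wrapped function runs to completion and its result is kept alongside the recorded message.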