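I am writing a function that returns predicted values from a regression model fitted to survey data, separately for different subgroups (subpopulations). I use the svyglm function from the survey package.
My question is about the subset argument of svyglm. It uses non-standard evaluation, which, as I understand it, means it does not accept a column name as a string. I tried passing the bare column name, and I tried quoting it with enquo() and unquoting it with !!. Neither works. I also experimented with ensym() and expr(), without success.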
library(dplyr)
library(survey)
library(srvyr)
library(purrr)
library(rlang)
mtcars <- read.table("https://forge.scilab.org/index.php/p/rdataset/source/file/master/csv/datasets/mtcars.csv",
                     sep=",", header=TRUE)
mtcars_cplx <- mtcars %>% as_survey_design(id = cyl, weights = qsec)
carb <- c(1:8)
cyl <- c(4:8)
new_data <- expand.grid(carb, cyl)
colnames(new_data) <- c("carb", "cyl")
With enquo() and !!:
subpop_pred <- function(formula, data, subpop, new_data) {
  subpop_quo <- enquo(subpop)
  subpop_txt <- data$variables %>% select(!!subpop_quo) %>% colnames()
  for(i in min(data$variables[subpop_txt]):max(data$variables[subpop_txt])){
    reg <- svyglm(formula, data, subset=!!subpop_quo==i)
    pred <- predict(reg, newdata=new_data)
    if(exists("reg_end")==TRUE){
      pred <- cbind(new_data, pred, confint(pred))
      pred[subpop_txt] <- i
      reg_end <- rbind(reg_end, pred)
    } else {
      reg_end <- cbind(new_data, pred, confint(pred))
      reg_end[subpop_txt] <- i
    }
  }
}
subpop_pred(mpg ~ carb + cyl + carb*cyl,
            data=mtcars_cplx,
            new_data=new_data,
            subpop=gear)
Error: Base operators are not defined for quosures.
Do you need to unquote the quosure?
# Bad:
myquosure == rhs
# Good:
!!myquosure == rhs
Call `rlang::last_error()` to see a backtrace
8. stop(cnd)
7. abort(paste_line("Base operators are not defined for quosures.",
"Do you need to unquote the quosure?", "", " # Bad:", bad,
"", " # Good:", good, ))
6. Ops.quosure(subpop_quo, i)
5. eval(subset, model.frame(design), parent.frame())
4. eval(subset, model.frame(design), parent.frame())
3. svyglm.survey.design(formula, data, subset = !!subpop_quo ==
i)
2. svyglm(formula, data, subset = !!subpop_quo == i)
1. subpop_pred(mpg ~ carb + cyl + carb * cyl, data = mtcars_cplx,
new_data = new_data, subpop = gear)
With the bare column name:
subpop_pred <- function(formula, data, subpop, new_data) {
  subpop_quo <- enquo(subpop)
  subpop_txt <- data$variables %>% select(!!subpop_quo) %>% colnames()
  for(i in min(data$variables[subpop_txt]):max(data$variables[subpop_txt])){
    reg <- svyglm(formula, data, subset=subpop==i)
    pred <- predict(reg, newdata=new_data)
    if(exists("reg_end")==TRUE){
      pred <- cbind(new_data, pred, confint(pred))
      pred[subpop_txt] <- i
      reg_end <- rbind(reg_end, pred)
    } else {
      reg_end <- cbind(new_data, pred, confint(pred))
      reg_end[subpop_txt] <- i
    }
  }
}
subpop_pred(mpg ~ carb + cyl + carb*cyl, data=mtcars_cplx, new_data=new_data, subpop=gear)
Error in eval(subset, model.frame(design), parent.frame()) :
object 'gear' not found
5. eval(subset, model.frame(design), parent.frame())
4. eval(subset, model.frame(design), parent.frame())
3. svyglm.survey.design(formula, data, subset = subpop == i)
2. svyglm(formula, data, subset = subpop == i)
1. subpop_pred(mpg ~ carb + cyl + carb * cyl, data = mtcars_cplx,
new_data = new_data, subpop = gear)
Do you know how I can get this function to work?
Answer 0 (score: 1)
I was able to get the subset argument to work by combining expr() and rlang::eval_tidy().
The model line in the function would then read:
reg <- svyglm(formula, data = data,
subset = rlang::eval_tidy( expr( !!subpop_quo == i), data = data) )
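However, I don't know whether this approach is robust, or whether there is a simpler way to handle this with tidy eval. Working on this made me realize that the subset() function/argument is hard to use inside functions. :-P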
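To see what the eval_tidy() call contributes on its own, here is a minimal sketch outside of svyglm(). It assumes a plain data frame as the data mask (for a survey design that would be something like data$variables). The helper name subset_mask() and the toy data are made up for illustration and are not part of rlang or survey.

```r
library(rlang)

# Hypothetical helper: TRUE for the rows of `df` where column `col` equals `value`.
subset_mask <- function(df, col, value) {
  col_quo <- enquo(col)  # capture the bare column name passed by the caller
  # Build the expression `<col> == <value>` and evaluate it with `df` as data mask.
  eval_tidy(expr(!!col_quo == !!value), data = df)
}

# Toy data, invented for this example.
toy <- data.frame(gear = c(3, 4, 4, 5), mpg = c(21.4, 22.8, 24.4, 15.8))

mask <- subset_mask(toy, gear, 4)
mask          # FALSE TRUE TRUE FALSE
toy[mask, ]   # the two rows with gear == 4
```

The result is an ordinary logical vector, which is the kind of value a subset argument ends up with once its expression has been evaluated.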
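Answer 1 (score: 0)
Since svyby() does not seem to support svyglm(), I'm not sure whether there is a better way. Here, quo_squash() is used to pass the expression into subset().
This can be extended to produce predictions.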
gears = unique(mtcars$gear)
lapply(gears, function(x) {
  subset(mtcars_cplx, !!quo_squash(gear == x)) %>%
    svyglm(mpg ~ carb + cyl + carb*cyl, design = .)
})
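One way the extension to predictions might look, as a rough sketch rather than a tested solution. It makes several assumptions that are not in the original post: it uses the built-in mtcars data, builds the design with survey::svydesign() directly, uses a simplified formula without the interaction term, and selects rows with design[keep, ] and a column name given as a string instead of calling subset(). The helper predict_one() is hypothetical.

```r
library(survey)

# Assumption: built-in mtcars with the same id/weights choice as in the question.
design <- svydesign(ids = ~cyl, weights = ~qsec, data = mtcars)
new_data <- expand.grid(carb = 1:8, cyl = 4:8)

# Hypothetical helper: fit the model within one level of the subpopulation
# column and return predictions plus confidence intervals for `new_data`.
predict_one <- function(level, formula, design, subpop_name, new_data) {
  keep <- design$variables[[subpop_name]] == level  # plain logical vector, no NSE
  fit  <- svyglm(formula, design = design[keep, ])
  pred <- predict(fit, newdata = new_data)
  out  <- cbind(new_data, pred = as.numeric(pred), confint(pred))
  out[[subpop_name]] <- level
  out
}

gears <- sort(unique(design$variables$gear))
res <- do.call(rbind, lapply(gears, predict_one,
                             formula = mpg ~ carb + cyl,  # simplified for the sketch
                             design = design,
                             subpop_name = "gear",
                             new_data = new_data))
head(res)
```

Passing the subpopulation column as a string and indexing the design by rows sidesteps non-standard evaluation altogether, at the cost of not accepting a bare column name.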