Unquoting outside of a quasiquotation context

Question

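I am defining a function that returns predicted values from regression models fitted to survey data for different subsets (subpopulations). I use the svyglm function from the survey package.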

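My problem concerns the subset option of the svyglm function. Because it uses non-standard evaluation, my understanding is that it does not accept column names as strings. I tried passing the bare column name without quotes, and I tried quoting it (enquo) and unquoting it (!!). Neither option worked. I also played around with ensym() and expr(), but got nowhere.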

Data and libraries

library(dplyr)
library(survey)
library(srvyr)
library(purrr)
library(rlang)

mtcars <- read.table("https://forge.scilab.org/index.php/p/rdataset/source/file/master/csv/datasets/mtcars.csv",
                     sep=",", header=TRUE)

mtcars_cplx <- mtcars %>% as_survey_design(id = cyl, weights = qsec)

carb <- c(1:8)
cyl <- c(4:8)
new_data <- expand.grid(carb, cyl)
colnames(new_data) <- c("carb", "cyl")

With unquoting (!!)

Function and input

subpop_pred <- function(formula, data, subpop, new_data) {

  subpop_quo <- enquo(subpop)
  subpop_txt <- data$variables %>% select(!!subpop_quo) %>% colnames()

  for(i in min(data$variables[subpop_txt]):max(data$variables[subpop_txt])){
    reg <- svyglm(formula, data, subset=!!subpop_quo==i)
    pred <- predict(reg, newdata=new_data)

    if(exists("reg_end")==TRUE){
      pred <- cbind(new_data, pred, confint(pred))
      pred[subpop_txt] <- i
      reg_end <- rbind(reg_end, pred)
    } else {
      reg_end <- cbind(new_data, pred, confint(pred))
      reg_end[subpop_txt] <- i
    }
  }
}

subpop_pred(mpg ~ carb + cyl + carb*cyl, 
            data=mtcars_cplx, 
            new_data=new_data,
            subpop=gear)

Output/error

 Error: Base operators are not defined for quosures.
Do you need to unquote the quosure?

  # Bad:
  myquosure == rhs

  # Good:
  !!myquosure == rhs
Call `rlang::last_error()` to see a backtrace 
8. stop(cnd) 
7. abort(paste_line("Base operators are not defined for quosures.", 
    "Do you need to unquote the quosure?", "", "  # Bad:", bad, 
    "", "  # Good:", good, )) 
6. Ops.quosure(subpop_quo, i) 
5. eval(subset, model.frame(design), parent.frame()) 
4. eval(subset, model.frame(design), parent.frame()) 
3. svyglm.survey.design(formula, data, subset = !!subpop_quo == 
    i) 
2. svyglm(formula, data, subset = !!subpop_quo == i) 
1. subpop_pred(mpg ~ carb + cyl + carb * cyl, data = mtcars_cplx, 
    new_data = new_data, subpop = gear)

Without unquoting

Function and input

subpop_pred <- function(formula, data, subpop, new_data) {

  subpop_quo <- enquo(subpop)
  subpop_txt <- data$variables %>% select(!!subpop_quo) %>% colnames()

  for(i in min(data$variables[subpop_txt]):max(data$variables[subpop_txt])){
    reg <- svyglm(formula, data, subset=subpop==i)
    pred <- predict(reg, newdata=new_data)

    if(exists("reg_end")==TRUE){
      pred <- cbind(new_data, pred, confint(pred))
      pred[subpop_txt] <- i
      reg_end <- rbind(reg_end, pred)
    } else {
      reg_end <- cbind(new_data, pred, confint(pred))
      reg_end[subpop_txt] <- i
    }
  }
}

subpop_pred(mpg ~ carb + cyl + carb*cyl, data=mtcars_cplx, new_data=new_data, subpop=gear)

Output

Error in eval(subset, model.frame(design), parent.frame()) : 
  object 'gear' not found 
5. eval(subset, model.frame(design), parent.frame()) 
4. eval(subset, model.frame(design), parent.frame()) 
3. svyglm.survey.design(formula, data, subset = subpop == i) 
2. svyglm(formula, data, subset = subpop == i) 
1. subpop_pred(mpg ~ carb + cyl + carb * cyl, data = mtcars_cplx, 
    new_data = new_data, subpop = gear)

Do you know how to get this function to work?

Answer 1

I was able to get the subset argument to work by using a mix of expr() and rlang::eval_tidy().

The model line in the function would then read:

reg <- svyglm(formula, data = data, 
       subset = rlang::eval_tidy( expr( !!subpop_quo == i), data =  data) )

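However, I don't know whether this approach is robust, or whether there is some simpler way to handle this with tidyeval. Working on this made me realize that the subset() function/argument is hard to use inside functions. :-P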
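To make the mechanism a bit more concrete, here is a minimal sketch that runs on a plain data frame. `toy_subset()`, `naive()` and `tidy()` are hypothetical names; `toy_subset()` only imitates the `eval(subset, model.frame(design), parent.frame())` step visible in the traceback and is not the survey package's actual code. It uses the built-in `mtcars` data set.

```r
library(rlang)

# Toy stand-in for how `subset` appears to be handled (an assumption based on
# the traceback in the question, not the survey package's real implementation).
toy_subset <- function(data, subset) {
  cond <- substitute(subset)          # capture the unevaluated expression
  eval(cond, data, parent.frame())    # evaluate it with the columns in scope
}

# Forwarding the argument as-is: `subpop` is a promise for `gear`, which is
# looked up in the caller's environment and is not found there.
naive <- function(data, subpop, i) {
  toy_subset(data, subset = subpop == i)
}

# expr() is a quasiquotation context, so !! is honoured there; eval_tidy()
# then evaluates the condition against the data and returns a logical vector.
tidy <- function(data, subpop, i) {
  subpop_quo <- enquo(subpop)
  toy_subset(data, subset = eval_tidy(expr(!!subpop_quo == i), data = data))
}

err  <- tryCatch(naive(mtcars, gear, 4), error = function(e) conditionMessage(e))
keep <- tidy(mtcars, gear, 4)

print(err)        # error message from the naive version
print(sum(keep))  # number of rows with gear == 4
```

In this toy setting the subset argument ends up as an ordinary logical vector, so the capturing function never has to resolve the column name itself. Whether svyglm() behaves identically with a survey design object is not verified here.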

Answer 2

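Since svyby() doesn't seem to support svyglm(), I'm not sure whether there is a better way. Here, quo_squash() is used to pass the expression into subset(). This could be extended to do the predictions.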

gears = unique(mtcars$gear)
lapply(gears, function(x) {
  subset(mtcars_cplx, !!quo_squash(gear == x)) %>% 
    svyglm(mpg ~ carb + cyl + carb*cyl, design = .)
})
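As a side note on what `!!` does outside a quasiquotation context, here is a small sketch on a plain data frame with base `subset()`; the variable `g` is made up for illustration. In this setting `!!` is ordinary double negation and `quo_squash()` receives an already evaluated logical vector, so both calls select the same rows. Whether the same holds for the survey design object used above is an assumption, not something checked here.

```r
library(rlang)

g <- 4

# Plain condition, evaluated by base subset() via non-standard evaluation.
plain <- subset(mtcars, gear == g)

# Same condition wrapped as in the answer: outside a quasiquotation context
# `!!` is just `!` applied twice to the logical vector.
bang <- subset(mtcars, !!quo_squash(gear == g))

print(identical(plain, bang))
```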
