Contrasts error: grouping by a factor, and the minus operator in `formula()` stops working

Asked: 2019-02-21 22:36:56

Tags: r dplyr formula lm

An error occurs when I use group_by() on a factor, even if that factor is subsequently removed from the model using the minus operator (-). My motivating example:

library(tidyverse)
df = mtcars %>% mutate(am = factor(am))
fits = df %>%
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - am), .)) # Returns the error

This gives the following error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

I get the same error if I filter() instead of grouping:

fit_am0 = df %>% 
  filter(am == 0) %>%
  lm(formula(mpg ~ . - am), .) # Returns the error

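It looks as though formula() does not properly detect the minus operator (- am) when the variable I am trying to remove is both a factor and the variable I group or filter on (i.e., the combination of the two). This is my guess, because the following examples all work fine: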

fits = mtcars %>% # `am` is numeric
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - am), .)) # No error
fit_am0 = df %>%
  filter(am == 0) %>%
  select(-am) %>% # `am` removed prior to running model
  lm(formula(mpg ~ .), .) # No error
fits2 = mtcars %>% 
  mutate(vs = factor(vs)) %>% # A non-grouped factor, later removed
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - vs), .)) # No error

Is this a bug? Or have I made a mistake in my motivating example?

1 Answer:

Answer 0: (score: 0)

I found a solution. Remove the factor in the data argument instead of in the formula argument (i.e., lm(formula = formula(mpg ~ .), data = select(., -am))).

library(tidyverse)
df = mtcars %>% mutate(am = factor(am))
fits = df %>%
  group_by(am) %>%
  do(fit = lm(
    formula(mpg ~ .), 
    select(., -am)
  )) # No error
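
The thread does not say why dropping the column works while `- am` does not. One possible reading, offered here as an assumption and not as a confirmed diagnosis: the minus operator removes `am` from the model terms, but `am` is still pulled into the model frame, and within a single group it has only one level left once unused levels are dropped (lm() builds its model frame with drop.unused.levels = TRUE). The sketch below checks each of those points using base R only; the names `sub`, `tt`, `mf` and `fit` are illustrative and do not come from the original post.

```r
# Illustrative check (assumption: the single-level factor in the model frame
# is what triggers the contrasts error).
df  <- transform(mtcars, am = factor(am))
sub <- subset(df, am == 0)   # same rows as filter(am == 0)

# 1. `- am` removes the term from the model ...
tt <- terms(mpg ~ . - am, data = sub)
"am" %in% attr(tt, "term.labels")

# 2. ... but the variable is still carried into the model frame,
#    where it ends up with a single level after unused levels are dropped.
mf <- model.frame(mpg ~ . - am, data = sub, drop.unused.levels = TRUE)
"am" %in% names(mf)
nlevels(mf$am)

# 3. Dropping the column from the data keeps `am` out of the model frame,
#    which mirrors the select(., -am) fix in the answer.
fit <- lm(mpg ~ ., data = sub[, names(sub) != "am"])
"am" %in% names(model.frame(fit))
```

If this reading is right, it would also explain why the numeric `am` and the non-grouped factor `vs` examples in the question run without error: neither puts a single-level factor into the model frame.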