Question

In the car package, I'm trying to predict the response variable prestige in the data set Prestige from income, education, and the factor type, using the lm function. Before fitting the model, I want to scale education and income. If you copy and run the code below in RStudio, the console reports Error: variables ‘income’, ‘I(income^2)’, ‘education’, ‘I(education^2)’ were specified with different types from the fit.

library(car)
summary(Prestige)
Prestige$education <- scale(Prestige$education)
Prestige$income <- scale(Prestige$income)
fit <- lm(prestige ~ income + I(income^2) + education + I(education^2)
          + income:education + type + type:income + type:I(income^2)
          + type:education + type:I(education^2) + type:income:education, Prestige)
summary(fit)
pred <- expand.grid(income = c(1000, 20000), education = c(10, 20), type = levels(Prestige$type))
pred$prestige.pred <- predict(fit, newdata = pred)
pred


It works fine if I don't scale the predictors, so the error must come from scaling before prediction. How can I fix this?

Answer 1

Note that scale() actually changes the class of your columns. See

class(car::Prestige$education)
# [1] "numeric"
class(scale(car::Prestige$education))
# [1] "matrix"   (R >= 4.0.0 prints "matrix" "array")

You can safely simplify those to numeric vectors. The dimension-stripping behavior of c() does that:

Prestige$education <- c(scale(Prestige$education))
Prestige$income <- c(scale(Prestige$income))

Then I was able to run your model with

fit <- lm(prestige ~ income + I(income^2) + education + I(education^2)
          + income:education + type + type:income + type:I(income^2) 
          + type:education + type:I(education^2)+ type:income:education,
          Prestige, na.action="na.omit")

and the prediction returned

   income education type prestige.pred
1    1000        10   bc    -1352364.5
2   20000        10   bc  -533597423.4
3    1000        20   bc    -1382361.7
4   20000        20   bc  -534229639.3
5    1000        10 prof      398464.2
6   20000        10 prof   155567014.1
7    1000        20 prof      409271.3
8   20000        20 prof   155765754.7
9    1000        10   wc    -7661464.3
10  20000        10   wc -3074382169.9
11   1000        20   wc    -7634693.8
12  20000        20   wc -3073902696.6
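The predictions above are enormous most likely because the grid is still in original units (income of 1000 or 20000, education of 10 or 20) while the model was fit on z-scores. A minimal sketch of one way to handle that: keep the centering and scaling values that scale() stores as attributes and apply them to the new data before calling predict(). The built-in mtcars data and all variable names here are stand-ins for illustration, not part of the original code.

```r
# Stand-in data: built-in mtcars instead of car::Prestige (assumption).
hp_scaled <- scale(mtcars$hp)

# scale() records the mean and standard deviation it used as attributes.
hp_center <- attr(hp_scaled, "scaled:center")
hp_sd     <- attr(hp_scaled, "scaled:scale")

# Fit on the scaled predictor, stored as a plain numeric vector.
dat <- data.frame(mpg = mtcars$mpg, hp = c(hp_scaled))
fit_scaled <- lm(mpg ~ hp + I(hp^2), data = dat)

# New observations given in original units ...
new_raw <- data.frame(hp = c(100, 200))

# ... are put on the same scale as the training data before predicting.
new_scaled <- data.frame(hp = (new_raw$hp - hp_center) / hp_sd)
pred_scaled <- predict(fit_scaled, newdata = new_scaled)
pred_scaled
```

For the Prestige model the same idea would apply to both income and education, each with its own stored center and scale.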

Also note that you can simplify your formula a bit with

fit<-lm(prestige ~ (income + I(income^2) + education + I(education^2))*type +
          income:education + type:income:education, Prestige, na.action="na.omit")

This uses * to create many of the interaction terms.
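As a quick check that the shorter formula expands to the same set of terms as the long one, the term labels of both can be compared without fitting anything. This is only a sketch; normalized_terms is a hypothetical helper written for this comparison, and it sorts the pieces of each interaction so that the order in which variables are written does not matter.

```r
f_long <- prestige ~ income + I(income^2) + education + I(education^2) +
  income:education + type + type:income + type:I(income^2) +
  type:education + type:I(education^2) + type:income:education

f_short <- prestige ~ (income + I(income^2) + education + I(education^2)) * type +
  income:education + type:income:education

# Hypothetical helper: expand a formula and return its term labels in a
# canonical form (variables within each interaction sorted, labels sorted).
normalized_terms <- function(f) {
  labels <- attr(terms(f), "term.labels")
  parts <- strsplit(labels, ":", fixed = TRUE)
  sort(vapply(parts, function(p) paste(sort(p), collapse = ":"), character(1)))
}

same_terms <- identical(normalized_terms(f_long), normalized_terms(f_short))
same_terms
```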

Answer 2

scale() adds attributes that seem to cause problems for lm(). Using

Prestige$education <- as.numeric(scale(Prestige$education))
Prestige$income <- as.numeric(scale(Prestige$income))

makes everything work.
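To see where the type mismatch comes from, one can inspect the data classes that lm() records for each variable at fit time; predict() compares the new data against them. The sketch below uses the built-in mtcars data as a stand-in for Prestige (an assumption for illustration) and catches the error rather than letting it stop the script.

```r
# Stand-in data: built-in mtcars rather than car::Prestige (assumption).
d_mat <- mtcars[, c("mpg", "hp")]
d_mat$hp <- scale(d_mat$hp)        # stored as a one-column matrix

d_vec <- mtcars[, c("mpg", "hp")]
d_vec$hp <- c(scale(d_vec$hp))     # stored as a plain numeric vector

fit_mat <- lm(mpg ~ hp, data = d_mat)
fit_vec <- lm(mpg ~ hp, data = d_vec)

# Classes recorded at fit time for each model variable.
cls_mat <- attr(terms(fit_mat), "dataClasses")
cls_vec <- attr(terms(fit_vec), "dataClasses")

# New data with an ordinary numeric column, as expand.grid() would produce.
new_hp <- data.frame(hp = c(-1, 0, 1))

# The matrix-based fit rejects the numeric column; keep the message instead of stopping.
res_mat <- tryCatch(predict(fit_mat, newdata = new_hp),
                    error = function(e) conditionMessage(e))
res_vec <- predict(fit_vec, newdata = new_hp)

cls_mat
cls_vec
res_mat
```

Both c() and as.numeric() drop the matrix dimensions, so either one leaves the column with the class that the prediction grid has.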
