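I have insurance data with the columns age, sex, bmi, children, smoker, region, and charges. Sex, smoker, and region are factors. sex: male, female; smoker: yes, no; region: northeast, southeast, southwest, northwest.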
m2 <- lm(charges ~ age + sex + bmi + children + smoker + region)
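After fitting a linear regression model to the data, I need to predict charges for: male, age = 40, bmi = 30, smoker = yes, region = northwest. After reading in the data, I tried converting the categorical variables to factors: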
data$sex <- as.factor(data$sex)
data$region <- as.factor(data$region)
Then I used the predict function:
predict(m2, list(age=40, sex=factor(male), bmi=30, children=2, smoker=factor(yes),
region=factor(northwest)), int="p", level=0.98)
All I get is errors. Please help.
Answer 0 (score: 0)
Instead of redefining the factors, just pass the factor levels as quoted strings in predict.
predict(m2, list(age=40, sex="male", bmi=30, children=2, smoker="yes",
region="northwest"), int="p", level=0.98)
# fit lwr upr
# 1 -1.978994 -9.368242 5.410254
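Some background on why the quoted version works, as far as I understand it: lm stores the levels of each factor predictor in the fitted object (the xlevels component), and predict matches quoted values in the new data against those stored levels. An unquoted name such as male is looked up as an R object, which is why factor(male) fails. Below is a minimal sketch with invented data; the names toy, fit, new_chr, and new_fac are hypothetical and this is not the insurance data. It also spells out interval = "prediction", for which int = "p" is a partial match.

```r
# Toy data, invented for illustration (not the insurance data)
set.seed(1)
toy <- data.frame(
  y      = rnorm(30),
  age    = sample(20:70, 30, replace = TRUE),
  smoker = factor(rep(c("no", "yes"), 15)),
  region = factor(rep(c("northeast", "northwest", "southeast"), each = 10))
)
fit <- lm(y ~ age + smoker + region, data = toy)

# Factor levels recorded at fit time
fit$xlevels

# Quoted levels in the new data are matched against the stored levels
new_chr <- data.frame(age = 40, smoker = "yes", region = "northwest")
p_chr <- predict(fit, new_chr, interval = "prediction", level = 0.98)

# An explicit factor carrying the full level set gives the same result
new_fac <- data.frame(
  age    = 40,
  smoker = factor("yes", levels = fit$xlevels$smoker),
  region = factor("northwest", levels = fit$xlevels$region)
)
p_fac <- predict(fit, new_fac, interval = "prediction", level = 0.98)
all.equal(p_chr, p_fac)

# int = "p" is a partial match for interval = "prediction"
p_short <- predict(fit, new_chr, int = "p", level = 0.98)
all.equal(p_chr, p_short)

# An unquoted name is looked up as an object, so this call errors
err_name <- try(factor(male_undefined), silent = TRUE)

# A level the model never saw cannot be predicted either
err_level <- try(
  predict(fit, data.frame(age = 40, smoker = "yes", region = "southwest")),
  silent = TRUE
)
c(inherits(err_name, "try-error"), inherits(err_level, "try-error"))
```

The last check matters for the sample data used in this answer: its region column has only three levels, so a prediction for "southwest" would fail there even with correct quoting.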
Data
dat <- structure(list(charges = c(1.37095844714667, -0.564698171396089,
0.363128411337339, 0.63286260496104, 0.404268323140999, -0.106124516091484,
1.51152199743894, -0.0946590384130976, 2.01842371387704, -0.062714099052421
), age = c(20L, 58L, 44L, 53L, 22L, 51L, 20L, 75L, 59L, 41L),
sex = structure(c(2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("female",
"male"), class = "factor"), bmi = c(25.3024309248682, 24.6058854935878,
25.7881406228236, 25.6707038267505, 24.0508191903124, 25.036135738485,
27.115755613237, 25.1674409043556, 24.1201634714689, 25.9469131749433
), children = c(4L, 1L, 5L, 1L, 1L, 4L, 0L, 0L, 3L, 4L),
smoker = c("no", "yes", "yes", "no", "no", "yes", "yes",
"yes", "yes", "no"), region = structure(c(1L, 2L, 2L, 3L,
1L, 2L, 3L, 3L, 3L, 2L), .Label = c("northeast", "northwest",
"southeast"), class = "factor")), row.names = c(NA, -10L), class = "data.frame")