我们如何在图上打印线的等式?
我有2个自变量,想要这样的等式:
y=mx1+bx2+c
where x1=cost, x2 =targeting
我可以绘制最佳拟合线,但如何在图上打印等式?
也许我不能在一个等式中打印出2个自变量,但我怎么做呢
至少y=mx1+c
?
这是我的代码:
fit=lm(Signups ~ cost + targeting)
plot(cost, Signups, xlab="cost", ylab="Signups", main="Signups")
abline(lm(Signups ~ cost))
答案 0 :(得分:13)
我试着稍微自动化输出:
fit <- lm(mpg ~ cyl + hp, data = mtcars)
summary(fit)
##Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.90833 2.19080 16.847 < 2e-16 ***
## cyl -2.26469 0.57589 -3.933 0.00048 ***
## hp -0.01912 0.01500 -1.275 0.21253
plot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "Miles per gallon")
abline(coef(fit)[1:2])
## rounded coefficients for better output
cf <- round(coef(fit), 2)
## sign check to avoid having plus followed by minus for negative coefficients
eq <- paste0("mpg = ", cf[1],
ifelse(sign(cf[2])==1, " + ", " - "), abs(cf[2]), " cyl ",
ifelse(sign(cf[3])==1, " + ", " - "), abs(cf[3]), " hp")
## printing of the equation
mtext(eq, 3, line=-2)
希望它有所帮助,
亚历
答案 1 :(得分:3)
您使用?text。此外,您不应该使用abline(lm(Signups ~ cost))
,因为这是一个不同的模型(请参阅我在CV上的答案:Is there a difference between 'controling for' and 'ignoring' other variables in multiple regression)。无论如何,请考虑:
set.seed(1)
Signups <- rnorm(20)
cost <- rnorm(20)
targeting <- rnorm(20)
fit <- lm(Signups ~ cost + targeting)
summary(fit)
# ...
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.1494 0.2072 0.721 0.481
# cost -0.1516 0.2504 -0.605 0.553
# targeting 0.2894 0.2695 1.074 0.298
# ...
windows();{
plot(cost, Signups, xlab="cost", ylab="Signups", main="Signups")
abline(coef(fit)[1:2])
text(-2, -2, adj=c(0,0), labels="Signups = .15 -.15cost + .29targeting")
}
答案 2 :(得分:0)
这是使用 tidyverse
包的解决方案。
关键是broom
包,它简化了提取模型数据的过程。例如:
fit1 <- lm(mpg ~ cyl, data = mtcars)
summary(fit1)
fit1 %>%
tidy() %>%
select(estimate, term)
结果
# A tibble: 2 x 2
estimate term
<dbl> <chr>
1 37.9 (Intercept)
2 -2.88 cyl
我编写了一个函数来使用 dplyr
提取和格式化信息:
get_formula <- function(object) {
object %>%
tidy() %>%
mutate(
term = if_else(term == "(Intercept)", "", term),
sign = case_when(
term == "" ~ "",
estimate < 0 ~ "-",
estimate >= 0 ~ "+"
),
estimate = as.character(round(abs(estimate), digits = 2)),
term = if_else(term == "", paste(sign, estimate), paste(sign, estimate, term))
) %>%
summarize(terms = paste(term, collapse = " ")) %>%
pull(terms)
}
get_formula(fit1)
结果
[1] " 37.88 - 2.88 cyl"
然后使用 ggplot2
绘制线条并添加标题
mtcars %>%
ggplot(mapping = aes(x = cyl, y = mpg)) +
geom_point() +
geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
labs(
x = "Cylinders", y = "Miles per Gallon",
caption = paste("mpg =", get_formula(fit1))
)
这种绘制线条的方法实际上只对可视化两个变量之间的关系有意义。正如@Glen_b 在评论中指出的那样,我们从建模 mpg
作为 cyl
(-2.88) 的函数得到的斜率与我们从建模 mpg
得到的斜率不匹配cyl
和其他变量的函数(-1.29)。例如:
fit2 <- lm(mpg ~ cyl + disp + wt + hp, data = mtcars)
summary(fit2)
fit2 %>%
tidy() %>%
select(estimate, term)
结果
# A tibble: 5 x 2
estimate term
<dbl> <chr>
1 40.8 (Intercept)
2 -1.29 cyl
3 0.0116 disp
4 -3.85 wt
5 -0.0205 hp
也就是说,如果您想为包含未出现在图中的变量的模型准确绘制回归线,请改用 geom_abline()
并使用 broom
获取斜率和截距包函数。据我所知,geom_smooth()
公式不能引用尚未映射为美学的变量。
mtcars %>%
ggplot(mapping = aes(x = cyl, y = mpg)) +
geom_point() +
geom_abline(
slope = fit2 %>% tidy() %>% filter(term == "cyl") %>% pull(estimate),
intercept = fit2 %>% tidy() %>% filter(term == "(Intercept)") %>% pull(estimate),
color = "blue"
) +
labs(
x = "Cylinders", y = "Miles per Gallon",
caption = paste("mpg =", get_formula(fit2))
)