[Image: columns and first rows of code]
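I have several different lines in the same ggplot2 plot, drawn with geom_point and geom_smooth(method="glm"). I want to determine the regression equation for each line, including the slope. I found a similar post, but I am still running into problems. My code is: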
native <- read.csv("native.gather.C4C5C6C7.csv")
ggplot(native, aes(x=YearsPostRelease, y=PercentNative, col=FieldType, linetype=FieldType)) +
geom_point(size=0.7) +
geom_smooth(data = native,
method ="glm", alpha = 0, show.legend = FALSE, linetype = 'solid') +
scale_x_continuous(breaks = c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55)) +
scale_y_continuous(limits = c(0, 100),
breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)) +
ggtitle("Percent Native Through Time")
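Thanks!
Answer 0 (score: 3)
Here is an approach that uses the lm_eqn function defined here. You may have run into problems because your data did not match the input the function expects. Since I don't have your data, I used mtcars to look at the relationship between mpg and wt within each cyl group. Note below how the function is customized for the relationship I am studying.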
lm_eqn <- function(df){
m <- lm(mpg ~ wt, df);
eq <- substitute(italic(mpg) == a + b %.% italic(wt)*","~~italic(r)^2~"="~r2,
list(a = format(unname(coef(m)[1]), digits = 2),  # unname() keeps the coefficient name out of the label
b = format(unname(coef(m)[2]), digits = 2),
r2 = format(summary(m)$r.squared, digits = 3)))
as.character(as.expression(eq));
}
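We can apply this to manually defined subsets of the data. There may be a smarter way to apply it to several groups more automatically, but because good label positions are hard to choose automatically, this may be enough.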
library(ggplot2); library(dplyr)
ggplot(mtcars, aes(x=wt, y=mpg,
col=as.factor(cyl), linetype=as.factor(cyl))) +
geom_point() +
geom_smooth(data = mtcars,
method ="glm", alpha = 0, show.legend = FALSE, linetype = 'solid') +
annotate("text", x = 3, y = 30, label = lm_eqn(mtcars %>% filter(cyl == 4)), parse = TRUE) +
annotate("text", x = 4.3, y = 20, label = lm_eqn(mtcars %>% filter(cyl == 6)), parse = TRUE) +
annotate("text", x = 4, y = 12, label = lm_eqn(mtcars %>% filter(cyl == 8)), parse = TRUE)
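If you want the numbers themselves rather than a label on the plot, a minimal sketch is to fit one model per group and tabulate the coefficients. The helper name fit_by_group is hypothetical (it is not part of ggplot2 or the linked post). With the default gaussian family, glm() gives the same coefficients as lm(), so these values should match the straight lines that geom_smooth(method = "glm") draws for each group.

```r
# Hypothetical helper: fit one linear model per group and collect the
# intercept, slope, and R^2 of each fit in a data frame.
fit_by_group <- function(df, group_col, formula) {
  # drop = TRUE skips factor levels that have no rows
  groups <- split(df, df[[group_col]], drop = TRUE)
  rows <- lapply(names(groups), function(g) {
    m <- lm(formula, data = groups[[g]])
    data.frame(
      group     = g,
      intercept = unname(coef(m)[1]),
      slope     = unname(coef(m)[2]),
      r2        = summary(m)$r.squared,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# mtcars ships with R, so this runs without external data
coefs <- fit_by_group(mtcars, "cyl", mpg ~ wt)
print(coefs)
```

For the data in the question, the equivalent call would presumably be fit_by_group(native, "FieldType", PercentNative ~ YearsPostRelease), assuming those column names match the CSV.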
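Answer 1 (score: 0)
Building on Jon's answer, you can customize the function for this data as follows.
Again, it is hard to know exactly what the underlying data looks like, but this assumes that your FieldType column has three levels: BSSFields, CSSFields, and DSSFields.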
# Load data
library(tidyverse)
native <- read.csv("native.gather.C4C5C6C7.csv")
# Define function
lm_eqn <- function(df){
m <- lm(PercentNative ~ YearsPostRelease, df);
eq <- substitute(italic(native) == a + b %.%
italic(YearsPostRelease)*","~~italic(r)^2~"="~r2,
list(a = format(unname(coef(m)[1]), digits = 2),  # unname() keeps the coefficient name out of the label
b = format(unname(coef(m)[2]), digits = 2),
r2 = format(summary(m)$r.squared, digits = 3)))
as.character(as.expression(eq));
}
# Plot data
ggplot(native, aes(x = YearsPostRelease,
y = PercentNative,
col = FieldType,
linetype = FieldType)) +
geom_point(size=0.7) +
geom_smooth(data = native,
method ="glm", alpha = 0, show.legend = FALSE, linetype = 'solid') +
scale_x_continuous(breaks = c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55)) +
scale_y_continuous(limits = c(0, 100),
breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)) +
annotate("text", x = 3, y = 30,
label = lm_eqn(native %>% filter(FieldType == "BSSFields")), parse = TRUE) +
annotate("text", x = 4, y = 20,
label = lm_eqn(native %>% filter(FieldType == "CSSFields")), parse = TRUE) +
annotate("text", x = 5, y = 10,
label = lm_eqn(native %>% filter(FieldType == "DSSFields")), parse = TRUE) +  # trailing + so the title is added to the plot
ggtitle("Percent Native Through Time")
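Note that the positions of these regression equations will need to be adjusted to the ranges of YearsPostRelease and PercentNative. Also, if FieldType contains more than three levels, you must add a corresponding annotate() call for each additional level, using that level's name.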
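One way to avoid writing an annotate() call per level is to build all the labels in a data frame and draw them with a single geom_text() layer. The following is only a sketch on mtcars: the helper name eqn_labels and the choice to place each label at the top-right corner of its group's data are assumptions, and the positions will likely still need manual tuning for real data.

```r
library(ggplot2)

# Hypothetical helper: one plotmath label per level of `group_col`,
# plus an x/y position for each label.
eqn_labels <- function(df, x_col, y_col, group_col) {
  groups <- split(df, df[[group_col]], drop = TRUE)
  rows <- lapply(names(groups), function(g) {
    d <- groups[[g]]
    m <- lm(d[[y_col]] ~ d[[x_col]])
    eq <- substitute(
      italic(y) == a + b %.% italic(x) * "," ~ ~italic(r)^2 ~ "=" ~ r2,
      list(a  = format(unname(coef(m)[1]), digits = 2),
           b  = format(unname(coef(m)[2]), digits = 2),
           r2 = format(summary(m)$r.squared, digits = 3))
    )
    data.frame(
      group = g,
      label = as.character(as.expression(eq)),
      # Assumed placement: top-right corner of this group's data range
      x = max(d[[x_col]]),
      y = max(d[[y_col]]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

labels <- eqn_labels(mtcars, "wt", "mpg", "cyl")

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = as.factor(cyl))) +
  geom_point() +
  geom_smooth(method = "glm", se = FALSE) +
  # A single text layer draws every group's equation
  geom_text(data = labels,
            aes(x = x, y = y, label = label, colour = group),
            parse = TRUE, hjust = 1, show.legend = FALSE)
```

Printing p draws the plot. For the question's data the call would presumably be eqn_labels(native, "YearsPostRelease", "PercentNative", "FieldType"), with colour = FieldType in the main aes().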