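I have seen some papers handle the residuals in an (imperfect) regression analysis in a smart way: they plot the distribution of the residuals perpendicular to the fitted line. Example images are in Figure 2 or Figure 5 (linear regression): https://www.nature.com/articles/nn.4538#results
My R example:
The example data are taken from: https://www.r-bloggers.com/simple-linear-regression-2/
Example data: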
alligator = data.frame(
lnLength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76,
3.50, 3.58, 4.19, 3.78, 3.71, 3.73, 3.78),
lnWeight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50,
3.58, 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)
)
Linear regression model:
reg <- lm(alligator$lnWeight ~ alligator$lnLength)
Scatter plot:
plot(alligator,
xlab = "Snout vent length (inches) on log scale",
ylab = "Weight (pounds) on log scale",
main = "Alligators in Central Florida"
)
Fitted line:
abline(reg,col = "black", lwd = 1)
Residual distribution (histogram):
hist(reg$residuals, 10, xaxt='n', yaxt='n', ann=FALSE)
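A note on "perpendicular": in data units, the signed perpendicular distance from a point to the fitted line equals the vertical residual divided by sqrt(1 + slope^2). So a histogram of perpendicular distances has the same shape as the histogram of ordinary residuals, only rescaled. Below is a minimal sketch of that relationship in base R. The names `fit`, `perp_resid`, `foot_x`, `foot_y`, and `perp_check` are made up for illustration, and the geometry holds in data coordinates only (on screen it also depends on the plot's aspect ratio).

```r
# Same data as above
alligator <- data.frame(
  lnLength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76,
               3.50, 3.58, 4.19, 3.78, 3.71, 3.73, 3.78),
  lnWeight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50,
               3.58, 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)
)

fit <- lm(lnWeight ~ lnLength, data = alligator)
b0 <- coef(fit)[["(Intercept)"]]
b1 <- coef(fit)[["lnLength"]]

# Ordinary (vertical) residuals and their perpendicular counterparts
vertical_resid <- residuals(fit)
perp_resid <- vertical_resid / sqrt(1 + b1^2)

# Cross-check: distance from each point to the foot of its perpendicular
foot_x <- (alligator$lnLength + b1 * (alligator$lnWeight - b0)) / (1 + b1^2)
foot_y <- b0 + b1 * foot_x
perp_check <- sqrt((alligator$lnLength - foot_x)^2 +
                   (alligator$lnWeight - foot_y)^2)

# Both routes should agree up to floating-point error
stopifnot(isTRUE(all.equal(unname(abs(perp_resid)), perp_check)))
```

Calling `hist(perp_resid, 10)` would then draw the rescaled distribution in the same way as the histogram above.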
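I would like to insert the histogram on top of the linear regression plot, as in the example images in Figure 2 or Figure 5 (linear regression): https://www.nature.com/articles/nn.4538#results
Thank you for your help.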
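Answer 0 (score: 2)
This will overlay the residual histogram on the main plot. You will still need to do some work to rotate it so that it sits perpendicular to the fitted line, as in the example you cite.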
library("ggplot2")
theme_set(theme_minimal())
alligator = data.frame(
lnLength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76,
3.50, 3.58, 4.19, 3.78, 3.71, 3.73, 3.78),
lnWeight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50,
3.58, 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)
)
reg <- lm(lnWeight ~ lnLength, data = alligator)
# make main plot, with best fit line (set se=TRUE to get error ribbon)
main_plot <- ggplot(alligator, aes(x=lnLength, y=lnWeight)) +
geom_point() + geom_smooth(method="lm", se=FALSE) +
scale_y_continuous(limits=c(0,7))
# create another plot, histogram of the residuals
added_plot <- ggplot(data.frame(resid=reg$residuals), aes(x=resid)) +
geom_histogram(bins=5) +
theme(panel.grid=element_blank(),
axis.text.y=element_blank(),
axis.text.x=element_text(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
axis.ticks.y=element_blank(),
axis.line.y=element_blank())
# turn the residual plot into a ggplotGrob() object
added_plot_grob <- ggplot2::ggplotGrob(added_plot)
# then add the residual plot to the main one as a custom annotation
main_plot + annotation_custom(grob=added_plot_grob,
xmin=4.0, xmax=4.35, ymin=1, ymax=5)
Then take a look at the documentation for ggplot2:: and gridExtra:: to work out the rotation. Hope this helps!
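One possible way to get the rotation, offered as a rough sketch and not a polished solution: put the histogram grob inside a `grid` viewport with an `angle`, then wrap it in one more plain `gTree` before handing it to `annotation_custom()`. The extra wrapper is there because, as far as I understand, `annotation_custom()` assigns its own viewport to the grob it receives, which would otherwise replace the rotated one. The angle formula assumes `coord_fixed(ratio = 1)`, so that the on-screen angle of the line equals `atan(slope)`. The placement values (`xmin`, `xmax`, `ymin`, `ymax`) are picked by eye, and names such as `rotation_deg` and `wrapped_grob` are made up for illustration.

```r
library(ggplot2)
library(grid)

alligator <- data.frame(
  lnLength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76,
               3.50, 3.58, 4.19, 3.78, 3.71, 3.73, 3.78),
  lnWeight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50,
               3.58, 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)
)

fit <- lm(lnWeight ~ lnLength, data = alligator)
slope <- coef(fit)[["lnLength"]]

# Main plot; coord_fixed() is an assumption that makes the angle below valid
main_plot <- ggplot(alligator, aes(x = lnLength, y = lnWeight)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  coord_fixed(ratio = 1)

# Bare histogram of the residuals, without axes or background
resid_plot <- ggplot(data.frame(resid = residuals(fit)), aes(x = resid)) +
  geom_histogram(bins = 5) +
  theme_void()

# Rotate the histogram's horizontal axis so it is perpendicular to the line
# (viewport angles are counter-clockwise, in degrees)
rotation_deg <- atan(slope) * 180 / pi - 90

rotated_grob <- gTree(
  children = gList(ggplotGrob(resid_plot)),
  vp = viewport(angle = rotation_deg)
)

# Outer wrapper: its viewport may be overwritten, the inner rotation is kept
wrapped_grob <- gTree(children = gList(rotated_grob))

# Placement in data coordinates, chosen by eye
combined_plot <- main_plot +
  annotation_custom(grob = wrapped_grob,
                    xmin = 3.9, xmax = 4.3, ymin = 3.4, ymax = 4.2)
```

Printing `combined_plot` draws the result. Without `coord_fixed()` the angle would have to be corrected for the panel's aspect ratio, and the placement box will probably need adjusting for other data.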