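Non-parametric regression in ggplot

Date: 2018-01-10 12:20:25

Tags: r plot regression

I am trying to plot some non-parametric regression curves with ggplot2. I managed to do it with the base plot() function: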

library(KernSmooth)
set.seed(1995)

X <- runif(100, -1, 1)
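G <- X[which (X > 0)]
L <- X[which (X < 0)]
u <- rnorm(100, 0 , 0.02)

# Note: L and G are shorter than X, so R recycles them in this expression
# (usually with a "longer object length" warning)
Y <- -exp(-20*L^2)-exp(-20*G^2)/(X+1)+u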


m <- lm(Y~X)
plot(Y~X)
abline(m, col="red")

m2 <- locpoly(X, Y, bandwidth = 0.05, degree = 0)
lines(m2$x, m2$y, col = "red")

m3 <- locpoly(X, Y, bandwidth = 0.15, degree = 0)
lines(m3$x, m3$y, col = "black")

m4 <- locpoly(X, Y, bandwidth = 0.3, degree = 0)
lines(m4$x, m4$y, col = "green")

legend("bottomright", legend = c("NW(bw=0.05)", "NW(bw=0.15)", "NW(bw=0.3)"),
       lty = 1, col = c("red", "black", "green"), cex = 0.5)


I have already managed to plot the linear regression with ggplot2, using this code:

ggplot(m, aes(x = X, y = Y)) +
  geom_point(shape = 1) +
  geom_smooth(method = lm,  se = FALSE) +
  theme(axis.line = element_line(colour = "black", size = 0.25))

But I don't know how to add the other lines to this plot, as in the base R plot. Any suggestions? Thanks in advance.

1 Answer:

Answer 0: (score: 1)

Solution

The shortest solution (though not the prettiest one) is to add the lines using the data= argument of the geom_line function:

ggplot(m, aes(x = X, y = Y)) +
     geom_point(shape = 1) +
     geom_smooth(method = lm,  se = FALSE) +
     theme(axis.line = element_line(colour = "black", size = 0.25)) +
     geom_line(data = as.data.frame(m2), mapping  = aes(x=x,y=y))
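
The same pattern extends to all three bandwidths by adding one geom_line layer per fit. The following is only a minimal sketch: the data frame d is a simplified stand-in for the question's data (an assumption, not the original generating formula), and the colours are set as fixed parameters, so no legend is produced.

```r
library(ggplot2)
library(KernSmooth)

set.seed(1995)
# Stand-in data (assumption): a simplified version of the question's setup
d <- data.frame(X = runif(100, -1, 1))
d$Y <- -exp(-20 * d$X^2) + rnorm(100, 0, 0.02)

bws  <- c(0.05, 0.15, 0.3)
cols <- c("red", "black", "green")

p <- ggplot(d, aes(x = X, y = Y)) +
  geom_point(shape = 1)

# One geom_line layer per bandwidth, each with its own data frame
for (i in seq_along(bws)) {
  fit <- as.data.frame(locpoly(d$X, d$Y, bandwidth = bws[i], degree = 0))
  p <- p + geom_line(data = fit, mapping = aes(x = x, y = y), colour = cols[i])
}

p
```

Because the colour is a constant outside aes(), ggplot2 has nothing to build a legend from, which is why this route is short but not pretty.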

Beautiful solution

To get nice colours and a legend, use

# Need to convert lists to data.frames, ggplot2 needs data.frames
m2 <- as.data.frame(m2)
m3 <- as.data.frame(m3)
m4 <- as.data.frame(m4)
# Colnames are used as names in ggplot legend. There's nothing wrong in using 
# column names which contain symbols or whitespace, you just have to use
# backticks, e.g. m2$`NW(bw=0.05)` if you want to work with them 
colnames(m2) <- c("x","NW(bw=0.05)")
colnames(m3) <- c("x","NW(bw=0.15)")
colnames(m4) <- c("x","NW(bw=0.3)")
# To give the different kernel regression estimates different colors, they must all be in one data frame.
# For merging to work, all x columns of m2-m4 must be the same!
# the merge function will automatically detect columns of same name
# (that is, x) in m2-m4 and use it to identify y values which belong
# together (to the same x value)
mm <- Reduce(x=list(m2,m3,m4), f=function(a,b) merge(a,b)) 
# The above line is the same as:
#  mm <- merge(m2,m3)
#  mm <- merge(mm,m4)

# ggplot needs data in long (tidy) format
mm <- tidyr::gather(mm, kernel, y, -x)

ggplot(m, aes(x = X, y = Y)) +
         geom_point(shape = 1) +
         geom_smooth(method = lm,  se = FALSE) +
         theme(axis.line = element_line(colour = "black", size = 0.25)) +
         geom_line(data = mm, mapping  = aes(x=x,y=y,color=kernel))

[Figure: plot with data points and all kernel regression estimates]
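
If the merge step feels fragile (it relies on all fits sharing exactly the same x grid), the long data frame can also be built directly by stacking the fits. This is a sketch under assumptions: d is a simplified stand-in for the question's data, and scale_colour_manual is used here only to reproduce the colours of the base R plot.

```r
library(ggplot2)
library(KernSmooth)

set.seed(1995)
# Stand-in data (assumption): a simplified version of the question's setup
d <- data.frame(X = runif(100, -1, 1))
d$Y <- -exp(-20 * d$X^2) + rnorm(100, 0, 0.02)

bws <- c(0.05, 0.15, 0.3)

# Fit one Nadaraya-Watson curve per bandwidth and stack them in long format
fits <- do.call(rbind, lapply(bws, function(bw) {
  fit <- locpoly(d$X, d$Y, bandwidth = bw, degree = 0)
  data.frame(x = fit$x, y = fit$y, kernel = paste0("NW(bw=", bw, ")"))
}))

p <- ggplot(d, aes(x = X, y = Y)) +
  geom_point(shape = 1) +
  geom_smooth(method = lm, se = FALSE) +
  geom_line(data = fits, mapping = aes(x = x, y = y, colour = kernel)) +
  # Same colours as in the base R legend
  scale_colour_manual(values = c("NW(bw=0.05)" = "red",
                                 "NW(bw=0.15)" = "black",
                                 "NW(bw=0.3)"  = "green"))

p
```

If you would rather keep the merge step, tidyr::pivot_longer(mm, -x, names_to = "kernel", values_to = "y") is the current replacement for gather(), which tidyr has marked as superseded.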

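Solution which will solve this for everyone and for eternity

The most elegant and most reproducible approach would be to create a custom stat for ggplot2 (see the stats already included in ggplot2).

The ggplot2 team has a vignette on this topic: Extending ggplot2. I have never undertaken such a heroic effort myself, though.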
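
For what it's worth, a custom stat for this case might look roughly like the sketch below, which follows the ggproto pattern from the Extending ggplot2 vignette. The names StatNW and stat_nw and the bandwidth default are made up for illustration (they are not part of ggplot2 or KernSmooth), and d is again a simplified stand-in for the question's data.

```r
library(ggplot2)
library(KernSmooth)

# Hypothetical stat: replaces each group's points by a Nadaraya-Watson fit
StatNW <- ggproto("StatNW", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales, bandwidth = 0.15) {
    fit <- locpoly(data$x, data$y, bandwidth = bandwidth, degree = 0)
    data.frame(x = fit$x, y = fit$y)
  }
)

# Hypothetical layer constructor, drawn with geom "line" by default
stat_nw <- function(mapping = NULL, data = NULL, geom = "line",
                    position = "identity", na.rm = FALSE, show.legend = NA,
                    inherit.aes = TRUE, bandwidth = 0.15, ...) {
  layer(
    stat = StatNW, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(bandwidth = bandwidth, na.rm = na.rm, ...)
  )
}

set.seed(1995)
# Stand-in data (assumption): a simplified version of the question's setup
d <- data.frame(X = runif(100, -1, 1))
d$Y <- -exp(-20 * d$X^2) + rnorm(100, 0, 0.02)

# Mapping colour to a constant string inside aes() creates the legend entries
p <- ggplot(d, aes(x = X, y = Y)) +
  geom_point(shape = 1) +
  stat_nw(aes(colour = "NW(bw=0.05)"), bandwidth = 0.05) +
  stat_nw(aes(colour = "NW(bw=0.15)"), bandwidth = 0.15) +
  stat_nw(aes(colour = "NW(bw=0.3)"),  bandwidth = 0.3)

p
```

The advantage of this design is that the fit is recomputed from whatever data the plot receives, so no intermediate data frames (m2 to m4, mm) have to be kept in sync with the plot.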