在R中创建多个Bias-Variance权衡图

时间:2018-02-20 14:58:08

标签: r plot model statistics variance

我对R比较陌生。我想知道如何创建以下图形。我被困了两个多小时。

enter image description here

假设红线 - 真实关系 - 是y = x ^ 2。 假设我想将100个线性模型拟合到100个随机样本(蓝线)。

我该怎么做?到目前为止,这就是我所拥有的:

# create the true relationship
f <- function(x) x^2  # true model
x <- seq(0, 1, by = 0.01)
y <- f(x)

# plot the true function
plot(x, y, type = "l", col = "red", ylim = c(-0.2, 1.2), lwd = 4)

# fit 100 models
set.seed(1)
for (i in 1:100)
{
    errors <- rnorm(n, 0, sigma)       # random errors, have standard deviation sigma
    obs_y <- f(obs_x) + errors         # observed y = true_model + error
    model <- lm(obs_y ~ obs_x)         # fit a linear model to the observed values
    points(obs_x[i], mean(obs_y[i]), col = "green")       # mean values
    abline(model, col = "purple")    # plot the fitted model
}

创建了这个:

enter image description here

其中绿点肯定是关闭的...... 而且我没有黑点......

谢谢!

2 个答案:

答案 0 :(得分:3)

以下是经过多次调整后的代码:

f <- function(x) x^2
x <- seq(0, 1, by = 0.05)
n <- length(x)
sigma <- 0.05
y <- f(x)

plot(x, y, type = "l", col = "red", ylim = c(-0.2, 1.2), lwd = 2)

fitted <- ys <- matrix(0, ncol = n, nrow = 100)

set.seed(1)
for (i in 1:100)
{
  errors <- rnorm(n, 0, sigma)
  ys[i, ] <- obs_y <- f(x) + errors
  model <- lm(obs_y ~ x)
  fitted[i, ] <- fitted(model)
  abline(model, col = "purple", lwd = 0.1)
}

points(x = rep(x, each = 100), y = ys, cex = 0.1)
points(x = x, y = colMeans(fitted), col = 'green', cex = 0.3)

enter image description here

答案 1 :(得分:3)

使用ggplot,例如

library(ggplot2)

x <- seq(0, 1, 0.1)
y <- x^2

dat <- as.data.frame(do.call(rbind, lapply(1:100, function(i){
  y_err <- y + rnorm(1, 0, 0.06)
  l <- lm(y_err ~ x)$coefficients
  cbind(samp = i, intercept = l[1], slope = l[2], t(x * l[2] + l[1]), t(y_err))
})), row.names = 1:100)

ggplot() +  
  geom_abline(aes(intercept = dat$intercept, slope = dat$slope)) + 
  geom_point(aes(x = rep(x, each = 100), y = unlist(dat[, 15:25])), alpha = 0.5) +
  geom_line(aes(x = x, y = y), color = "red", lwd = 2) +
  geom_point(aes(x = x, y = colMeans(dat[, 4:14])), color = "green")

enter image description here