Question

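So basically, I generated the random variables X and Y 1000 times and created a data frame Data = data.frame(x, y) so that I could smooth it with a spline function. Now I want to recreate exactly that, but B = 1000 times, and plot the smoothed functions (B = 1, ..., 1000) to compare their variability.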

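library(splines)  # provides ns()

simulation = function(d){
  X = runif(1000, 0, 10)
  Y = rpois(1000, lambda = 2*X + 0.2*X*sin(X))
  Data = data.frame(X, Y)  # lm() expects a data frame, not a matrix
  smoothing_sim = lm(Y ~ ns(x = X, df = d), data = Data)
  new_x2 = seq(min(X), max(X), length.out = 100)

  adjusted_sim = predict(object = smoothing_sim, newdata = data.frame(X = new_x2))
  return(data.frame(new_x2, adjusted_sim))  # return the fitted values, not the model object
}
# replicate() needs a call, not the bare function; d = 5 is an arbitrary choice
simulation2 = replicate(n = 1000, simulation(d = 5), simplify = FALSE)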

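I'm not sure whether my approach is correct, and I'm also not sure how to plot the functions after the simulation. Would anyone care to comment? Thanks!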

Answer 1

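If you use ggplot, you can do the smoothing inside geom_smooth. Because ggplot needs long-format data, list columns and tidyr::unnest are a useful alternative to replicate, though there are many ways to do the data-generation step.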

library(tidyverse)
set.seed(47)
# A nice theme with a white background to help make low-opacity objects visible
theme_set(hrbrthemes::theme_ipsum_tw())

df <- tibble(replication = seq(100),    # scaled down a little
             x = map(replication, ~runif(1000, 0, 10)), 
             y = map(x, ~rpois(1000, lambda = 2*.x + 0.2*.x*sin(.x)))) %>% 
    unnest(c(x, y))    # name the list columns explicitly (tidyr >= 1.0 warns otherwise)

# base plot with aesthetics and points
point_plot <- ggplot(df, aes(x, y, group = replication)) + 
    geom_point(alpha = 0.01, stroke = 0) 

point_plot + 
    geom_smooth(method = lm, formula = y ~ splines::ns(x), size = .1, se = FALSE)

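Controlling the alpha of the lines really helps with this kind of plot, but the alpha parameter of geom_smooth controls the opacity of the standard-error ribbon. To set the alpha of the line itself, use stat_smooth with geom = 'line':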

point_plot + 
    stat_smooth(geom = 'line', method = lm, formula = y ~ splines::ns(x), 
                color = 'blue', alpha = 0.03)

Right now the smoother isn't doing much more than OLS here. To make it more flexible, set the degrees of freedom:

point_plot + 
    stat_smooth(geom = 'line', method = lm, formula = y ~ splines::ns(x, df = 5), 
                color = 'blue', alpha = 0.03)

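Given that the response is Poisson, it may be worth extending this to a Poisson regression with glm. The biggest effect here is that y doesn't go all the way down to 0 when x is small: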

point_plot + 
    stat_smooth(geom = 'line', method = glm, method.args = list(family = 'poisson'), 
                formula = y ~ splines::ns(x, df = 5), color = 'blue', alpha = 0.03)
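The same comparison can be checked outside ggplot on a single simulated data set. The following is only a rough sketch: the object names (dat, ols_fit, pois_fit, small_x) and the choice of evaluation points are assumptions made for illustration. It relies on the fact that a Poisson glm uses a log link by default, so its fitted means are always strictly positive.

```r
library(splines)  # ns(); ships with R

set.seed(47)
x <- runif(1000, 0, 10)
y <- rpois(1000, lambda = 2 * x + 0.2 * x * sin(x))
dat <- data.frame(x, y)

# Same spline basis, two different model families
ols_fit  <- lm(y ~ ns(x, df = 5), data = dat)
pois_fit <- glm(y ~ ns(x, df = 5), family = poisson, data = dat)

# Illustrative points near the lower end of the x range (an assumption)
small_x <- data.frame(x = c(0.01, 0.1, 0.5))

ols_pred  <- predict(ols_fit, newdata = small_x)
# type = "response" returns fitted means rather than the linear predictor
pois_pred <- predict(pois_fit, newdata = small_x, type = "response")

print(cbind(small_x, ols = ols_pred, poisson = pois_pred))
```

Printing the two columns side by side shows how the fits differ near x = 0 for a given seed.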

Adjust further as needed.
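If you would rather stay with base R and replicate, something along these lines might work. It is a sketch, not a definitive implementation: simulate_fit is a hypothetical helper name, d = 5 and B = 200 are arbitrary choices, and every replicate is predicted on one fixed grid (rather than on each sample's own min/max) so that the curves can be stored as matrix columns and compared pointwise.

```r
library(splines)  # ns(); ships with R

# Hypothetical helper: simulate one data set, fit the spline, and return
# the fitted values on a fixed grid shared by all replicates.
simulate_fit <- function(d, n = 1000, grid = seq(0, 10, length.out = 100)) {
  x <- runif(n, 0, 10)
  y <- rpois(n, lambda = 2 * x + 0.2 * x * sin(x))
  fit <- lm(y ~ ns(x, df = d), data = data.frame(x, y))
  unname(predict(fit, newdata = data.frame(x = grid)))
}

set.seed(47)
B <- 200                                  # scaled down from 1000 (assumption)
grid <- seq(0, 10, length.out = 100)

# replicate() evaluates the call B times; equal-length vectors are
# simplified to a matrix with one column per replicate (100 x B)
fits <- replicate(B, simulate_fit(d = 5, grid = grid))

# One semi-transparent line per replicate
matplot(grid, fits, type = "l", lty = 1, col = rgb(0, 0, 1, 0.05),
        xlab = "x", ylab = "fitted y")

# Pointwise standard deviation of the fitted curves across replicates
pointwise_sd <- apply(fits, 1, sd)
```

Plotting pointwise_sd against grid gives a simple numerical summary of the variability that the overlaid lines show visually.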

How to replicate a function 1000 times
