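The question is: is this the correct way to specify knots for a smoothing spline with gam in mgcv?
The confusing part is that the vignette says k is the dimension of the basis used to represent the smooth term.
(Previously I thought that with the 'cr' setting the basis dimension was 3. After reading pp. 149-150 of GAM: An Introduction with R, it seems that gam uses a set of k basis functions to write the cubic regression spline.)
However, the post referenced below shows that k is actually the number of knots. The following code verifies this: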
# reference
# https://stackoverflow.com/questions/40056566/mgcv-how-to-set-number-and-or-locations-of-knots-for-splines
library(mgcv)
## toy data
set.seed(0); x <- sort(rnorm(400, 0, pi)) ## note, my x are not uniformly sampled
set.seed(1); e <- rnorm(400, 0, 0.4)
y0 <- sin(x) + 0.2 * x + cos(abs(x))
y <- y0 + e
## fitting natural cubic spline
cr_fit <- gam(y ~ s(x, bs = 'cr', k = 20))
cr_knots <- cr_fit$smooth[[1]]$xp ## extract knots locations
par(mfrow = c(1,2))
plot(x, y, col = "blue", main = "natural cubic spline")
lines(x, cr_fit$linear.predictors, col = 2, lwd = 2)
abline(v = cr_knots, lty = 2)
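The plot shows the knots, but the same point can be checked numerically. Below is a minimal sketch on the same toy data, assuming the `xp` (knot locations) and `bs.dim` (basis dimension) fields of the fitted smooth object; the names `k`, `fit` and `sm` are introduced here only for illustration.

```r
library(mgcv)

## same toy data as above
set.seed(0); x <- sort(rnorm(400, 0, pi))
set.seed(1); e <- rnorm(400, 0, 0.4)
y <- sin(x) + 0.2 * x + cos(abs(x)) + e

k <- 20
fit <- gam(y ~ s(x, bs = "cr", k = k))
sm <- fit$smooth[[1]]

length(sm$xp)      ## number of knots placed by the 'cr' basis
sm$bs.dim          ## basis dimension requested through k
length(coef(fit))  ## intercept plus the smooth's coefficients after the identifiability constraint
```

If the two readings of k are consistent for bs = 'cr', the first two values should agree with each other and with k.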
So, to fit a smoothing spline, should I specify the knots manually through the knots argument of gam? My attempt is below:
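## fitting a natural cubic spline with a knot at every data point (smoothing-spline style)
## knots is looked up by covariate name, so the list is named; an unnamed list(x) is not matched to x
cr_fit <- gam(y ~ s(x, bs = 'cr', k = length(x)), knots = list(x = x))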
cr_knots <- cr_fit$smooth[[1]]$xp ## extract knots locations
## summary plot
par(mfrow = c(1,2))
plot(x, y, col = "blue", main = "natural cubic spline")
lines(x, cr_fit$linear.predictors, col = 2, lwd = 2)
abline(v = cr_knots, lty = 2)
plot(x,cr_knots)
cr_fit$sp
Is this understanding correct?
If so, how do I implement the smoothing spline approach using gam in mgcv?