Question

I have the following variables:

prod: positive integer

tenure: positive number

cohort: factor

Here is some simulated data with these specifications.

set.seed(123)
my_data <- data.frame(prod   = rnbinom(10000, mu = 2.5, size = 1.5),
                      tenure = rexp(10000),
                      cohort = factor(sample(2011:2014, size = 10000, replace = TRUE,
                                             prob = c(0.17, 0.49, 0.26, 0.08))))

I fit the following model with mgcv::gam:

library(mgcv)
mod <- gam(prod ~ s(tenure, by = cohort) + cohort, data = my_data, family = nb())

Get the predictions and their standard errors:

preds   <- predict(mod, se.fit = TRUE)
my_data <- data.frame(my_data,
                      mu   = exp(preds$fit),
                      low  = exp(preds$fit - 1.96 * preds$se.fit),
                      high = exp(preds$fit + 1.96 * preds$se.fit))
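The back-transformation above relies on predict() returning values on the link scale (the default type = "link" for a gam fit), with exp() being the inverse of the log link used by nb(). A minimal sketch to check that assumption, reusing the simulated data and model from the question (the names mu_manual, mu_response and same are introduced here for illustration only):

```r
library(mgcv)

# Same simulated data and model as in the question
set.seed(123)
my_data <- data.frame(prod   = rnbinom(10000, mu = 2.5, size = 1.5),
                      tenure = rexp(10000),
                      cohort = factor(sample(2011:2014, size = 10000, replace = TRUE,
                                             prob = c(0.17, 0.49, 0.26, 0.08))))
mod <- gam(prod ~ s(tenure, by = cohort) + cohort, data = my_data, family = nb())

# Default predictions are on the link (log) scale; exp() maps them to the mean
mu_manual   <- exp(as.numeric(predict(mod)))
# Predictions requested directly on the response scale
mu_response <- as.numeric(predict(mod, type = "response"))

# The two should agree up to numerical tolerance
same <- all.equal(mu_manual, mu_response)
print(same)
```

Building the interval on the link scale and exponentiating afterwards, as done above, also keeps both limits positive.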

Getting a smoothed prediction of mu for each cohort with ggplot2 is simple enough (this also forces the smoother to take positive values):

library(ggplot2)
library(splines)
my_plot <-
  ggplot(my_data, aes(x = tenure, y = mu, color = cohort)) +
  geom_smooth(method      = "glm",
              formula     = y ~ ns(x, 3),
              # current ggplot2 passes the family to glm() via method.args
              method.args = list(family = "quasipoisson"),
              fill        = NA)

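But I would like to have smooth confidence bands from the GAM. How do I add them?

Not answers

Remove fill = NA. No. Those confidence bands would be infinitesimally narrow, because the predictions over tenure are exactly the same within a cohort.
Call geom_ribbon(aes(x = tenure, ymin = low, ymax = high)). No. That gives me a super wiggly, non-smooth confidence band.
Use package:ggvis! Please, no ggvis answers unless ggplot2 cannot do this. My current plotting framework is ggplot2, and I am sticking with it for now unless I have to switch to make this plot.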

Answer 1

This works for me.

require(ggplot2)
require(mgcv)

set.seed(123)
my_data <- data.frame(prod   = rnbinom(10000, mu = 2.5, size = 1.5),
                      tenure = rexp(10000),
                      cohort = factor(sample(2011:2014, size = 10000, replace = TRUE,
                                             prob = c(0.17, 0.49, 0.26, 0.08))))
mod <- gam(prod ~ s(tenure, by = cohort) + cohort, data = my_data, family = nb())
preds   <- predict(mod, se.fit = TRUE)
my_data <- data.frame(my_data,
                      mu   = exp(preds$fit),
                      low  = exp(preds$fit - 1.96 * preds$se.fit),
                      high = exp(preds$fit + 1.96 * preds$se.fit))

ggplot(my_data, aes(x = tenure, y = prod, color = cohort)) +
  geom_point() + 
  geom_smooth(aes(ymin = low, ymax = high, y = mu), stat = "identity")
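An alternative that is sometimes used, sketched here under stated assumptions rather than as part of the answer above: predict on a regular grid of tenure values for each cohort instead of at the observed data points, then draw the band with geom_ribbon and the mean with geom_line. The names grid, link_preds, inv_link and band_plot, the choice of 200 grid points, and the alpha value are illustrative choices, not taken from the original post.

```r
library(mgcv)
library(ggplot2)

# Same simulated data and model as above
set.seed(123)
my_data <- data.frame(prod   = rnbinom(10000, mu = 2.5, size = 1.5),
                      tenure = rexp(10000),
                      cohort = factor(sample(2011:2014, size = 10000, replace = TRUE,
                                             prob = c(0.17, 0.49, 0.26, 0.08))))
mod <- gam(prod ~ s(tenure, by = cohort) + cohort, data = my_data, family = nb())

# Regular grid: 200 tenure values (arbitrary choice) crossed with every cohort
grid <- expand.grid(
  tenure = seq(min(my_data$tenure), max(my_data$tenure), length.out = 200),
  cohort = factor(levels(my_data$cohort), levels = levels(my_data$cohort))
)

# Predict on the link scale, then back-transform with the model's inverse link
link_preds <- predict(mod, newdata = grid, se.fit = TRUE)
inv_link   <- family(mod)$linkinv
grid$mu    <- inv_link(as.numeric(link_preds$fit))
grid$low   <- inv_link(as.numeric(link_preds$fit - 1.96 * link_preds$se.fit))
grid$high  <- inv_link(as.numeric(link_preds$fit + 1.96 * link_preds$se.fit))

# Ribbon for the approximate 95% band, line for the fitted mean
band_plot <- ggplot(grid, aes(x = tenure, color = cohort, fill = cohort)) +
  geom_ribbon(aes(ymin = low, ymax = high), alpha = 0.2, color = NA) +
  geom_line(aes(y = mu))
print(band_plot)
```

Because the grid is evenly spaced and has one row per tenure value and cohort, the band is drawn from the GAM's own standard errors without a second smoother on top.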

