I am comparing the output of a GAM fitted through ggplot2's geom_smooth with facet_grid against a GAM fitted directly with mgcv::gam using a "by" factor (visualised with visreg). Data and code follow:
library(dplyr)
set.seed(1)
# Random 0/1 response added to iris, unrelated to the predictors
dat <- iris %>% mutate(response = sample(rep(c(0, 1), length.out = 150 / 2), 150, replace = TRUE))
# Just the output from geom_smooth
library(ggplot2)
ggplot(dat, aes(Sepal.Length, response)) +
  geom_point() +
  geom_smooth(method = "gam", formula = y ~ s(x, bs = "cs"), method.args = list(family = binomial)) +
  facet_grid(. ~ Species)
# Now performing the GAM through mgcv::gam, specifying by = Species
library(mgcv)
# Stored as `fit` so the object does not mask the gam() function
fit <- gam(response ~ s(Sepal.Length, bs = "cs", by = Species), family = binomial(), data = dat)
# Comparing the two different outputs
library(visreg)
visreg(fit, "Sepal.Length", by = "Species", scale = "response", gg = TRUE) +
  guides(color = "none") +
  geom_smooth(data = dat, aes(Sepal.Length, response),
              method = "gam", formula = y ~ s(x, bs = "cs"), method.args = list(family = binomial),
              color = "red", fill = "green")
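Basically, I think what is happening is that the smooths from mgcv::gam are based on some kind of "imputed" data that does not actually exist for each Species level, while the setup in geom_smooth() appears to avoid this. Does anyone know how to fix this so that the outputs of geom_smooth and mgcv::gam are identical?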
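For context, geom_smooth() computes its statistic separately for each panel and group, so with facet_grid(. ~ Species) each panel gets its own model fitted only to that species' rows. A minimal sketch of what I assume the faceted call is doing (fits_by_species and preds_by_species are names introduced here purely for illustration):

```r
library(mgcv)

set.seed(1)
dat <- transform(
  iris,
  response = sample(rep(c(0, 1), length.out = 75), 150, replace = TRUE)
)

# One independent binomial GAM per species, each fitted to its own subset.
# Assumption: this mirrors what a faceted geom_smooth(method = "gam") does.
fits_by_species <- lapply(split(dat, dat$Species), function(d) {
  gam(response ~ s(Sepal.Length, bs = "cs"), family = binomial, data = d)
})

# Fitted probabilities for each species at its own observed values.
preds_by_species <- lapply(fits_by_species, predict, type = "response")

# Effective degrees of freedom of each separate fit.
sapply(fits_by_species, function(f) sum(f$edf))
```

Each of these fits builds its spline basis and chooses its smoothing parameter from 50 rows only, which may be one source of the differences from a single model fitted to all 150 rows.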
Edit:
Following user20650's answer, the code has been updated to:
library(mgcv)
# Species added as a parametric term alongside the by-factor smooths
fit <- gam(response ~ Species + s(Sepal.Length, bs = "cs", by = Species), family = binomial(), data = dat)
library(visreg)
visreg(fit, "Sepal.Length", by = "Species", scale = "response", gg = TRUE) +
  guides(color = "none") +
  geom_smooth(data = dat, aes(Sepal.Length, response),
              method = "gam", formula = y ~ s(x, bs = "cs"), method.args = list(family = binomial),
              color = "red", fill = "green",
              fullrange = TRUE)
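As far as I understand from the mgcv documentation (?gam.models), smooths with a factor by variable are each subject to a centering constraint, so differences in mean level between species have to be carried by a parametric Species term. A small sketch comparing the parametric coefficients of the two formulations (fit_no_main, fit_main and parametric_coefs are illustrative names, not from the answer):

```r
library(mgcv)

set.seed(1)
dat <- transform(
  iris,
  response = sample(rep(c(0, 1), length.out = 75), 150, replace = TRUE)
)

# By-factor smooths only: a single shared intercept.
fit_no_main <- gam(response ~ s(Sepal.Length, bs = "cs", by = Species),
                   family = binomial, data = dat)

# By-factor smooths plus a Species main effect: one level shift per species.
fit_main <- gam(response ~ Species + s(Sepal.Length, bs = "cs", by = Species),
                family = binomial, data = dat)

# Keep only the coefficients that do not belong to a smooth term.
parametric_coefs <- function(fit) {
  cf <- coef(fit)
  cf[!grepl("^s\\(", names(cf))]
}

parametric_coefs(fit_no_main)
parametric_coefs(fit_main)
```

The first model has only an intercept outside the smooths, while the second also has one contrast for each non-reference species.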
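As the plot above shows, there are still subtle differences between the two methods (mostly in the confidence intervals). The differences are more apparent if we look at, for example, set.seed(100):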
library(dplyr)
set.seed(100)
dat <- iris %>% mutate(response = sample(rep(c(0, 1), length.out = 150 / 2), 150, replace = TRUE))
library(mgcv)
fit <- gam(response ~ Species + s(Sepal.Length, bs = "cs", by = Species), family = binomial(), data = dat)
library(visreg)
visreg(fit, "Sepal.Length", by = "Species", scale = "response", gg = TRUE) +
  guides(color = "none") +
  geom_smooth(data = dat, aes(Sepal.Length, response),
              method = "gam", formula = y ~ s(x, bs = "cs"), method.args = list(family = binomial),
              color = "red", fill = "green",
              fullrange = TRUE)
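Can anyone explain the difference between the two approaches, and how to produce the same output from mgcv::gam as from geom_smooth(), and vice versa (for both fullrange = TRUE and fullrange = FALSE)?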
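To put numbers on the comparison rather than judging it from the plot, one option is to predict from both kinds of model on the same grid. The sketch below assumes that the per-panel fits of geom_smooth() can be reproduced by fitting on each species subset, and restricts the grid to each species' observed range (analogous to fullrange = FALSE). compare_species() is a hypothetical helper written for this post, not part of mgcv, ggplot2 or visreg.

```r
library(mgcv)

set.seed(100)
dat <- transform(
  iris,
  response = sample(rep(c(0, 1), length.out = 75), 150, replace = TRUE)
)

# Single model with a Species main effect and by-factor smooths.
fit_by <- gam(response ~ Species + s(Sepal.Length, bs = "cs", by = Species),
              family = binomial, data = dat)

# Hypothetical helper: fitted probabilities from the by-factor model and from
# a GAM fitted to one species only, evaluated on a shared grid.
compare_species <- function(sp, n = 50) {
  d <- dat[dat$Species == sp, ]
  fit_sub <- gam(response ~ s(Sepal.Length, bs = "cs"),
                 family = binomial, data = d)
  grid <- data.frame(
    Sepal.Length = seq(min(d$Sepal.Length), max(d$Sepal.Length), length.out = n),
    Species = factor(sp, levels = levels(dat$Species))
  )
  data.frame(
    grid,
    p_by = as.numeric(predict(fit_by, newdata = grid, type = "response")),
    p_subset = as.numeric(predict(fit_sub, newdata = grid, type = "response"))
  )
}

cmp <- do.call(rbind, lapply(levels(dat$Species), compare_species))

# Largest absolute difference in fitted probability, per species.
tapply(abs(cmp$p_by - cmp$p_subset), cmp$Species, max)
```

This only compares the fitted curves. The confidence bands would need a separate comparison, since they depend on how each tool builds the interval from the standard errors.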