Working through the plyr tutorial, I came across the following. Setup:
library(plyr)  # provides ddply() and the baseball data set
b2 <- ddply(baseball, "id", transform, cyear = year - min(year) + 1)
b2 <- ddply(b2, "id", transform, career = (cyear - 1) / max(cyear))
bruth <- subset(b2, id == "ruthba01")
# Could we model that as two straight lines?
bruth$p <- (bruth$career - 0.5) * 100
Now for some models:
mod <- lm(g ~ p + p:I(p > 0), data = bruth)
How does that differ from this one?
mod <- lm(g ~ p + I(p > 0), data = bruth)
When I inspect
mod$model
both cases give the same columns with the same numbers, yet the regression coefficients are completely different...
What does this notation mean?
Answer 0: (score: 1)
Run the following code to see the effect of the different models:
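# Fit both models under distinct names so they can be compared
# (assumed: mod is the first formula in the question, mod2 the second)
mod  <- lm(g ~ p + p:I(p > 0), data = bruth)
mod2 <- lm(g ~ p + I(p > 0), data = bruth)
with(bruth, plot(p, predict(mod), type="l" ) )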
with(bruth, points(p, g, col="red") )
with(bruth, lines(p, predict(mod2), lty=3, lwd=2, col="red") )
title(main="Different uses of I() and interaction")
It highlights how the choice of an (arbitrary?) join point affects the output of a piecewise regression.
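As for why `mod$model` looks the same in both cases: that element is the model frame, which only lists the variables named in the formula, not the columns actually used in the fit. Comparing the design matrices with `model.matrix()` should make the difference visible. Below is a minimal sketch on synthetic data; the data frame `d` and the names `slope_change` and `level_shift` are made up for illustration and are not part of the tutorial.

```r
# Synthetic stand-in for bruth (hypothetical data, not the baseball set)
set.seed(1)
d <- data.frame(p = seq(-50, 45, by = 5))
d$g <- 100 + 2 * d$p - 3 * pmax(d$p, 0) + rnorm(nrow(d), sd = 5)

# Same two formulas as in the question, under illustrative names
slope_change <- lm(g ~ p + p:I(p > 0), data = d)
level_shift  <- lm(g ~ p + I(p > 0), data = d)

# The model frames hold the same variables: g, p and I(p > 0)
names(slope_change$model)
names(level_shift$model)

# The design matrices differ in their third column
X1 <- model.matrix(slope_change)
X2 <- model.matrix(level_shift)
colnames(X1)
colnames(X2)

# In the interaction model the extra column is p where p > 0, else 0,
# so the fitted line changes slope at p = 0 but stays continuous
head(cbind(p = d$p, extra = X1[, 3]))

# In the additive model the extra column is a 0/1 indicator,
# so the fitted line keeps one slope and jumps at p = 0
head(cbind(p = d$p, extra = X2[, 3]))

coef(slope_change)
coef(level_shift)
```

Read this way, `p + p:I(p > 0)` fits two joined straight lines (a change in slope), while `p + I(p > 0)` fits two parallel lines offset by a step (a change in level), which would explain why the coefficients differ even though the model frames match.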