What explains the difference between the following two aov models:
a = c(0.04875,0.13725,0.28350,0.50975,0.77425,0.94700,0.05325,0.14050,0.29725,0.51525,0.79000,0.95400,0.04625,0.15250,0.29000,0.53300,0.79825,0.95225,0.05025,0.14625,0.28800,0.52625,0.78200,0.95925,0.04700,0.14225,0.30325,0.53500,0.79325,0.95875,0.04775,0.13850,0.28675,0.54250,0.78300,0.95175,0.05150,0.12725,0.30175,0.54725,0.79475,0.96275,0.05375,0.14100,0.30050,0.53275,0.78100,0.96175,0.05450,0.15300,0.29650,0.52850,0.80100,0.95675,0.05425,0.13975,0.30875,0.56025,0.80575,0.96100,0.05100,0.15350,0.31175,0.53300,0.78900,0.96000,0.04650,0.13525,0.29600,0.53625,0.78475,0.96375,0.05375,0.13900,0.29600,0.53725,0.78700,0.95800,0.05075,0.14350,0.29225,0.54525,0.80275,0.95800,0.05050,0.13200,0.29850,0.52700,0.80525,0.96150,0.05150,0.14050,0.29450,0.54375,0.79450,0.96375,0.05375,0.13525,0.30475,0.55250,0.79425,0.96025,0.04950,0.14500,0.29425,0.52250,0.78475,0.95650,0.05225,0.14425,0.29225,0.53150,0.80425,0.95375)
b = c(4,4,4,4,4,4,6,6,6,6,6,6,8,8,8,8,8,8,10,10,10,10,10,10,12,12,12,12,12,12,14,14,14,14,14,14,16,16,16,16,16,16,18,18,18,18,18,18,20,20,20,20,20,20,22,22,22,22,22,22,24,24,24,24,24,24,26,26,26,26,26,26,28,28,28,28,28,28,30,30,30,30,30,30,32,32,32,32,32,32,34,34,34,34,34,34,36,36,36,36,36,36,38,38,38,38,38,38,40,40,40,40,40,40)
c = c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6)
summary(lm(a~b*as.factor(c)))
summary(lm(a~b*c))
When as.factor is used, is c treated as non-ordinal?
Answer 0: (score: 2)
In both cases you are modeling a as a function of b, c, and their interaction.
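When you coerce c to a factor, dummy variables are computed for the distinct values of c (strictly, for the levels of c, but here every level is present, so these are the same). With R's default treatment contrasts, the first level serves as the reference and each remaining level gets its own dummy column. The interaction being explored is therefore between b and each individual value of c.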
Otherwise, the interaction being explored is the one between two numeric variables.
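A minimal sketch of the two codings, using simulated data rather than the question's values (the response `a` and the name `grp`, which plays the role of `c`, are assumptions made for illustration). It also touches on the question as asked: `as.factor()` returns an unordered factor, so the levels are treated as plain categories.

```r
# Simulated stand-in data; not the values from the question.
set.seed(1)
b   <- rep(seq(4, 40, by = 2), each = 6)   # 19 values of b, 6 rows each
grp <- rep(1:6, times = 19)                # plays the role of `c`
a   <- plogis(grp - 3.5) + rnorm(length(b), sd = 0.01)

fit_factor  <- lm(a ~ b * as.factor(grp))  # one slope adjustment per level
fit_numeric <- lm(a ~ b * grp)             # a single b:grp product term

is.ordered(as.factor(grp))   # FALSE: as.factor() gives an unordered factor
length(coef(fit_factor))     # intercept, b, 5 dummies, 5 b:dummy terms
length(coef(fit_numeric))    # intercept, b, grp, b:grp
```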
If the values of c were spread further apart, the difference would probably be more obvious, e.g.
c = c(1, 17, 2, 5, 131, 1, 4, 5, 2, 11, 17, 7, 1, 1, 17, .... etc)
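To illustrate that point, here is a rough sketch on simulated data (the response `a` and the names `c_even` and `c_wide` are made up for this example, not taken from the question). Recoding the six groups with unevenly spaced numbers should leave the factor model's fit unchanged, because a factor only records group membership, while the numeric model's fit changes with the spacing.

```r
# Simulated stand-in data; not the values from the question.
set.seed(1)
b      <- rep(seq(4, 40, by = 2), each = 6)
c_even <- rep(1:6, times = 19)               # evenly spaced codes 1..6
c_wide <- c(1, 2, 5, 11, 17, 131)[c_even]    # same six groups, uneven codes
a      <- plogis(c_even - 3.5) + rnorm(length(b), sd = 0.01)

# Factor coding: only group membership matters, so the fits should agree.
same_factor_fit <- isTRUE(all.equal(
  fitted(lm(a ~ b * as.factor(c_even))),
  fitted(lm(a ~ b * as.factor(c_wide)))
))

# Numeric coding: the spacing of the codes enters the model directly.
same_numeric_fit <- isTRUE(all.equal(
  fitted(lm(a ~ b * c_even)),
  fitted(lm(a ~ b * c_wide))
))

same_factor_fit
same_numeric_fit
```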
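Also, while you are learning R, please avoid using c as a variable name. It is also the name of a very frequently used function, so it quickly makes code hard to read and invites confusion.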
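A small sketch of why the name is confusing rather than outright broken (the names `longer` and `grp` are just illustrative). R skips non-function objects when it resolves a function call, so `c(...)` keeps working even after a vector named `c` exists, but the reader has to track two meanings of the same name.

```r
c <- c(1, 2, 3)     # a data vector now shares its name with the function c()
longer <- c(c, 4)   # still works: R looks past the vector to find the function
rm(c)               # drop the vector so only the base function remains

grp <- c(1, 2, 3)   # a descriptive name avoids the double meaning
```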
Answer 1: (score: 0)
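You can inspect the structure of the fitted models by looking at the result of model.matrix(), since model.matrix is the function that lm uses to construct the data for analysis. For the RHS of the two formulas: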
> dim(model.matrix(~b*as.factor(c)))
[1] 114 12
> dim( model.matrix(~b*c))
[1] 114 4
> colnames(model.matrix(~b*as.factor(c)))
[1] "(Intercept)" "b" "as.factor(c)2" "as.factor(c)3"
[5] "as.factor(c)4" "as.factor(c)5" "as.factor(c)6" "b:as.factor(c)2"
[9] "b:as.factor(c)3" "b:as.factor(c)4" "b:as.factor(c)5" "b:as.factor(c)6"
> colnames( model.matrix(~b*c))
[1] "(Intercept)" "b" "c" "b:c"
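In the second model the 'c' variable gets a single column instead of being split into separate levels as in the first model. The 'b:c' column is simply the product of 'b' and 'c':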
> describe(b*c)
b * c
n missing unique Mean .05 .10 .25 .50 .75 .90
114 0 67 77 12.0 16.6 30.5 62.0 111.5 160.0
.95
190.7
lowest : 4 6 8 10 12, highest: 200 204 216 228 240
> describe(model.matrix(~b*c)[, "b:c"])
model.matrix(~b * c)[, "b:c"]
n missing unique Mean .05 .10 .25 .50 .75 .90
114 0 67 77 12.0 16.6 30.5 62.0 111.5 160.0
.95
190.7
lowest : 4 6 8 10 12, highest: 200 204 216 228 240
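The same check can be sketched in base R alone. This assumes the design from the question (19 values of b crossed with 6 groups) and uses the illustrative name `grp` in place of `c`; the `describe()` calls above presumably come from an add-on package such as Hmisc.

```r
# Rebuild the design from the question: 19 values of b, 6 groups each.
b   <- rep(seq(4, 40, by = 2), each = 6)
grp <- rep(1:6, times = 19)          # stands in for `c`

mm <- model.matrix(~ b * grp)
dim(mm)                              # rows x columns of the design matrix

# The interaction column should equal the elementwise product b * grp.
product_matches <- isTRUE(all.equal(unname(mm[, "b:grp"]), b * grp))
product_matches
```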