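I have two sets of data: Set1 and Set2.
Each set contains the same variables A, B, C, D, E.
I want to run an F-test to find out whether the following equalities hold simultaneously:
Set1_A = Set2_A, Set1_B = Set2_B, Set1_C = Set2_C, Set1_D = Set2_D, Set1_E = Set2_E
Set1_A and Set2_A may be vectors of different lengths.
How can I do this in R? Thanks.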
Sample data for Set1:
A B C
11.0 11.0 11.0
23.3 23.3 23.3
44.6 -1.3 -7.1
-1.9 -1.9 -1.9
Sample data for Set2:
A B C
3.9 3.9 3.9
-6.1 -6.1 -6.1
-34.6 -95.7 -102.4
7.0 7.0 7.0
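Answer 0 (score: 2)
This shows how to compare Set1_A with Set2_A. To decide whether all of the equalities hold simultaneously, you need a multivariate analysis.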
Set1 <- read.table(text="A B C
11.0 11.0 11.0
23.3 23.3 23.3
44.6 -1.3 -7.1
-1.9 -1.9 -1.9", header=TRUE)
Set2<- read.table(text="A B C
3.9 3.9 3.9
-6.1 -6.1 -6.1
-34.6 -95.7 -102.4
7.0 7.0 7.0", header=TRUE)
combset <- rbind(Set1, Set2)
combset$grp <- rep(c("Set1", "Set2"), times=c(nrow(Set1), nrow(Set2) ) )
combset
#----------------
A B C grp
1 11.0 11.0 11.0 Set1
2 23.3 23.3 23.3 Set1
3 44.6 -1.3 -7.1 Set1
4 -1.9 -1.9 -1.9 Set1
5 3.9 3.9 3.9 Set2
6 -6.1 -6.1 -6.1 Set2
7 -34.6 -95.7 -102.4 Set2
8 7.0 7.0 7.0 Set2
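The question mentions that the two sets may have different lengths. Here is a minimal sketch of the same stacking step with unequal group sizes. Dropping the last row of Set2 is only an illustration, and the names Set2_short and combset2 are my own, not from the question. Storing grp as a factor explicitly also avoids relying on lm() to convert a character column.

```r
Set1 <- read.table(text = "A B C
11.0 11.0 11.0
23.3 23.3 23.3
44.6 -1.3 -7.1
-1.9 -1.9 -1.9", header = TRUE)
Set2 <- read.table(text = "A B C
3.9 3.9 3.9
-6.1 -6.1 -6.1
-34.6 -95.7 -102.4
7.0 7.0 7.0", header = TRUE)

# Hypothetical: drop one row from Set2 purely to get unequal group sizes
Set2_short <- Set2[-4, ]

combset2 <- rbind(Set1, Set2_short)

# rep(..., times = ...) repeats each label once per row of its own set,
# so the labels line up with the stacked rows whatever the sizes are
combset2$grp <- factor(rep(c("Set1", "Set2"),
                           times = c(nrow(Set1), nrow(Set2_short))))

table(combset2$grp)
```

Because the label vector is built from nrow() of each set, nothing else in the workflow has to change when the sizes differ.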
Your data is now in what might be called long format, and you can use the grp ID as a factor in a call to lm with a formula:
lm(A ~ grp, data=combset)
Call:
lm(formula = A ~ grp, data = combset)
Coefficients:
(Intercept) grpSet2
19.25 -26.70
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
> anova(lm(A ~ grp, data=combset))
Analysis of Variance Table
Response: A
Df Sum Sq Mean Sq F value Pr(>F)
grp 1 1425.8 1425.78 3.8004 0.09913 .
Residuals 6 2251.0 375.16
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
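As a sanity check on the single-variable comparison: with only two groups, the F statistic from anova() should equal the square of the pooled-variance two-sample t statistic. The sketch below re-enters column A from the sample data so that it runs on its own; the names f_val and t_val are mine.

```r
# Column A from Set1 and Set2, stacked, with an explicit factor for the group
combset <- data.frame(
  A   = c(11.0, 23.3, 44.6, -1.9, 3.9, -6.1, -34.6, 7.0),
  grp = factor(rep(c("Set1", "Set2"), each = 4))
)

# F statistic for the group effect from the one-way model
f_val <- anova(lm(A ~ grp, data = combset))[["F value"]][1]

# Pooled-variance (var.equal = TRUE) two-sample t statistic
t_val <- unname(t.test(A ~ grp, data = combset, var.equal = TRUE)$statistic)

# The two tests are equivalent here: F equals t squared
all.equal(f_val, t_val^2)
```

Note that the default t.test() uses the Welch correction, which does not assume equal variances and will generally not match the anova() result exactly.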
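It is possible to build a multivariate model. But ... are you sure you can interpret it correctly, and do you know the statistical problems that can arise?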
> lm( A + B + C ~ grp, combset)
Call:
lm(formula = A + B + C ~ grp, data = combset)
Coefficients:
(Intercept) grpSet2
33.35 -87.92
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
> anova(lm( A + B + C ~ grp, combset))
Analysis of Variance Table
Response: A + B + C
Df Sum Sq Mean Sq F value Pr(>F)
grp 1 15462 15461.6 2.016 0.2055
Residuals 6 46017 7669.6
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
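A note on what the formula above actually fits: on the left-hand side of a formula, `+` is ordinary arithmetic, so `A + B + C ~ grp` models the row-wise sum of the three columns as a single response. That is why only one intercept and one slope are reported. A small sketch to confirm this (the column name total and the object names fit_sum and fit_total are mine):

```r
Set1 <- read.table(text = "A B C
11.0 11.0 11.0
23.3 23.3 23.3
44.6 -1.3 -7.1
-1.9 -1.9 -1.9", header = TRUE)
Set2 <- read.table(text = "A B C
3.9 3.9 3.9
-6.1 -6.1 -6.1
-34.6 -95.7 -102.4
7.0 7.0 7.0", header = TRUE)
combset <- rbind(Set1, Set2)
combset$grp <- factor(rep(c("Set1", "Set2"),
                          times = c(nrow(Set1), nrow(Set2))))

# Compute the row-wise sum by hand and use it as a plain univariate response
combset$total <- combset$A + combset$B + combset$C

fit_sum   <- lm(A + B + C ~ grp, data = combset)
fit_total <- lm(total ~ grp, data = combset)

# Both fits give the same coefficients: it is one model of one summed variable
all.equal(unname(coef(fit_sum)), unname(coef(fit_total)))
```

So this is a univariate test on the sum, not a simultaneous test of the three equalities.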
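I was worried about this answer, because I thought more coefficients should have been estimated. I remembered an article in R News by Peter Dalgaard and looked it up. This is what I should have provided: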
> lm( cbind(A, B, C) ~ grp, combset)
Call:
lm(formula = cbind(A, B, C) ~ grp, data = combset)
Coefficients:
A B C
(Intercept) 19.250 7.775 6.325
grpSet2 -26.700 -30.500 -30.725
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
> anova(lm( cbind(A, B, C) ~ grp, combset))
Analysis of Variance Table
Df Pillai approx F num Df den Df Pr(>F)
(Intercept) 1 0.51946 1.44130 3 4 0.3557
grp 1 0.42690 0.99318 3 4 0.4813
Residuals 6
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
> class(lm( cbind(A, B, C) ~ grp, combset))
[1] "mlm" "lm"
Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'grp' converted to a factor
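Note that "real" multivariate inferential statistics are now provided (e.g. Pillai's trace, or Wilks or Hotelling), that three separate sets of coefficients are presented for A, B and C, and that the class of the output is "mlm" rather than just "lm". You should also look at ?anova.mlm.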
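If you want one of the other statistics, anova() on an "mlm" fit takes a test argument. The sketch below rebuilds the combined data so it runs on its own; the object names fit, a_pillai and a_wilks are mine. With only two groups the hypothesis has one degree of freedom, so I would expect the different statistics to lead to the same approximate F and p-value for grp.

```r
Set1 <- read.table(text = "A B C
11.0 11.0 11.0
23.3 23.3 23.3
44.6 -1.3 -7.1
-1.9 -1.9 -1.9", header = TRUE)
Set2 <- read.table(text = "A B C
3.9 3.9 3.9
-6.1 -6.1 -6.1
-34.6 -95.7 -102.4
7.0 7.0 7.0", header = TRUE)
combset <- rbind(Set1, Set2)
combset$grp <- factor(rep(c("Set1", "Set2"),
                          times = c(nrow(Set1), nrow(Set2))))

# Matrix response: one column per variable, all tested jointly against grp
fit <- lm(cbind(A, B, C) ~ grp, data = combset)

a_pillai <- anova(fit)                  # Pillai's trace is the default
a_wilks  <- anova(fit, test = "Wilks")  # Wilks' lambda instead

a_pillai
a_wilks

# Per-response univariate summaries, one block for each of A, B and C
summary(fit)
```

Keep in mind that with 4 rows per group and 3 responses there are very few residual degrees of freedom, so any of these tests has little power on the sample data.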