Question

I have a dataset with several variables:

X is a numeric variable; Y and Z are factor variables with two levels each (Y = 1, 2; Z = 3, 4).

             x y z
1  -0.59131983 1 3
2   1.51800178 1 3
3   0.03079412 1 3
4  -0.43881764 1 3
5  -1.44914000 1 3
6  -1.33483914 1 4
7   0.25612595 1 4
8   0.12606742 1 4
9   0.44735965 1 4
10  1.83294817 1 4
11 -0.59131983 2 3
12  1.51800178 2 3
13  0.03079412 2 3
14 -0.43881764 2 3
15 -1.44914000 2 3
16 -1.33483914 2 4
17  0.25612595 2 4
18  0.12606742 2 4
19  0.44735965 2 4
20  1.83294817 2 4

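If my grouping factor is Y, running the t-test is easy: t.test(X ~ Y). But I don't know how to run a t-test that compares the X values between the two levels of Z (3 and 4) using only the rows where Y == 2.

I'm not sure I've explained this well, so it may be easier to see in the table below. I want a t-test on X where the grouping factor is Z, restricted to Y == 2. How can I do that? In Stata it is easy: ttest var1 if var3 == 3, by(var2)

But I don't know how to do it in R :(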

             x y z
11 -0.59131983 2 3
12  1.51800178 2 3
13  0.03079412 2 3
14 -0.43881764 2 3
15 -1.44914000 2 3
16 -1.33483914 2 4
17  0.25612595 2 4
18  0.12606742 2 4
19  0.44735965 2 4
20  1.83294817 2 4

Answer 1

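If you read the documentation for t.test in R (type ?t.test), you will see that for one-sample t-tests you should not use the function's formula interface:

The formula interface is only applicable for the 2-sample tests.

So in your case you need to create a subset of your data.frame based on the conditions you specified: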

df2 <- df[df$y==2 & df$z %in% c(3,4), ]

> df2
             x y z
11 -0.59131983 2 3
12  1.51800178 2 3
13  0.03079412 2 3
14 -0.43881764 2 3
15 -1.44914000 2 3
16 -1.33483914 2 4
17  0.25612595 2 4
18  0.12606742 2 4
19  0.44735965 2 4
20  1.83294817 2 4
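
As a side note, the same subset can also be written with subset(). The sketch below rebuilds the data frame from the values posted in the question (the name df and the factor coding of y and z are assumptions, since the question does not show how the data were created) and checks that both approaches select the same rows. Because z only takes the values 3 and 4 here, the z %in% c(3,4) condition does not exclude anything.

```r
# Rebuild the data from the values shown in the question
# (assumption: the data frame is called df, with y and z stored as factors).
vals <- c(-0.59131983, 1.51800178, 0.03079412, -0.43881764, -1.44914000,
          -1.33483914, 0.25612595, 0.12606742, 0.44735965, 1.83294817)
df <- data.frame(
  x = rep(vals, times = 2),
  y = factor(rep(c(1, 2), each = 10)),
  z = factor(rep(rep(c(3, 4), each = 5), times = 2))
)

# Subset with logical indexing, as in the answer
df2 <- df[df$y == 2 & df$z %in% c(3, 4), ]

# Same subset written with subset(); column names are looked up inside df
df2_alt <- subset(df, y == 2 & z %in% c(3, 4))

# Both approaches should select the same ten rows
stopifnot(isTRUE(all.equal(df2, df2_alt)))
```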

Then run a one-sample t.test with the following syntax:

> t.test(x=df2$x)

    One Sample t-test

data:  df2$x
t = 0.1171, df = 9, p-value = 0.9094
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -0.7275964  0.8070325
sample estimates:
 mean of x 
0.03971805
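
If the goal is, as the question describes, to compare x between z == 3 and z == 4 within y == 2, a two-sample test on the subset may be closer to what was asked, and the formula interface applies in that case. The following is a minimal sketch, assuming the data frame is called df and is rebuilt from the values posted in the question; it is not part of the original answer.

```r
# Rebuild the data from the values shown in the question
# (assumption: the data frame is called df, with y and z stored as factors).
vals <- c(-0.59131983, 1.51800178, 0.03079412, -0.43881764, -1.44914000,
          -1.33483914, 0.25612595, 0.12606742, 0.44735965, 1.83294817)
df <- data.frame(
  x = rep(vals, times = 2),
  y = factor(rep(c(1, 2), each = 10)),
  z = factor(rep(rep(c(3, 4), each = 5), times = 2))
)

# Keep only the rows with y == 2, then compare x across the two levels of z.
df2 <- df[df$y == 2, ]
res <- t.test(x ~ z, data = df2)   # Welch two-sample t-test by default
print(res)

# Equivalent call using the subset argument of the formula method
res2 <- t.test(x ~ z, data = df, subset = y == 2)
```

By default t.test does not assume equal variances in the two groups; passing var.equal = TRUE gives the classical pooled-variance test instead.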
