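Performing a t-test in R with a categorical variable

Time: 2018-08-02 10:34:40

Tags: r t-test

Hey, I'm trying to run a t-test, but something seems to be wrong... The data look like this: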

pot pair    type    height
I   1   Cross   23,5
I   1   Self    17,375
I   2   Cross   12
I   2   Self    20,375

My t-test is:

    darwin <- read.table("darwin.txt", header=T)
    plot(darwin$type, darwin$height, ylab="Height")
    darwin.no.outlier = subset(darwin, height>13)
    tapply(darwin.no.outlier$height, darwin.no.outlier$type, var) 
    t.test(darwin$height ~ darwin$type)

The error R gives me is:

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In var(x) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
3: In mean.default(y) : argument is not numeric or logical: returning NA
4: In var(y) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
1 answer:

Answer 0 (score: 2)

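The problem is the decimal separator in the height column: it is a comma rather than a point. Because read.table expects a point by default, it cannot parse values such as 23,5 as numbers, so the column is converted to a factor, and t.test then fails when it tries to compute the mean and variance of a factor.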
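A quick way to confirm this diagnosis is to inspect the column type right after import. The sketch below re-reads the four rows from the question without setting dec; the name darwin_raw is only illustrative. Note that in R 4.0.0 and later the column comes back as character rather than factor, but either way it is not numeric.

```r
# Same four rows as in the question, read WITHOUT specifying dec
darwin_raw <- read.table(text = "pot pair type height
I 1 Cross 23,5
I 1 Self 17,375
I 2 Cross 12
I 2 Self 20,375", header = TRUE)

# Inspect the column types: height is a factor (R < 4.0) or character (R >= 4.0)
str(darwin_raw)
class(darwin_raw$height)

# FALSE here means t.test cannot compute means or variances on this column
is.numeric(darwin_raw$height)
```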

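When importing the data, add dec = "," to the read.table call (dec is the character used in the file for the decimal point). Here is my example with your data: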

    darwin <- read.table(text = "pot pair    type    height
    I   1   Cross   23,5
    I   1   Self    17,375
    I   2   Cross   12
    I   2   Self    20,375", header = TRUE, dec = ",")

Then the output of t.test(darwin$height ~ darwin$type) is:

    Welch Two Sample t-test

data:  darwin$height by darwin$type
t = -0.18932, df = 1.1355, p-value = 0.878
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -58.34187  56.09187
sample estimates:
mean in group Cross  mean in group Self 
             17.750              18.875
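If the data have already been imported without dec = ",", the column can also be repaired in place instead of re-reading the file. This is a minimal sketch of that alternative, not part of the fix above; darwin_raw is a hypothetical name for the wrongly imported data frame, and it assumes every value uses at most one comma as the decimal separator.

```r
# Data imported without dec = ",", so height is not numeric
darwin_raw <- read.table(text = "pot pair type height
I 1 Cross 23,5
I 1 Self 17,375
I 2 Cross 12
I 2 Self 20,375", header = TRUE)

# as.character() first, so a factor is converted by its labels, not its level codes;
# then swap the decimal comma for a point and convert to numeric
darwin_raw$height <- as.numeric(
  sub(",", ".", as.character(darwin_raw$height), fixed = TRUE)
)

# height is numeric now, so the Welch t-test runs
result <- t.test(height ~ type, data = darwin_raw)
print(result)
```

Passing data = darwin_raw with a plain height ~ type formula is equivalent to the darwin$height ~ darwin$type form used above and keeps the labels in the output shorter.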