inference() function insists on ANOVA instead of a two-sided hypothesis test; R / RStudio

Time: 2014-10-04 22:21:19

Tags: r function rstudio anova inference

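I am trying to use a custom function called inference(), shown in the code below. The function is undocumented, but it comes from the DASI class on Coursera. Based on the feedback I have received, I am using the function correctly. I am trying to run a two-sided hypothesis test of my wordsum variable against my class variable, specifically between the means of two groups, the lower class and the working class: mean wordsum of the working class minus mean wordsum of the lower class. However, the function / R / RStudio keeps insisting that I run an ANOVA test. That does not work for me, because I am trying to reject the null hypothesis and also build a confidence interval for the difference between two independent means. I have looked at the function, but since I am not an R expert, I cannot see anything out of the ordinary. Any help is greatly appreciated.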

Code:

load(url("http://bit.ly/dasi_gss_ws_cl"))
source("http://bit.ly/dasi_inference")

summary(gss)
by(gss$wordsum, gss$class, mean)
boxplot(gss$wordsum ~ gss$class)

gss_clean = na.omit(subset(gss, class == "WORKING" | class =="LOWER"))

inference(y = gss_clean$wordsum, x = gss_clean$class, est = "mean", type = "ht", 
          null = 0, alternative = "twosided", method = "theoretical")

Returns:

Response variable: numerical, Explanatory variable: categorical
Error: Use alternative = 'greater' for ANOVA or chi-square test.
In addition: Warning message:
Ignoring null value since it's undefined for ANOVA.

1 answer:

Answer 0 (score: 2)

You need

gss_clean <- droplevels(gss_clean)

Then your inference() call works:

Response variable: numerical, Explanatory variable: categorical
Difference between two means
Summary statistics:
n_LOWER = 41, mean_LOWER = 5.0732, sd_LOWER = 2.2404
n_WORKING = 407, mean_WORKING = 5.7494, sd_WORKING = 1.8652
Observed difference between means (LOWER-WORKING) = -0.6762
H0: mu_LOWER - mu_WORKING = 0 
HA: mu_LOWER - mu_WORKING != 0 
Standard error = 0.362 
Test statistic: Z =  -1.868 
p-value =  0.0616 
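
As a sanity check, the standard error, test statistic, and p-value above can be reproduced by hand from the printed summary statistics. This is only a sketch: it assumes the function uses the unpooled standard error and a normal (Z) reference distribution, which is what the output suggests, and it works from the rounded figures shown, so the last digit may differ slightly.

```r
# Summary statistics copied from the inference() output above (rounded values).
n_lower   <- 41;  mean_lower   <- 5.0732; sd_lower   <- 2.2404
n_working <- 407; mean_working <- 5.7494; sd_working <- 1.8652

# Observed difference between means (LOWER - WORKING).
obs_diff <- mean_lower - mean_working

# Unpooled standard error of the difference between two independent means.
se <- sqrt(sd_lower^2 / n_lower + sd_working^2 / n_working)

# Z statistic under H0: mu_LOWER - mu_WORKING = 0, and its two-sided p-value.
z <- (obs_diff - 0) / se
p_value <- 2 * pnorm(-abs(z))

round(c(diff = obs_diff, se = se, z = z, p = p_value), 4)
```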

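The problem is that unless you drop the unused factor levels, the internal machinery of inference() thinks you have a 4-level categorical variable, for which it cannot do a t-test or an equivalent 2-category test: it has to do a one-way ANOVA or an analogous test.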
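
To see the leftover levels for yourself, here is a minimal sketch on made-up data. The data frame toy and its values are hypothetical stand-ins for gss, not the course data; only subset(), nlevels(), table(), and droplevels() from base R are used.

```r
# Hypothetical stand-in for gss: a 4-level factor plus a numeric score.
set.seed(1)
toy <- data.frame(
  class   = factor(rep(c("LOWER", "WORKING", "MIDDLE", "UPPER"), each = 5)),
  wordsum = sample(0:10, 20, replace = TRUE)
)

# Same filtering step as in the question.
toy_sub <- subset(toy, class == "WORKING" | class == "LOWER")

# subset() removes the rows but keeps all four levels on the factor.
nlevels(toy_sub$class)   # 4
table(toy_sub$class)     # MIDDLE and UPPER are listed with a count of 0

# droplevels() discards the levels that no longer occur in the data.
toy_clean <- droplevels(toy_sub)
nlevels(toy_clean$class) # 2
levels(toy_clean$class)
```

Checking nlevels() or table() on the grouping variable before calling inference() is a quick way to confirm that only the two groups you intend to compare remain.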