R binomial test on preferences in a data frame

Time: 2016-05-06 11:47:38

Tags: r

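This is for a Coursera course that expects us to program in R without any prior R experience. I am really struggling to understand it and have nothing to go on. I even went through basic R tutorials, but I am still lost.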

We have a CSV file with the following contents:

  • Subject: 30
  • Disability: 0, 1
  • Pref: trackball, touchpad

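For the non-disabled subjects, run a binomial test to see whether their preference for the touchpad differs significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: run a binomial test comparing the number of rows for non-disabled subjects who prefer the touchpad against the total number of non-disabled subjects. There are two possible preferences, touchpad and trackball, so the chance probability is 1/2. Do not correct for multiple comparisons; treat this as a single test on a subset of the data.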

The solution is supposed to be:

  • First, get an intuition by plotting the preferences of the non-disabled subjects:

    plot(df[df$Disability == "0",]$Pref)
    
  • Second, test the preference for touchpad versus trackball against chance, that is, against no preference:

    binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"), 
               nrow(df[df$Disability == "0",]), p=1/2)
    plot(df[df$Disability == "0",]$Pref)
    

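I understand that this should give us a visual representation of the preferences for Disability = 0, but there is an error with df and I don't know how to correct it. Can anyone help?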

1 answer:

Answer 0: (score: 0)

I simulated a random dataset with the given characteristics, and everything works fine:

    df <- data.frame(
      Subject = paste0("Sub", 1:30),
      Disability = c("0", "0", "1", "1", "1", "1", "0", "0", "0", "1",
                     "1", "0", "0", "0", "0", "1", "0", "0", "1", "0",
                     "0", "0", "0", "1", "1", "1", "0", "0", "1", "0"),
      Pref = c("touchpad", "touchpad", "touchpad", "trackball", "trackball",
               "trackball", "trackball", "trackball", "trackball", "trackball",
               "trackball", "trackball", "touchpad", "trackball", "trackball",
               "touchpad", "touchpad", "trackball", "touchpad", "trackball",
               "touchpad", "touchpad", "trackball", "touchpad", "touchpad",
               "touchpad", "touchpad", "touchpad", "trackball", "trackball")
    )

The result of the given command is as follows:

    binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"), 
               nrow(df[df$Disability == "0",]), p=1/2)

        Exact binomial test

    data:  sum(df[df$Disability == "0", ]$Pref == "touchpad") and nrow(df[df$Disability == "0", ])
    number of successes = 8, number of trials = 18, p-value = 0.8145
    alternative hypothesis: true probability of success is not equal to 0.5
    95 percent confidence interval:
     0.2153015 0.6924283
    sample estimates:
    probability of success 
                 0.4444444 
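
As a sanity check on that output, the two-sided p-value can be reproduced by hand. Because the null probability is 1/2, the binomial distribution is symmetric, so for a count below half the trials the p-value is twice the lower tail probability P(X <= 8) with n = 18. A minimal sketch using the counts from the simulated data (the variable names are my own, not part of the exercise):

```r
# Counts from the simulated data: 8 touchpad choices among 18 non-disabled subjects.
successes <- 8
trials <- 18

# Exact two-sided binomial test against chance (p = 1/2).
res <- binom.test(successes, trials, p = 1/2)

# With p = 1/2 the distribution is symmetric, so for successes < trials / 2
# the two-sided p-value is twice the lower tail P(X <= successes).
p_manual <- 2 * pbinom(successes, trials, prob = 0.5)

# Round to four digits, as the exercise asks.
p_rounded <- round(res$p.value, 4)
print(p_rounded)
```

The rounded value agrees with the 0.8145 reported by binom.test for the simulated data. The real course data will give a different number.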

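Edit

To apply the same test to the real data (the file linked in the comments), replace the first step with a command that reads the values from the actual file into a data frame:

    df <- read.csv("deviceprefs-1.csv")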

Again, the given command for the binomial test works fine with the real dataset.
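
One possible source of an error on the plot() line, offered as a guess since the question does not quote the message: since R 4.0.0, read.csv() and data.frame() keep strings as character vectors by default instead of converting them to factors, and plot() on a character vector typically fails. The sketch below assumes the file has the columns Subject, Disability and Pref used above. The rows written to the temporary file are made up for illustration and are not the course data.

```r
# Hypothetical stand-in for deviceprefs-1.csv: a few made-up rows with the same columns.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Subject,Disability,Pref",
  "1,0,touchpad",
  "2,0,trackball",
  "3,1,trackball",
  "4,0,touchpad",
  "5,1,touchpad",
  "6,0,touchpad"
), csv_path)

df <- read.csv(csv_path)

# Convert Pref to a factor so that plot() draws a bar chart of counts.
df$Pref <- factor(df$Pref)

# Keep only the non-disabled subjects; Disability is read as the integer 0/1 here.
nondisabled <- df[df$Disability == 0, ]

# Counts per preference, and the same information as a bar chart.
counts <- table(nondisabled$Pref)
print(counts)
plot(nondisabled$Pref)

# Same binomial test as above, written against the subset.
res <- binom.test(sum(nondisabled$Pref == "touchpad"), nrow(nondisabled), p = 1/2)
print(res$p.value)
```

Storing the subset in its own variable avoids repeating the df[df$Disability == "0", ] expression. If the error is instead that df is not found, the read.csv() step has not been run yet.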