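This is a Coursera course that expects us to program in R without any prior R experience. I am really struggling to understand it and have no idea where to start. I even went through a basic R tutorial, but I am still lost.
We have a csv file with the following contents:
For people without disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: run a binomial test comparing the number of rows for people without disabilities who prefer the touchpad against the total number of people without disabilities. There are two possible preferences, touchpad and trackball, so the chance probability is 1/2. Do not correct for multiple comparisons; treat this as a single test on a subset of the data.
The solution is supposed to be:
First, get an intuition by plotting the preferences of people without disabilities: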
plot(df[df$Disability == "0",]$Pref)
Second, test the touchpad vs. trackball preference against chance, that is, against no preference:
binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"),
nrow(df[df$Disability == "0",]), p=1/2)
plot(df[df$Disability == "0",]$Pref)
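I understand that this should give us a visual representation of the preferences where Disability = 0, but df produces an error and I don't know how to correct it. Can anyone help?
Answer 0 (score: 0)
I simulated a random data set with the given characteristics, and everything works fine: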
df <- data.frame(Subject = c("Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8", "Sub9", "Sub10", "Sub11", "Sub12", "Sub13", "Sub14", "Sub15", "Sub16", "Sub17", "Sub18", "Sub19", "Sub20", "Sub21", "Sub22", "Sub23", "Sub24", "Sub25", "Sub26", "Sub27", "Sub28", "Sub29", "Sub30"),
Disability = c("0", "0", "1", "1", "1", "1", "0", "0", "0", "1", "1", "0", "0", "0", "0", "1", "0", "0", "1", "0", "0", "0", "0", "1", "1", "1", "0", "0", "1", "0"),
Pref = c("touchpad", "touchpad", "touchpad", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "touchpad", "trackball", "trackball", "touchpad", "touchpad", "trackball", "touchpad", "trackball", "touchpad", "touchpad", "trackball", "touchpad", "touchpad", "touchpad", "touchpad", "touchpad", "trackball", "trackball"))
The given command produces the following result:
binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"),
nrow(df[df$Disability == "0",]), p=1/2)
Exact binomial test
data: sum(df[df$Disability == "0", ]$Pref == "touchpad") and nrow(df[df$Disability == "0", ])
number of successes = 8, number of trials = 18, p-value = 0.8145
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.2153015 0.6924283
sample estimates:
probability of success
0.4444444
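To see where the reported p-value comes from, here is a minimal sketch that recomputes it by hand for the simulated counts above (8 touchpad preferences among 18 non-disabled subjects). The variable names are illustrative, and the tolerance factor is an assumption added to guard against floating-point ties. The two-sided p-value is the total probability, under a chance rate of 1/2, of all outcomes that are no more likely than the observed one.

```r
# Counts taken from the simulated output above: 8 of 18 prefer the touchpad.
successes <- 8
trials <- 18
p_chance <- 0.5

# Probability of every possible outcome (0..18 touchpad preferences) under chance.
probs <- dbinom(0:trials, size = trials, prob = p_chance)

# Probability of the outcome that was actually observed.
observed <- dbinom(successes, size = trials, prob = p_chance)

# Two-sided p-value: sum the probabilities of all outcomes that are
# no more likely than the observed one (small tolerance for rounding).
p_manual <- sum(probs[probs <= observed * (1 + 1e-7)])

# The same quantity as computed by binom.test().
p_builtin <- binom.test(successes, trials, p = p_chance)$p.value

print(round(p_manual, 4))
print(round(p_builtin, 4))
```

Both printed values should agree with the p-value shown in the output above.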
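Edit
To apply the same test to the real data (the file linked in the comments), the first step should be replaced by a command that reads the values stored in the actual data file into a data frame: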
df <- read.csv("deviceprefs-1.csv")
Again, the given command for the binomial test works fine with the real data set.
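The question does not quote the actual error message, so the following is only one possible explanation. Since R 4.0.0, read.csv() keeps text columns as character vectors by default, and plot() cannot draw a character vector directly. A minimal sketch of a workaround is to convert Pref to a factor before plotting. The inline CSV below is made up for illustration and is not the course data; it only reuses the column names from the simulated data frame.

```r
# Tiny made-up CSV with the same column names as the simulated data frame;
# these rows are illustrative only, not the course data.
csv_text <- "Subject,Disability,Pref
Sub1,0,touchpad
Sub2,0,trackball
Sub3,1,trackball
Sub4,0,touchpad
Sub5,1,touchpad
Sub6,0,touchpad"

df <- read.csv(text = csv_text)

# Disability is read as a number here; comparing it with "0" still works
# because R coerces both sides to character before comparing.
nondisabled <- df[df$Disability == "0", ]

# Convert Pref to a factor so that plot() draws one bar per preference.
nondisabled$Pref <- factor(nondisabled$Pref)

# Count the preferences among the non-disabled subjects.
counts <- table(nondisabled$Pref)
print(counts)

# Bar chart of the preferences for Disability == 0.
plot(nondisabled$Pref)
```

If the error instead says that the object df was not found, then the read.csv() line was probably not run first, or the file is not in the current working directory.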