I have a sample data frame that looks like this:
CNC[1:9,]
ID CNDIST1 CNDIST2
C1 0,0,136,1 0,0,4,2
C2 0,0,141,1 0,0,4,1
C3 6,8,126 0,0,4
C4 6,11,125 0,0,6
C5 0,0,141,0,0,0,0 0,0,3,0,1,0,1
C6 0,0,139,0,0 0,0,3,1,1
C7 0,0,141,0 0,0,4,2
C8 0,0,141,0 0,0,4,2
C9 31,44,61 2,2,2
The same data frame, as produced by dput():
dput(CNC[1:9,])
structure(list(ID = structure(1:9, .Label = c("C1",
"C2", "C3", "C4",
"C5", "C6", "C7",
"C8", "C9"), class = "factor"),
CNDIST1 = structure(c(1L, 5L, 8L, 7L, 4L, 2L, 3L, 3L,
6L), .Label = c("0,0,136,1", "0,0,139,0,0", "0,0,141,0",
"0,0,141,0,0,0,0", "0,0,141,1", "31,44,61", "6,11,125", "6,8,126"
), class = "factor"), CNDIST2 = structure(c(5L, 4L, 3L,
6L, 1L, 2L, 5L, 5L, 7L), .Label = c("0,0,3,0,1,0,1", "0,0,3,1,1",
"0,0,4", "0,0,4,1", "0,0,4,2", "0,0,6", "2,2,2"), class = "factor")), .Names = c("ID",
"CNDIST1", "CNDIST2"), row.names = c(NA, 9L), class = "data.frame")
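I am using the R code below to run chisq.test on each row. The values in column 3 (CNDIST2) form the probability vector 'p', and the numeric vector 'x' of the same length comes from column 2 (CNDIST1).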
read.table("report.dat2",header=T,sep="\t")->CNC
chi.pval=vector()
for(i in 1:nrow(CNC)){
as.numeric(unlist(strsplit(as.character(CNC$CNDIST1[i]),",")))->x
as.numeric(unlist(strsplit(as.character(CNC$CNDIST2[i]),",")))->p
chi.pval[i]<-chisq.test(x,p+0.001,rescale.p=T)$p.value ###add 0.001 to 'p' vector to remove '0'
}
CNC1<-cbind(CNC,chi.pval)
write.table(CNC1,'chi.test.txt',sep='\t',quote=F,row.names=F)
The code returns an error:
Error in chisq.test(x, p + 0.001, rescale.p = T) :
'x' and 'y' must have at least 2 levels
In addition: Warning messages:
1: In chisq.test(x, p + 0.001, rescale.p = T) :
Chi-squared approximation may be incorrect
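The code fails when chisq.test reaches certain rows and then exits, although the test does run on some rows of the data frame before that. Does anyone have a clue how to solve this?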
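One likely cause, offered as a sketch rather than a confirmed diagnosis: the signature of chisq.test begins with (x, y, correct, p, ...), so an unnamed second argument is matched to 'y', not 'p'. The call is then treated as a test of independence between two factors, and it stops whenever one of them has a single distinct value. The example below uses row C9 from the sample data; the names positional and named are only illustrative.

```r
# Row C9 from the sample data: observed counts and the expected distribution
x <- as.numeric(strsplit("31,44,61", ",")[[1]])
p <- as.numeric(strsplit("2,2,2", ",")[[1]])

# Positional call: the second argument is matched to 'y'. All three
# values of p + 0.001 are identical, so factor(y) has one level and
# chisq.test stops with an error, which is captured here as text.
positional <- tryCatch(
  suppressWarnings(chisq.test(x, p + 0.001, rescale.p = TRUE)),
  error = function(e) conditionMessage(e)
)
print(positional)

# Named call: a goodness-of-fit test of x against p, rescaled to sum to 1
named <- chisq.test(x, p = p + 0.001, rescale.p = TRUE)
print(named$p.value)
```

If this is indeed the problem, writing p = p + 0.001 inside the loop should let every row run. Rows with very small expected counts may still raise the "Chi-squared approximation may be incorrect" warning.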
The output of dput(CNC[1:9,]) seems to be missing some values in CNDIST1 and CNDIST2. It looks as if something happened to the duplicated values.
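On the dput() output: the columns are factors, and a factor stores each distinct string once in .Label plus one integer code per row, so repeated values such as "0,0,141,0" (rows C7 and C8) appear only once in the labels. Nothing appears to be lost. A small illustration, using a made-up three-element vector f:

```r
# Three values, two of them identical, stored as a factor
f <- factor(c("0,0,141,0", "0,0,141,0", "31,44,61"))

levels(f)        # the distinct strings, each listed once
as.integer(f)    # one integer code per row, indexing into the levels
as.character(f)  # the full column, recovered row by row
```

This is why the loop in the question calls as.character() before strsplit(): it turns the integer codes back into the original strings.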