I have sample expression data for two different genotype variants:
# Simulated expression values: 100 samples per genotype
value <- c(rnorm(100, 500, 90), rnorm(100, 800, 120))
genotype <- rep(c("A", "B"), each = 100)
# Build the data frame directly so that value stays numeric
df <- data.frame(value = value, genotype = genotype)
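I plotted the expression densities of the two genotype variants, and I am trying to determine the best cutoff value for an assay that distinguishes them: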
d <- density(value)
# Empty plot frame (type = "n"); the stray empty argument after main has been removed
plot(d, main = "Genotypes A and B", type = "n", xlim = c(200, 1100), ylim = c(0, 0.005), xlab = "Units of expression", ylab = "")
d1 <- density(subset(value, genotype == "A"))
polygon(d1, col = adjustcolor("gray", alpha.f = 0.40), border = "black")
d2 <- density(subset(value, genotype == "B"))
polygon(d2, col = adjustcolor("gray", alpha.f = 0.40), border = "black")
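Obviously I could use the "abline" function to locate the best cutoff between the two densities by eye, but is there a cleaner way to identify the cutoff?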
Answer 0 (score: 0)
Here is an answer based on the link provided by @Julius:
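# Estimate both densities on the same grid so the curves can be compared point by point
A.density <- density(subset(df, genotype == "A")$value, from = min(df$value), to = max(df$value), n = 2^10)
B.density <- density(subset(df, genotype == "B")$value, from = min(df$value), to = max(df$value), n = 2^10)
# The curves cross wherever the sign of their difference changes; this can return more than one point
intersection.point <- A.density$x[which(diff((A.density$y - B.density$y) > 0) != 0) + 1]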
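Because kernel density estimates can also cross in the tails, the sign-change approach may return several points. The following is a minimal, self-contained sketch of one way to narrow that down, assuming the cutoff of interest is the crossing that lies between the two group means. The helper name density_crossings, the seed, and the between-the-means filter are illustrative assumptions, not part of the original answer.

```r
# Simulated data mirroring the question (the seed is an assumption, added for reproducibility)
set.seed(42)
df <- data.frame(
  value    = c(rnorm(100, 500, 90), rnorm(100, 800, 120)),
  genotype = rep(c("A", "B"), each = 100)
)

# Hypothetical helper: x positions where the densities of groups a and b cross
density_crossings <- function(x, g, a, b, n = 2^10) {
  rng <- range(x)
  # Shared from/to/n gives both estimates the same evaluation grid
  da <- density(x[g == a], from = rng[1], to = rng[2], n = n)
  db <- density(x[g == b], from = rng[1], to = rng[2], n = n)
  # A crossing is a sign flip of (da - db) between adjacent grid points
  idx <- which(diff((da$y - db$y) > 0) != 0) + 1
  da$x[idx]
}

crossings <- density_crossings(df$value, df$genotype, "A", "B")

# Assumption: keep only crossings located between the two group means
mu <- tapply(df$value, df$genotype, mean)
cutoff <- crossings[crossings > min(mu) & crossings < max(mu)]
```

With the plot from the question already drawn, abline(v = cutoff, lty = 2) would mark the result. The value is accurate only to the grid spacing, so increasing n gives a finer estimate, and the filter can still return more than one point if the estimated densities are bumpy.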