I have a question about the following data frame:
genes <- matrix(c("chr1","chr2","chr2","chr2","chr2","chr2",
"uc001upw.2","uc001upw.2","uc001upw.2","uc001upx.1","uc001upy.1","uc001upz.1",
"188001308","188001308","188001308","188037202","188037202","188037202",
"188021266","188021266","188021266","188086618","188127464","188127464",
"-","-","-","-","-","-",
"CALCRL","CALCRL","CALCRL","TFPI","TFPI","TFPI",
"uc001upx.1","uc001upy.1","uc001upz.1","uc001upw.2","uc001upw.2","uc001upw.2",
"188037202","188037202","188037202","188001308","188001308","188001308",
"188086618","188127464","188127464","188021266","188021266","188021266",
"-","-","-","-","-","-",
"TFPI","TFPI","TFPI","CALCRL","CALCRL","CALCRL",
"35894","35894","35894","35894","35894","35894"), nrow=6)
colnames(genes)<- c("chr","names.x","start.x","stop.x","strand.x","alias.x","name.y","start.y","stop.y","strand.y", "alias.y", "distance_startsite")
genes<-as.data.frame(genes)
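In the data frame you can see that the first three rows are unique with respect to names.x and name.y. Rows 4, 5 and 6 are not unique: they contain the same pairs as rows 1 to 3, only with the two names swapped. My question is: is there a way to filter these rows out?
Thanks! Samantha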
Answer 0 (score: 1)
I'm sure this isn't the prettiest way, but it gets the job done:
genes[!duplicated(t(apply(genes[,c('names.x','name.y')],1,sort))),]
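The one-liner works by sorting the two names within each row, so that a pair and its mirror image become identical, and then keeping only the first occurrence of each sorted pair. As a rough sketch of the same idea without apply(), the snippet below uses pmin() and pmax() on a small made-up data frame. The object names pairs, first, second and deduped, and the values A, B and C, are illustrative only and not from the question. It assumes the two columns are character vectors; if they are factors, as they can be after as.data.frame() in older R versions, wrap them in as.character() first.

```r
# Toy data using the same two column names as the question (values are made up)
pairs <- data.frame(
  names.x = c("A", "A", "B", "C"),
  name.y  = c("B", "C", "A", "A"),
  stringsAsFactors = FALSE
)

# Put each pair in alphabetical order so (A, B) and (B, A) share one key
first  <- pmin(pairs$names.x, pairs$name.y)
second <- pmax(pairs$names.x, pairs$name.y)

# Keep only the first occurrence of each unordered pair
deduped <- pairs[!duplicated(data.frame(first, second)), ]
print(deduped)
```

Here rows 3 and 4 are dropped because they repeat rows 1 and 2 with the names swapped. Applied to the question's data, the same approach would use genes$names.x and genes$name.y in place of the toy columns.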