Bioinformatics filtering/selection based on multiple conditions

Time: 2020-06-23 18:52:18

Tags: sql r bioinformatics sqldf

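Hi, I'm trying to select the distinct OTUs that make up certain KEGG pathways, and I'd like to know why the following approach doesn't work, or what you would recommend instead. I have tried dplyr, and I have tried the operators =, != and <>, without success. Any suggestions?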

Group1<-sqldf("SELECT DISTINCT OTU FROM 'retro.flux.avg.OTU'
WHERE Pathway IN ('ko00362','ko00625','ko00361','ko00623','ko00622','ko00633','ko00642','ko00626','ko00624')")
          AND Pathway IN ('ko02030')")

Group2<-sqldf("SELECT DISTINCT OTU FROM 'retro.flux.avg.OTU'
WHERE Pathway IN ('ko00362','ko00625','ko00361','ko00623','ko00622','ko00633','ko00642','ko00626','ko00624')")
          AND Pathway NOT IN ('ko02030')")

Group3<-sqldf("SELECT DISTINCT OTU FROM 'retro.flux.avg.OTU'
WHERE Pathway NOT IN ('ko00362','ko00625','ko00361','ko00623','ko00622','ko00633','ko00642','ko00626','ko00624')")
          AND Pathway IN ('ko02030')")


Group4<-sqldf("SELECT DISTINCT OTU FROM 'retro.flux.avg.OTU'
WHERE Pathway NOT IN ('ko02030','ko00362','ko00625','ko00361','ko00623','ko00622','ko00633','ko00642','ko00626','ko00624')")

1 answer:

Answer 0: (score: 1)

Here is what I think.

Instead of

Group1<-sqldf("SELECT DISTINCT OTU FROM 'retro.flux.avg.OTU'
WHERE Pathway IN ('ko00362','ko00625','ko00361','ko00623','ko00622','ko00633','ko00642','ko00626','ko00624')") 
AND Pathway IN ('ko02030')")

just use the following.

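I removed the extra "), which closed the SQL string and the sqldf() call before the AND clause. I also found one more error in the query and fixed it.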

Group1<-sqldf("SELECT DISTINCT OTU FROM 'retro.flux.avg.OTU'
WHERE Pathway IN ('ko00362','ko00625','ko00361','ko00623','ko00622','ko00633','ko00642','ko00626','ko00624')")
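One more thing to consider, assuming the goal is to classify each OTU by its whole set of pathways and that the table has one row per OTU/Pathway pair: a WHERE clause is evaluated one row at a time, and a single row holds a single Pathway value, so a condition like Pathway IN (...) AND Pathway IN ('ko02030') can never be true when 'ko02030' is not in the first list. A minimal sketch of one way around this is to aggregate per OTU first and then split into groups. The data frame otu_paths and the column names in_set and has_ko02030 are made up for illustration; they stand in for retro.flux.avg.OTU and are not from the original post.

```r
library(sqldf)

# Hypothetical toy data standing in for retro.flux.avg.OTU
# (assumed layout: one row per OTU/Pathway pair).
otu_paths <- data.frame(
  OTU     = c("otu1", "otu1", "otu2", "otu3", "otu4"),
  Pathway = c("ko00362", "ko02030", "ko00625", "ko02030", "ko00010"),
  stringsAsFactors = FALSE
)

# One row per OTU with two 0/1 flags:
#   in_set      = 1 if the OTU has at least one pathway from the list
#   has_ko02030 = 1 if the OTU has pathway ko02030
flags <- sqldf("
  SELECT OTU,
         MAX(CASE WHEN Pathway IN ('ko00362','ko00625','ko00361','ko00623',
                                   'ko00622','ko00633','ko00642','ko00626',
                                   'ko00624')
                  THEN 1 ELSE 0 END) AS in_set,
         MAX(CASE WHEN Pathway = 'ko02030' THEN 1 ELSE 0 END) AS has_ko02030
  FROM otu_paths
  GROUP BY OTU
")

# Split the OTUs into the four groups from the question.
group1 <- flags$OTU[flags$in_set == 1 & flags$has_ko02030 == 1]  # both
group2 <- flags$OTU[flags$in_set == 1 & flags$has_ko02030 == 0]  # list only
group3 <- flags$OTU[flags$in_set == 0 & flags$has_ko02030 == 1]  # ko02030 only
group4 <- flags$OTU[flags$in_set == 0 & flags$has_ko02030 == 0]  # neither
```

With the toy data, each of the four OTUs lands in a different group. If the real table is laid out differently, for example one row per OTU with pathways in separate columns, this sketch would need to be adapted.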