How to filter a data.frame by rows with more than n entries in a column

Time: 2015-02-11 12:54:28

Tags: r

How can I remove rows that have more than n genes from a dataset?

data1 <-
Re_leve  logp  chr  start   end     CNA   Genes
1        1.5   1    739400  756200  gain  Trp1,Eggier
1        8.3   1    127730  128210  gain  Zranb3,R3hdm1,.....

2 answers:

Answer 0: (score: 2)

You can try

library(stringr)
n <- 1
# number of genes = number of commas + 1; keep rows with at most n genes
df1[str_count(df1$Genes, ',') + 1 <= n, ]
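
The same comma count can also be done in base R, without stringr. The following is only a minimal sketch: it assumes Genes is a comma-separated character column, and the sample data frame and the value of n are made up for illustration.

```r
# Hypothetical sample data (not the asker's real dataset)
df1 <- data.frame(id = 1:3,
                  Genes = c("Trp1,Eggier", "Zranb3", "a,b,c"),
                  stringsAsFactors = FALSE)
n <- 2

# Count the commas in each row with base R
n_commas <- lengths(regmatches(df1$Genes,
                               gregexpr(",", df1$Genes, fixed = TRUE)))

# Number of genes = number of commas + 1
n_genes <- n_commas + 1

# Keep only the rows with at most n genes
kept <- df1[n_genes <= n, ]
kept
```

Note that an empty string would be counted as one gene here, so empty entries may need to be handled separately.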

Answer 1: (score: 2)

Try this:

#dummy data
data1 <- data.frame(x=1:3,
                    Gene=c("asdf,asdf,ee,d","asdf","dfd,sdf"),
                    stringsAsFactors = FALSE)

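#threshold: rows with more than n genes are selected
n <- 1

#subset: keep rows with more than n genes
#(use <= n instead of > n to drop those rows)
data1[sapply(data1$Gene,function(i)length(unlist(strsplit(i,",")))) > n, ]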

#   x           Gene
# 1 1 asdf,asdf,ee,d
# 3 3        dfd,sdf
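
A vectorised variant of the same idea is sketched below, using lengths() (available in R 3.2.0 and later) instead of sapply(). It reuses the dummy data from this answer and shows both directions of the filter, since the question asks to remove rows with more than n genes. The variable names gene_count, more_than_n and at_most_n are chosen here for illustration only.

```r
# Dummy data, same as in the answer above
data1 <- data.frame(x = 1:3,
                    Gene = c("asdf,asdf,ee,d", "asdf", "dfd,sdf"),
                    stringsAsFactors = FALSE)
n <- 1

# Split each entry on commas and count the pieces per row
gene_count <- lengths(strsplit(data1$Gene, ",", fixed = TRUE))

# Rows with more than n genes (what the subset above returns)
more_than_n <- data1[gene_count > n, ]

# Rows with at most n genes (i.e. rows with more than n genes removed)
at_most_n <- data1[gene_count <= n, ]
```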