Filtering rows by a space-separated field in the row names

Time: 2016-03-28 08:22:44

Tags: r

I want to keep only the rows whose row name has a fourth space-separated field ($V4) starting with u-

> head(miraligner_filterCRC_filterMM)
                                                       100G 100R 106G 106R 122G 122R 124G 124R 126G 126R 134G 134R
hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC          23   17   11   21   29   14   16   20   11    1   37   13
hsa-miR-887-3p GTGAACGGGCGCCATCCCGAGGCTT 0 0 0 d-CTT      3    8    0    4    4    3    2   12   12    3    4    8
hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0       2    1    1    0    0    0    0    1    2    1    8    2
hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C    2    1    0    0    0    0    0    1    0    2    2    2
hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTA 0 u-TA 0 d-C    6    6    5    0    3    3    1    4    1    1    8    5
hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTT 0 u-TT 0 d-C   22   41   12   26   25   51    2   25    2   24   91   51

My attempt:

table(read.table(text=rownames(miraligner_filterCRC_filterMM))$V4=="u-")

2 answers:

Answer 0: (score: 2)

Data

x <- c("hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC       ", 
       "hsa-miR-887-3p GTGAACGGGCGCCATCCCGAGGCTT 0 0 0 d-CTT  ", 
       "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0   ", 
       "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C", 
       "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTA 0 u-TA 0 d-C", 
       "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTT 0 u-TT 0 d-C")

Match "u-" after three groups of non-space characters, each followed by a single space:

grep("^(?:[^ ]+ ){3}u-",x,value=TRUE)

# [1] "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0   "
# [2] "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C"
# [3] "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTA 0 u-TA 0 d-C"
# [4] "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTT 0 u-TT 0 d-C"
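
To subset the data frame itself rather than a character vector, the same pattern could be passed to grepl against the row names. This is a minimal sketch, not a definitive implementation: `df`, `keep` and `filtered` are hypothetical names, the toy data frame reuses four row names and the 100G/100R counts from the question, and `perl = TRUE` is added here only as a cautious choice so the `(?:...)` group is handled by PCRE.

```r
# Toy data: row names from the question (trailing padding removed),
# with the 100G and 100R count columns from the question's output.
x <- c("hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC",
       "hsa-miR-887-3p GTGAACGGGCGCCATCCCGAGGCTT 0 0 0 d-CTT",
       "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0",
       "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C")
df <- data.frame(`100G` = c(23, 3, 2, 2), `100R` = c(17, 8, 1, 1),
                 row.names = x, check.names = FALSE)

# Logical index: TRUE where the fourth space-separated field starts with "u-"
keep <- grepl("^(?:[^ ]+ ){3}u-", rownames(df), perl = TRUE)

# Keep only the matching rows; all count columns are preserved
filtered <- df[keep, ]
print(filtered)
```

With the real data, `rownames(df)` would be `rownames(miraligner_filterCRC_filterMM)`.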

Answer 1: (score: 1)

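# The fields live in the row names, not in a column, so parse them first
parts <- read.table(text = rownames(miraligner_filterCRC_filterMM), stringsAsFactors = FALSE)
# Take the piece before the first "-" of each V4 value ("u", "d", or "0")
part1 <- sapply(strsplit(as.character(parts$V4), "-"), `[`, 1)
# Keep the rows whose V4 starts with "u-"
miraligner_filterCRC_filterMM[part1 == "u", ]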
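
The fourth field can also be extracted without read.table. This is a minimal sketch, assuming the row names are separated by exactly one space as in the question. The names `rn`, `fields`, `v4` and `is_u` are illustrative only, and startsWith requires R 3.3.0 or later.

```r
# Toy row names in the question's format (six space-separated fields).
rn <- c("hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC",
        "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0",
        "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C")

# Split each name on single spaces and take the fourth field.
fields <- strsplit(rn, " ", fixed = TRUE)
v4 <- vapply(fields, function(f) f[4], character(1))

# TRUE where the fourth field begins with the literal prefix "u-".
is_u <- startsWith(v4, "u-")
print(v4)
print(is_u)
```

When `rn` is `rownames(miraligner_filterCRC_filterMM)`, `is_u` can index the rows directly, as in `miraligner_filterCRC_filterMM[is_u, ]`.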