选择嵌套列表元素的子集,其中列中的值为非NA

时间:2014-09-04 14:27:40

标签: r list

您好我有以下数据结构

> class(iso.table.2)
[1] "list"
> class(iso.table.2[1])
[1] "list"

即。嵌套列表。

我可以使用以下方法选择嵌套列表元素:

> head(iso.table.2[[2]][1:2])
    V4  V11
1 <NA> <NA>
2 <NA> <NA>
3 <NA> <NA>
4 <NA> <NA>
5 <NA> <NA>
6 <NA> <NA>

在每个列表元素中,每个表都有一些行,其中每个单元格都是NA。我想看看

  

iso.table.2 [[I]] [2]

在每个嵌套列表中,并且只选择此列中的值为非NA的子集。如果我有:

                    V4              V11
12                <NA>             <NA>
13                mrna             <NA>
14      XP_005268469.1             <NA>
15                <NA>             <NA>
16 dbSNP rs#cluster id         Function
17         rs369314432       synonymous
18                     contig reference

我只想检索

                    V4              V11
16 dbSNP rs#cluster id         Function
17         rs369314432       synonymous
18                     contig reference

我尝试了以下内容:

iso.table.2[!is.na(iso.table.2)]

因为我习惯于使用数据帧。然而,这只是返回整个嵌套的列表元素,显然忽略了`!is.na。

我能做的是使用lapply(iso.table.2,'is.na')测试NA值。但这只是一个考验。

任何人都可以提供帮助。

这里的参考是我的dput()

> dput(iso.table.2[[1]])
structure(list(V4 = structure(c(NA, NA, NA, NA, NA, NA, 5L, 5L, 
5L, 5L, 5L, NA, 3L, 4L, NA, 2L, 18L, 1L, 36L, 1L, 13L, 1L, 29L, 
1L, 71L, 1L, 45L, 1L, 32L, 1L, 63L, 1L, 27L, 1L, 12L, 1L, 46L, 
1L, 72L, 1L, 9L, 1L, 55L, 1L, 19L, 1L, 21L, 1L, 68L, 1L, 11L, 
1L, 74L, 1L, 20L, 1L, 77L, 1L, 33L, 1L, 22L, 1L, 76L, 1L, 38L, 
1L, 28L, 1L, 69L, 1L, 25L, 1L, 47L, 1L, 58L, 1L, 30L, 1L, 1L, 
10L, 1L, 42L, 1L, 49L, 1L, 6L, 1L, 75L, 1L, 64L, 1L, 44L, 1L, 
62L, 1L, 17L, 1L, 70L, 1L, 51L, 1L, 56L, 1L, 8L, 1L, 41L, 1L, 
53L, 1L, 60L, 1L, 16L, 1L, 54L, 1L, 7L, 1L, 59L, 1L, 43L, 1L, 
61L, 1L, 48L, 1L, 34L, 1L, 65L, 1L, 31L, 1L, 23L, 1L, 15L, 1L, 
40L, 1L, 26L, 1L, 24L, 1L, 67L, 1L, 37L, 1L, 35L, 1L, 50L, 1L, 
66L, 1L, 14L, 1L, 52L, 1L, 73L, 1L, 39L, 1L, 57L, 1L), .Label = c("", 
"dbSNP rs#cluster id", "mrna", "NP_001139512.1", "reverse", "rs112560122", 
"rs112970419", "rs114046477", "rs138173310", "rs139058916", "rs139213838", 
"rs140420721", "rs141039714", "rs141704405", "rs142888296", "rs143377552", 
"rs146034962", "rs146481873", "rs146911085", "rs147156518", "rs147471585", 
"rs148221542", "rs148952106", "rs149023945", "rs150716717", "rs151029950", 
"rs151276682", "rs17112272", "rs182383995", "rs192326771", "rs199547699", 
"rs199561280", "rs199618583", "rs199639315", "rs199910297", "rs200130685", 
"rs200171902", "rs200218897", "rs200604203", "rs200606738", "rs201280023", 
"rs201471742", "rs202233543", "rs2229962", "rs367564392", "rs368049933", 
"rs369314432", "rs369500388", "rs369800907", "rs369873008", "rs370809444", 
"rs370920306", "rs371331129", "rs371496582", "rs372342365", "rs372452903", 
"rs373114110", "rs373120024", "rs374319634", "rs374834686", "rs374949730", 
"rs374982085", "rs375152105", "rs375819582", "rs375840161", "rs376426309", 
"rs376610236", "rs377129430", "rs377190917", "rs377524694", "rs62636580", 
"rs62636581", "rs74542605", "rs75463357", "rs76872663", "rs77451630", 
"rs78363193"), class = "factor"), V11 = structure(c(NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L, 1L, 5L, 1L, 3L, 1L, 3L, 1L, 4L, 1L, 3L, 1L, 3L, 
1L, 5L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 5L, 1L, 3L, 
1L, 5L, 1L, 3L, 1L, 5L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 5L, 
1L, 5L, 1L, 5L, 1L, 5L, 1L, 3L, 1L, 5L, 5L, 1L, 3L, 1L, 3L, 1L, 
3L, 1L, 5L, 1L, 5L, 1L, 3L, 1L, 3L, 1L, 5L, 1L, 5L, 1L, 5L, 1L, 
5L, 1L, 3L, 1L, 5L, 1L, 5L, 1L, 3L, 1L, 3L, 1L, 5L, 1L, 5L, 1L, 
5L, 1L, 5L, 1L, 3L, 1L, 3L, 1L, 5L, 1L, 3L, 1L, 5L, 1L, 3L, 1L, 
5L, 1L, 3L, 1L, 5L, 1L, 3L, 1L, 5L, 1L, 5L, 1L, 3L, 1L, 3L, 1L, 
3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 5L, 1L, 3L, 1L), .Label = c("contig reference", 
"Function", "missense", "nonsense", "synonymous"), class = "factor"), 
    V12 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, 3L, 1L, 2L, 1L, 4L, 5L, 4L, 5L, 2L, 5L, 
    2L, 1L, 5L, 1L, 4L, 5L, 2L, 5L, 1L, 4L, 2L, 2L, 5L, 5L, 2L, 
    2L, 1L, 1L, 2L, 4L, 5L, 1L, 4L, 4L, 1L, 1L, 4L, 1L, 4L, 2L, 
    4L, 4L, 1L, 1L, 4L, 2L, 5L, 4L, 1L, 2L, 4L, 5L, 2L, 2L, 5L, 
    1L, 4L, 5L, 1L, 5L, 4L, 1L, 5L, 2L, 5L, 2L, 2L, 5L, 4L, 1L, 
    1L, 4L, 2L, 4L, 1L, 4L, 1L, 4L, 4L, 1L, 4L, 1L, 1L, 4L, 5L, 
    2L, 5L, 2L, 5L, 2L, 5L, 2L, 2L, 5L, 5L, 2L, 1L, 4L, 2L, 5L, 
    5L, 2L, 1L, 4L, 2L, 5L, 4L, 1L, 5L, 2L, 1L, 2L, 5L, 2L, 5L, 
    2L, 2L, 5L, 1L, 4L, 5L, 2L, 1L, 4L, 2L, 4L, 5L, 2L, 5L, 2L, 
    1L, 4L, 1L, 2L, 2L, 5L, 2L, 4L, 4L, 2L, 2L, 4L, 1L, 5L, 1L, 
    5L), .Label = c("A", "C", "dbSNPallele", "G", "T"), class = "factor"), 
    V13 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, 17L, 4L, 10L, 10L, 3L, 11L, 14L, 6L, 
    3L, 15L, 15L, 11L, 15L, 7L, 3L, 1L, 3L, 12L, 14L, 6L, 18L, 
    16L, 16L, 12L, 16L, 19L, 4L, 5L, 2L, 2L, 18L, 10L, 3L, 9L, 
    9L, 7L, 3L, 3L, 3L, 19L, 3L, 3L, 3L, 7L, 3L, 16L, 12L, 9L, 
    8L, 7L, 8L, 2L, 2L, 2L, 2L, 22L, 22L, 2L, 2L, 22L, 9L, 18L, 
    18L, 18L, 6L, 3L, 12L, 15L, 22L, 11L, 12L, 12L, 3L, 3L, 7L, 
    3L, 5L, 9L, 19L, 19L, 8L, 8L, 12L, 12L, 11L, 11L, 11L, 19L, 
    16L, 16L, 12L, 12L, 19L, 11L, 20L, 3L, 12L, 12L, 12L, 12L, 
    4L, 4L, 12L, 12L, 16L, 12L, 5L, 4L, 16L, 16L, 8L, 5L, 4L, 
    4L, 20L, 3L, 21L, 21L, 14L, 22L, 4L, 4L, 13L, 3L, 18L, 18L, 
    18L, 18L, 15L, 18L, 10L, 3L, 18L, 3L, 18L, 15L, 5L, 8L, 22L, 
    12L, 16L, 3L, 19L, 19L, 13L, 4L), .Label = c("", "Ala [A]", 
    "Arg [R]", "Asn [N]", "Asp [D]", "Cys [C]", "Gln [Q]", "Glu [E]", 
    "Gly [G]", "His [H]", "Ile [I]", "Leu [L]", "Lys [K]", "Met [M]", 
    "Phe [F]", "Pro [P]", "Proteinresidue", "Ser [S]", "Thr [T]", 
    "Trp [W]", "Tyr [Y]", "Val [V]"), class = "factor"), V14 = structure(c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L, 
    1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 
    1L, 1L, 1L, 2L, 2L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 
    2L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 2L, 
    2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 
    3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 
    2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 1L, 
    1L, 2L, 2L, 1L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 
    1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 1L, 1L, 3L, 
    3L, 2L, 2L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 
    3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("1", 
    "2", "3", "Codonpos"), class = "factor"), V15 = structure(c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 63L, 
    50L, 50L, 49L, 49L, 48L, 48L, 47L, 47L, 46L, 46L, 46L, 46L, 
    45L, 45L, 45L, 45L, 44L, 44L, 43L, 43L, 42L, 42L, 41L, 41L, 
    40L, 40L, 38L, 38L, 37L, 37L, 36L, 36L, 35L, 35L, 34L, 34L, 
    33L, 33L, 33L, 33L, 32L, 32L, 32L, 32L, 31L, 31L, 30L, 30L, 
    30L, 30L, 29L, 29L, 28L, 28L, 26L, 26L, 25L, 25L, 23L, 23L, 
    22L, 22L, 22L, 21L, 21L, 20L, 20L, 19L, 19L, 18L, 18L, 17L, 
    17L, 17L, 17L, 16L, 16L, 15L, 15L, 14L, 14L, 13L, 13L, 12L, 
    12L, 11L, 11L, 10L, 10L, 8L, 8L, 7L, 7L, 6L, 6L, 5L, 5L, 
    5L, 5L, 4L, 4L, 2L, 2L, 2L, 2L, 1L, 1L, 62L, 62L, 61L, 61L, 
    60L, 60L, 59L, 59L, 57L, 57L, 55L, 55L, 54L, 54L, 51L, 51L, 
    39L, 39L, 27L, 27L, 27L, 27L, 24L, 24L, 24L, 24L, 9L, 9L, 
    3L, 3L, 58L, 58L, 56L, 56L, 53L, 53L, 52L, 52L), .Label = c("104", 
    "113", "13", "130", "145", "150", "158", "164", "17", "174", 
    "179", "195", "212", "219", "232", "233", "241", "244", "262", 
    "270", "280", "296", "297", "30", "300", "305", "31", "330", 
    "331", "341", "343", "344", "347", "349", "366", "369", "373", 
    "374", "39", "397", "402", "403", "406", "412", "413", "416", 
    "428", "440", "450", "455", "48", "5", "6", "66", "67", "8", 
    "86", "9", "93", "97", "98", "99", "Amino acidpos"), class = "factor"), 
    V16 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L), .Label = c("", "PubMed"), class = "factor")), .Names = c("V4", 
"V11", "V12", "V13", "V14", "V15", "V16"), row.names = c(NA, 
-161L), class = "data.frame")

1 个答案:

答案 0 :(得分:2)

所以,如果你有样本数据

iso.table.2 <- list(
    data.frame(a=1:5, b=c(7,NA,9,6,8)),
    data.frame(a=6:10, b=c(NA,3,2,NA,1))
)

并且您希望每个数据帧的第二列中的所有值都不是NA

lapply(iso.table.2, function(x) Filter(Negate(is.na), x[[2]]))

获取

[[1]]
[1] 7 9 6 8

[[2]]
[1] 3 2 1