If you have a list of vectors, what is a good way to determine which list elements contain a particular value?
For example, take the following list, aList:
set.seed(8675309)
aList <- list(v1=sample(LETTERS, 20),
v2=sample(LETTERS, 10))
The output of aList looks like this; for a given letter, I want to find which elements contain it:
> aList
$v1
[1] "E" "L" "S" "R" "F" "O" "T" "Q" "P" "H" "N" "I" "X" "D" "U" "K" "W" "B" "G" "V"
$v2
[1] "B" "V" "U" "H" "M" "O" "F" "Z" "C" "N"
Answer 0 (score: 3)
names(aList)[sapply(1:2,function(x){"B" %in% aList[[x]]})]
[1] "v1" "v2"
names(aList)[sapply(1:2,function(x){"E" %in% aList[[x]]})]
[1] "v1"
names(aList)[sapply(1:2,function(x){"C" %in% aList[[x]]})]
[1] "v2"
If your list contains an unknown number of elements, use seq_along instead of a hard-coded index such as 1:2:
names(aList)[sapply(seq_along(aList),function(x){"B" %in% aList[[x]]})]
[1] "v1" "v2"
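The same lookup can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the function name which_contain is hypothetical, and aList is re-created literally from the printed output in the question so that the result does not depend on the random seed.

```r
# aList re-created from the printed output in the question (no RNG involved)
aList <- list(
  v1 = c("E", "L", "S", "R", "F", "O", "T", "Q", "P", "H",
         "N", "I", "X", "D", "U", "K", "W", "B", "G", "V"),
  v2 = c("B", "V", "U", "H", "M", "O", "F", "Z", "C", "N")
)

# Hypothetical helper: names of the list elements that contain `value`
which_contain <- function(lst, value) {
  # one TRUE/FALSE per list element; vapply enforces a logical result
  hits <- vapply(lst, function(v) value %in% v, logical(1))
  names(lst)[hits]
}

which_contain(aList, "B")  # "v1" "v2"
which_contain(aList, "E")  # "v1"
which_contain(aList, "C")  # "v2"
```

Iterating over the list itself means no index vector is needed, and vapply returns a logical vector even when the list is empty.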
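Here are microbenchmarks prompted by the comments. The first compares ways of generating the index; the second compares two functions, etienne() and roland(), whose definitions are not shown here.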
microbenchmark(seq_along(aList),seq_along(names(aList)),1:length(aList),times=100000)
Unit: nanoseconds
expr min lq mean median uq max neval cld
seq_along(aList) 350 700 659.9117 701 701 208228 1e+05 a
seq_along(names(aList)) 351 701 857.1508 701 1051 216977 1e+05 b
1:length(aList) 700 701 935.7251 1050 1051 424855 1e+05 c
microbenchmark(etienne(),roland())
Unit: microseconds
expr min lq mean median uq max neval cld
etienne() 40.597 41.297 45.24751 41.646 41.997 211.378 100 b
roland() 12.600 13.300 14.40882 14.699 15.049 20.998 100 a
Answer 1 (score: 1)
Instead of doing this one value at a time, we can use outer and %in% to get a logical matrix ('m1'), split it by row, and get the corresponding names of 'aList'.
v1 <- c('B', 'E', 'C')
m1 <- outer(v1, aList, FUN= Vectorize(`%in%`))
lapply(split(m1, row(m1)), function(x) names(aList)[x])
# $`1`
#[1] "v1" "v2"
#$`2`
#[1] "v1"
#$`3`
#[1] "v2"
Or we can melt 'm1' and split the columns of the resulting 'long' format.
library(reshape2)
with(melt(m1), split(as.character(Var2[value]), Var1[value]))
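For comparison, a similar long-format lookup can be sketched in base R without reshape2. This is an alternative formulation, not the answer's method: it uses stack() to turn the list into a two-column data frame (values, ind) and split() to build an index from each letter to the names of the elements containing it. aList and v1 are re-created literally here, and the name idx is hypothetical.

```r
# aList re-created from the printed output in the question (no RNG involved)
aList <- list(
  v1 = c("E", "L", "S", "R", "F", "O", "T", "Q", "P", "H",
         "N", "I", "X", "D", "U", "K", "W", "B", "G", "V"),
  v2 = c("B", "V", "U", "H", "M", "O", "F", "Z", "C", "N")
)
v1 <- c("B", "E", "C")

# stack() gives one row per (letter, list element name) pair;
# split() then groups the element names by letter
idx <- with(stack(aList), split(as.character(ind), values))

# Look up only the letters of interest
idx[v1]
```

Because idx covers every letter that occurs in aList, it only needs to be built once when many lookups are required. A letter that is repeated within one vector would make that element's name appear more than once in the result.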