Given a list, how do I access the last value of each element when the elements have different lengths?

Date: 2018-07-15 22:13:05

Tags: r list

I have a list of character vectors, and I want to access the last value of each element.

mylist<-list(A=c("a"),
             B=c("a","b"),
             C=c("a","b","c"),
             D=c("a","b","c","d"))

At first (after looking at some related Python threads), I thought I could do something like this:

for(i in 1:length(mylist)){
   print(mylist[[i]][-1])
}
# character(0)
# [1] "b"
# [1] "b" "c"
# [1] "b" "c" "d"
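A note on why the loop above prints those values: in R a negative index removes the element at that position instead of counting from the end as in Python, so [-1] returns everything except the first value. The short sketch below contrasts the two behaviours; the vector x and the result names are made up for illustration.

```r
x <- c("a", "b", "c", "d")

# A negative index drops the element at that position.
drop_first <- x[-1]              # "b" "c" "d"

# Indexing with the vector's length gives the last element.
last_by_length <- x[length(x)]   # "d"

# tail() with n = 1 also returns the last element.
last_by_tail <- tail(x, 1)       # "d"

# Dropping the only element of a length-one vector leaves nothing.
empty <- c("a")[-1]              # character(0)
```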

I guess that does not work. Basically, what I want is the result of:

myfunction <- function(mylist) {
  output <- character(0)
  for (i in seq_along(mylist)) {
    # append the last value of the i-th element
    output <- c(output, mylist[[i]][length(mylist[[i]])])
  }
  return(output)
}

myfunction(mylist)
# [1] "a" "b" "c" "d"

Is there a more efficient way?

2 answers:

Answer 0: (score: 4)

As Rich Scriven pointed out in a (now deleted) comment, there are many ways to do this. One of them is to use sapply together with tail and the argument n = 1:

sapply(mylist, tail, n = 1)
#  A   B   C   D 
#"a" "b" "c" "d" 

Another option, a safer and potentially faster variant of the same idea, is to use vapply:

vapply(mylist, tail, FUN.VALUE = character(1), n = 1)
# or a little shorter
# vapply(mylist, tail, "", 1)
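To illustrate what "safer" means here, the sketch below uses a hypothetical list (ragged, not part of the question) in which one element is empty. sapply quietly falls back to returning a list, while vapply stops because the result does not match FUN.VALUE.

```r
# Hypothetical input: the second element is empty.
ragged <- list(A = c("a", "b"), B = character(0))

# sapply() cannot simplify results of unequal length, so it returns a list
# instead of a character vector.
res_sapply <- sapply(ragged, tail, n = 1)
is.list(res_sapply)

# vapply() checks every result against FUN.VALUE and raises an error
# as soon as one result is not a single string.
res_vapply <- tryCatch(
  vapply(ragged, tail, FUN.VALUE = character(1), n = 1),
  error = function(e) "error"
)
```

With vapply the problem surfaces at the call itself rather than later, when downstream code expects a character vector.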

(Another) benchmark

library(microbenchmark)
library(ggplot2)  # needed for autoplot()

set.seed(1)
mylist <- replicate(1e5, list(sample(letters, size = runif(1, 1, length(letters)))))

benchmark <- microbenchmark(
  f1 = {myfunction(mylist)},
  f2 = {sapply(mylist, function(l) l[length(l)])},
  f3 = {vapply(mylist, function(l) l[length(l)], "")},
  f4 = {sapply(mylist, tail, 1)},
  f5 = {vapply(mylist, tail, "", 1)},
  f6 = {mapply("[", mylist, lengths(mylist))},
  f7 = {mapply("[[", mylist, lengths(mylist))}, # added this out of curiosity
  f8 = {unlist(mylist)[cumsum(lengths(mylist))]},
  times = 100L
)

autoplot(benchmark)

Same result here: Rich's unlist(mylist)[cumsum(lengths(mylist))] is by far the fastest. There seems to be no real difference between sapply and vapply. myfunction() is as defined in the OP's question.

#benchmark
#Unit: milliseconds
# expr         min          lq        mean     median          uq        max neval
#   f1 28797.26121 30462.16785 31836.26875 31191.7762 32950.92537 36586.5477   100
#   f2   106.34213   117.75074   127.97763   124.9191   134.82047   176.2058   100
#   f3    99.72042   106.87308   119.59811   113.9663   123.63619   465.5335   100
#   f4  1242.11950  1291.38411  1409.35750  1350.3460  1505.76089  1880.6537   100
#   f5  1189.22615  1274.48390  1366.07234  1333.8885  1418.75394  1942.2803   100
#   f6   112.27316   123.73429   132.39888   129.8220   138.33851   191.2509   100
#   f7   107.27392   118.19201   128.06681   123.1317   133.29827   208.8425   100
#   f8    28.03948    28.84125    31.19637    30.3115    32.94077    40.9624   100
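For readers wondering why the unlist/cumsum approach (f8) returns the last values at all, here is a minimal step-by-step sketch. The names small, flat, ends and with_empty are made up for illustration. It also shows a caveat: the trick assumes every element has at least one value.

```r
small <- list(A = "a", B = c("a", "b"), C = c("a", "b", "c"),
              D = c("a", "b", "c", "d"))

# Step 1: flatten the list into one long vector.
flat <- unlist(small, use.names = FALSE)  # "a" "a" "b" "a" "b" "c" "a" "b" "c" "d"

# Step 2: the running total of the lengths marks where each element ends.
ends <- cumsum(lengths(small))            # 1 3 6 10

# Step 3: index the flat vector at those end positions.
last_values <- flat[ends]                 # "a" "b" "c" "d"

# Caveat: an empty element repeats the previous end position, so the
# previous element's last value is returned in its place.
with_empty <- list(A = c("a", "b"), B = character(0), C = "z")
shifted <- unlist(with_empty, use.names = FALSE)[cumsum(lengths(with_empty))]
# "b" "b" "z"
```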

Answer 1: (score: 3)

Benchmarking the solutions proposed in the comments shows that Rich's proposal using unlist is the fastest.

By inspecting the code of unlist and lengths and adjusting their arguments, we can make it faster still.

The slowness of tail is discussed here: https://stackoverflow.com/a/37238415/2270475

On the OP's sample data:

library(microbenchmark)
microbenchmark(
  r2evans = sapply(mylist, function(l) l[length(l)]),
  markus  = sapply(mylist, tail, 1),
  Rich1   = mapply("[", mylist, lengths(mylist)),
  Rich2   = unlist(mylist)[cumsum(lengths(mylist))],
  markus2 = vapply(mylist, tail, character(1), 1),
  mm      = .Internal(unlist(mylist,FALSE,FALSE))[cumsum(lengths(mylist,FALSE))],
  unit = "relative"
)
# Unit: relative
#     expr       min        lq      mean    median        uq         max neval
#  r2evans 16.083333 12.764706 25.545957 12.368421 13.133333 122.1428571   100
#   markus 82.333333 59.294118 50.937673 60.342105 60.644444  10.2253968   100
#    Rich1 19.583333 15.294118 13.368047 15.394737 15.622222   2.7492063   100
#    Rich2  4.166667  3.705882  3.211045  3.789474  3.911111   0.7650794   100
#  markus2 73.166667 53.176471 44.669822 50.263158 54.155556  10.4857143   100
#       mm  1.000000  1.000000  1.000000  1.000000  1.000000   1.0000000   100

On a list 1000 times longer:

mylist_long <- do.call(c, replicate(1000, mylist, simplify = FALSE))
length(mylist_long) # [1] 4000

microbenchmark(
  r2evans = sapply(mylist_long, function(l) l[length(l)]),
  markus  = sapply(mylist_long, tail, 1),
  Rich1   = mapply("[", mylist_long, lengths(mylist_long)),
  Rich2   = unlist(mylist_long)[cumsum(lengths(mylist_long))],
  markus2 = vapply(mylist_long, tail, character(1), 1),
  mm      = .Internal(unlist(mylist_long,FALSE,FALSE))[cumsum(lengths(mylist_long,FALSE))],
  unit = "relative"
)
# Unit: relative
#     expr       min        lq      mean    median        uq       max neval
#  r2evans  26.14882  27.20436  27.07436  28.13731  28.54701  27.23846   100
#   markus 679.57251 698.84828 668.00160 715.30180 674.71067 443.42502   100
#    Rich1  27.53607  28.80581  29.82736  29.00353  31.02343  38.79978   100
#    Rich2  22.39863  21.79129  20.41467  21.53371  20.70750  13.03032   100
#  markus2 667.97494 702.14882 676.91881 718.41899 696.11934 633.17181   100
#       mm   1.00000   1.00000   1.00000   1.00000   1.00000   1.00000   100
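The mm variant calls .Internal directly, which is not part of R's public interface. As a hedged alternative, the sketch below passes the same two settings (recursive = FALSE and use.names = FALSE) through the documented arguments of unlist and lengths. It was not part of the benchmarks above, so no claim is made about how close it comes to the mm timings. The name last_public is made up for illustration.

```r
# Sample data as defined in the question.
mylist <- list(A = c("a"),
               B = c("a", "b"),
               C = c("a", "b", "c"),
               D = c("a", "b", "c", "d"))

# Same idea as the mm variant, using only documented arguments:
# skip recursion and skip building names in both calls.
last_public <- unlist(mylist, recursive = FALSE, use.names = FALSE)[
  cumsum(lengths(mylist, use.names = FALSE))
]
last_public
# [1] "a" "b" "c" "d"
```

Skipping the names avoids constructing labels such as B1 and B2 that are discarded anyway. The result is an unnamed character vector.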