If that is the case, why do we need sapply at all?
library(microbenchmark)

x <- list(a=1, b=1)
y <- list(a=1)
JSON <- rep(list(x,y),10000)
microbenchmark(sapply(JSON, function(x) x$a),
               unlist(lapply(JSON, function(x) x$a)),
               sapply(JSON, "[[", "a"),
               unlist(lapply(JSON, "[[", "a"))
)
Unit: milliseconds
                                  expr      min       lq   median       uq      max neval
         sapply(JSON, function(x) x$a) 25.22623 28.55634 29.71373 31.76492 88.26514   100
 unlist(lapply(JSON, function(x) x$a)) 17.85278 20.25889 21.61575 22.67390 78.54801   100
               sapply(JSON, "[[", "a") 18.85529 20.06115 21.53790 23.42480 38.56610   100
       unlist(lapply(JSON, "[[", "a")) 11.33859 11.69198 12.25329 13.37008 27.81361   100
Answer 0 (score: 18)
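In addition to running lapply, sapply runs simplify2array to try to fit the output into an array. To decide whether that is possible, the function has to check that all the individual outputs have the same length. It does this with a costly unique(lapply(..., length)), which accounts for most of the time difference you are seeing: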
b <- lapply(JSON, "[[", "a")
microbenchmark(lapply(JSON, "[[", "a"),
               unlist(b),
               unique(lapply(b, length)),
               sapply(JSON, "[[", "a"),
               sapply(JSON, "[[", "a", simplify = FALSE),
               unlist(lapply(JSON, "[[", "a"))
)
# Unit: microseconds
#                                      expr       min        lq   median        uq       max neval
#                   lapply(JSON, "[[", "a") 14809.151 15384.358 15774.26 16905.226 24944.863   100
#                                 unlist(b)   920.047  1043.719  1158.62  1223.091  8056.231   100
#                 unique(lapply(b, length)) 10778.065 11060.452 11456.11 12581.414 19717.740   100
#                   sapply(JSON, "[[", "a") 24827.206 25685.535 26656.88 30519.556 93195.751   100
# sapply(JSON, "[[", "a", simplify = FALSE) 14283.541 14922.780 15526.42 16654.058 26865.022   100
#           unlist(lapply(JSON, "[[", "a")) 15334.026 16133.146 16607.12 18476.182 30080.544   100
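As a rough illustration of the explanation above, the sketch below reproduces by hand what sapply does: run lapply, then hand the result to simplify2array. The small lists same_len and ragged are made-up inputs, not data from the question, and the exact internals of simplify2array may differ between R versions.

```r
# Hypothetical toy inputs, much smaller than the JSON list in the question.
same_len <- list(list(a = 1, b = 1), list(a = 2))
ragged   <- list(1:2, 1:3)

# Step 1: the lapply part, which always returns a list.
res_list <- lapply(same_len, "[[", "a")

# Step 2: the simplification that sapply adds on top.
by_hand    <- simplify2array(res_list, higher = FALSE)
simplified <- sapply(same_len, "[[", "a")
print(identical(by_hand, simplified))

# The length check described above: a single distinct length means
# the results can be collapsed into a vector or matrix.
n_lengths <- unique(lapply(res_list, length))
print(length(n_lengths))

# With outputs of different lengths, simplification is not possible
# and sapply hands back the plain list from lapply.
ragged_out <- sapply(ragged, function(v) v)
print(is.list(ragged_out))
```

When the result is known to be a flat vector, as in the question, unlist(lapply(...)) skips that length check altogether.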
Answer 1 (score: 10)
As droopy and Roland pointed out, sapply is a wrapper around lapply that exists for convenience. The simplify2array function that sapply uses is slower than unlist:
> microbenchmark(unlist(as.list(1:1000)), simplify2array(as.list(1:1000)), times=1000)
Unit: microseconds
                            expr     min       lq  median       uq      max neval
         unlist(as.list(1:1000))  99.734 109.0230 113.912 118.3120 21343.92  1000
 simplify2array(as.list(1:1000)) 892.712 931.0895 947.957 976.3125 22241.52  1000
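The timing comparison only makes sense if both calls produce the same result. A quick check along these lines (the variable names are illustrative) suggests that, for a list of length-one elements, they do:

```r
# Same input as in the benchmark above.
input <- as.list(1:1000)

via_unlist   <- unlist(input)
via_simplify <- simplify2array(input)

# Both collapse the list into a plain integer vector.
print(identical(via_unlist, via_simplify))
print(is.integer(via_unlist))
```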
In addition, when the result is a matrix, sapply is slower than other base functions, for example:
a <- list(c(1,2,3,4), c(1,2,3,4), c(1,2,3,4))
microbenchmark(t(do.call(rbind, lapply(a, function(x)x))), sapply(a, function(x)x))
Unit: microseconds
                                        expr    min     lq median     uq     max neval
 t(do.call(rbind, lapply(a, function(x) x))) 29.823 30.801 32.512 33.734  94.845   100
                    sapply(a, function(x) x) 57.201 58.179 59.156 60.134 111.956   100
That said, sapply is easier to use, especially in the second case.
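For the matrix case, a small check like the following (a sketch using the same list a as above; the other names are illustrative) indicates that the two benchmarked expressions build the same 4 x 3 matrix, so the choice between them comes down to speed versus readability:

```r
a <- list(c(1, 2, 3, 4), c(1, 2, 3, 4), c(1, 2, 3, 4))

# Manual route: stack the vectors as rows, then transpose.
m_manual <- t(do.call(rbind, lapply(a, function(x) x)))

# Convenience route: sapply binds each result as a column.
m_sapply <- sapply(a, function(x) x)

print(dim(m_sapply))
print(isTRUE(all.equal(m_manual, m_sapply)))

# Binding columns directly avoids the transpose and gives the same values.
m_cbind <- do.call(cbind, a)
print(isTRUE(all.equal(m_cbind, m_sapply)))
```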