@Moody_Mudskipper and I were discussing the relative merits of negative versus positive integer indexing. Consider a large matrix with a single row:
m <- matrix(integer(2^20), 1)
Is a negative or a positive index vector the more (time-)efficient way to get everything except the last column?
Our candidates so far:
neg <- function(m) m[,-ncol(m)]
pos <- function(m) m[,seq_len(ncol(m) - 1)]
And for vectors:
v <- integer(2^28)
neg_v <- function(v) v[-length(v)]
pos_v <- function(v) v[seq_len(length(v) - 1)]
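Before timing anything, it may help to confirm that the candidates really return the same thing. The following is a minimal sketch on a tiny input; the values in v are made up for illustration and are not the benchmark data.

```r
# Candidates from the post, redefined so the block runs on its own.
neg   <- function(m) m[, -ncol(m)]
pos   <- function(m) m[, seq_len(ncol(m) - 1)]
neg_v <- function(v) v[-length(v)]
pos_v <- function(v) v[seq_len(length(v) - 1)]

# Small illustrative inputs (hypothetical values).
v <- c(10L, 20L, 30L, 40L)
m <- matrix(v, 1)

# Both styles should drop only the last element / last column.
same_vec <- identical(neg_v(v), pos_v(v))
same_mat <- identical(neg(m), pos(m))
print(c(vector = same_vec, matrix = same_mat))
```

Note that for a one-row matrix both neg() and pos() return a plain vector, because `[` drops the dimension by default.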
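If one method has an advantage, does it carry over to the more general case, where you want everything except some arbitrary index, i?
We were inspired in part by the discussion of the most efficient way to get the last element here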
Answer 0 (score: 3)
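`[` is very versatile, and you pay for that versatility with a small performance cost. I would still encourage you to use it for this task. If the performance of this step is critical to your algorithm, you are using the wrong language.
An alternative for vectors: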
len_v <- function(v) `length<-`(v, length(v) - 1)
And for matrices (note that this does not drop dimensions):
dim_m <- function(m) matrix(`length<-`(m, length(m) - nrow(m)), nrow = nrow(m))
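As a rough check of what these two helpers return, the sketch below compares them with `[` on small made-up inputs. The 2 x 3 matrix and the values in v are assumptions chosen for illustration only.

```r
# Helpers from the answer, redefined so the block runs on its own.
len_v <- function(v) `length<-`(v, length(v) - 1)
dim_m <- function(m) matrix(`length<-`(m, length(m) - nrow(m)), nrow = nrow(m))

v <- c(10L, 20L, 30L, 40L)
m <- matrix(1:6, nrow = 2)   # hypothetical 2 x 3 example

# Truncating the length gives the same values as negative indexing.
vec_ok <- identical(len_v(v), v[-length(v)])
# dim_m() matches `[` only when `[` is told to keep the matrix shape.
mat_ok <- identical(dim_m(m), m[, -ncol(m), drop = FALSE])

# With a one-row matrix, `[` drops to a plain vector while dim_m() keeps a matrix.
m1 <- matrix(v, 1)
dims_dim_m <- dim(dim_m(m1))
dims_neg   <- dim(m1[, -ncol(m1)])
print(list(vec_ok = vec_ok, mat_ok = mat_ok, dim_m = dims_dim_m, neg = dims_neg))
```

The matrix version works because matrices are stored column by column, so cutting nrow(m) elements off the end removes exactly the last column.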
Benchmarks:
[1] 10
Unit: nanoseconds
expr min lq mean median uq max neval cld
neg_v(v) 3841 4098 4747.78 4353 4609 36363 100 a
pos_v(v) 4866 5378 5572.30 5633 5634 7426 100 a
neg(m) 4353 4866 5165.02 5122 5378 6914 100 a
pos(m) 5121 5378 5792.41 5634 5890 15364 100 a
len_v(v) 768 1024 1065.31 1024 1280 2049 100 a
dim_m(m) 2817 3329 24219.64 3585 3841 2063960 100 a
<snip>
[1] 20
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 3125.901 3210.406 5997.0644 4080.9305 4726.878 102998.274 100 b
pos_v(v) 4510.752 4572.721 6308.3921 5097.8025 5980.619 104768.516 100 b
neg(m) 3369.172 3438.952 6286.7153 4657.4825 4976.423 102606.735 100 b
pos(m) 4499.996 4518.306 4895.5423 4534.5670 5059.904 6577.785 100 ab
len_v(v) 562.852 591.020 859.3246 630.3275 680.006 2381.493 100 a
dim_m(m) 1072.184 1120.070 3528.7498 1167.1880 2377.523 99962.255 100 ab
Answer 1 (score: 1)
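Neither solution is uniformly faster; the outcome depends on the size of the input. As the object gets larger, the positive index vector wins. This may be a little surprising: we assumed that negative integer indexing hands things straight to C, while positive integer indexing makes one extra R call to seq_len, but that is a very fast primitive. Note, though, that the difference is not even a factor of 2.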
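In the general case, however, the positive index vector has to be built with two seq calls, and even with the primitives seq_len and seq.int, any efficiency gained at the indexing stage is lost.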
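To make the two-piece index concrete, here is a small sketch of how the positive index vector is assembled for an arbitrary i. The names pos_idx and pos_idx_safe are hypothetical helpers introduced for illustration, and pos_idx_safe is a variant that was not part of the benchmarks; it only shows one way to handle the case where i is the last position.

```r
v <- c(10L, 20L, 30L, 40L, 50L)   # made-up values
n <- length(v)

# Two-piece index: prefix 1..(i - 1) followed by suffix (i + 1)..n.
pos_idx <- function(i, n) c(seq_len(i - 1), seq.int(i + 1, n))

# Hypothetical variant: i + seq_len(n - i) is empty when i == n.
pos_idx_safe <- function(i, n) c(seq_len(i - 1), i + seq_len(n - i))

# For an interior index both styles agree with negative indexing.
mid_ok <- identical(v[pos_idx(3, n)], v[-3])

# When i == n, seq.int(n + 1, n) counts down, so the index runs past the end.
last_bad <- v[pos_idx(n, n)]

# The variant agrees with negative indexing at both ends.
last_ok  <- identical(v[pos_idx_safe(n, n)], v[-n])
first_ok <- identical(v[pos_idx_safe(1, n)], v[-1])
print(list(mid_ok = mid_ok, last_bad = last_bad, last_ok = last_ok, first_ok = first_ok))
```

Either way, the positive approach has to allocate and concatenate two sequences before indexing, which is the extra work the paragraph above refers to.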
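Here are the microbenchmark results for our initial candidates, edited to include @DavidArenburg's suggestion, head(v, -1), for comparison. It applies to the vector problem but not to the matrix problem: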
library(microbenchmark)
# Time each candidate on inputs of length 2^10 up to 2^20.
for (size in 10:20) {
  v <- integer(2^size)
  m <- matrix(v, 1)
  print(size)
  print(microbenchmark(neg_v(v), pos_v(v), head(v, -1), neg(m), pos(m)))
}
[1] 10
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 7.665 7.9435 8.41404 8.1680 8.4995 20.977 100 a
pos_v(v) 7.684 7.9900 8.94586 8.2090 8.4895 57.970 100 ab
head(v, -1) 18.371 19.2280 20.77184 19.7155 20.6470 58.672 100 c
neg(m) 9.013 9.3440 9.99337 9.6675 10.0070 23.603 100 b
pos(m) 8.457 8.9640 9.36145 9.1965 9.5035 16.494 100 ab
[1] 11
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 13.905 14.2010 15.36906 14.3640 14.6355 25.910 100 a
pos_v(v) 14.140 14.4275 15.83083 14.5965 14.8975 55.185 100 ab
head(v, -1) 25.068 25.8605 28.92266 26.6320 27.6395 79.200 100 c
neg(m) 15.769 16.1215 17.52703 16.3175 16.7270 51.691 100 b
pos(m) 15.149 15.5375 16.68138 15.7605 16.0315 49.755 100 ab
[1] 12
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 26.190 27.3690 31.18507 29.4325 31.4510 48.557 100 a
pos_v(v) 25.296 28.4645 35.51150 29.3030 32.1255 421.324 100 a
head(v, -1) 37.928 39.8455 48.19410 40.9510 46.2560 375.465 100 b
neg(m) 29.437 31.3505 35.00288 32.6420 35.0445 69.953 100 a
pos(m) 26.738 28.9945 37.35408 29.4220 34.1825 462.919 100 ab
[1] 13
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 46.506 51.6525 66.96329 56.8475 58.5865 442.329 100 a
pos_v(v) 48.263 56.0625 70.38045 56.4500 58.1370 498.686 100 a
head(v, -1) 58.836 66.9540 75.79951 68.0400 71.4055 179.082 100 a
neg(m) 51.101 57.8895 153.67522 61.9160 63.4910 8280.099 100 a
pos(m) 49.914 55.1205 65.61441 55.7930 60.3385 395.184 100 a
[1] 14
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 91.234 111.0975 201.0988 112.6285 114.5950 8577.642 100 a
pos_v(v) 95.394 110.4745 213.2071 111.5600 113.9150 7812.403 100 a
head(v, -1) 107.164 122.1795 146.4730 123.7635 127.4225 567.144 100 a
neg(m) 101.615 120.1835 141.3496 121.6715 124.0385 564.711 100 a
pos(m) 98.133 106.4320 115.3708 108.5990 109.5940 563.723 100 a
[1] 15
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 183.586 572.8205 752.0629 590.8605 668.0415 14919.153 100 a
pos_v(v) 207.188 466.6390 605.7880 503.4750 541.7855 8832.833 100 a
head(v, -1) 210.091 512.0090 540.2439 537.2850 563.8335 766.843 100 a
neg(m) 219.635 594.0065 689.7076 610.0625 632.1530 7904.021 100 a
pos(m) 214.892 400.7375 523.8320 412.9320 485.0875 8831.092 100 a
[1] 16
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 363.163 1024.6745 1326.5858 1046.4375 1120.8760 10668.002 100 b
pos_v(v) 380.258 884.8680 981.2985 897.1980 960.6395 6879.475 100 a
head(v, -1) 388.339 916.7945 970.3009 937.1235 966.2430 1707.400 100 a
neg(m) 394.158 1063.4785 1207.5574 1081.2935 1279.0390 7388.591 100 ab
pos(m) 386.093 724.6455 920.7682 733.4520 829.8680 7127.476 100 a
[1] 17
Unit: microseconds
expr min lq mean median uq max neval cld
neg_v(v) 734.334 2011.042 2156.787 2054.732 2209.009 10029.207 100 a
pos_v(v) 761.919 1745.547 1828.906 1772.859 1883.327 7059.740 100 a
head(v, -1) 791.387 1786.015 1940.802 1822.699 1955.429 8123.136 100 a
neg(m) 794.872 1968.399 3905.147 2109.351 2315.058 153287.378 100 a
pos(m) 767.494 1421.462 1516.872 1440.947 1556.770 7150.461 100 a
[1] 18
Unit: milliseconds
expr min lq mean median uq max neval cld
neg_v(v) 1.481940 2.240029 4.140381 4.110589 4.528180 11.454875 100 b
pos_v(v) 1.560234 1.985711 3.588765 3.484492 3.662266 10.422674 100 b
head(v, -1) 1.588862 2.034140 3.484662 3.544228 3.805630 9.757330 100 ab
neg(m) 1.597898 2.120536 3.583514 3.937739 4.329667 9.443408 100 b
pos(m) 1.550572 1.891935 2.832852 2.846605 3.088712 8.821962 100 a
[1] 19
Unit: milliseconds
expr min lq mean median uq max neval cld
neg_v(v) 3.252966 4.080474 6.994025 4.323296 6.660327 160.33926 100 a
pos_v(v) 3.144045 3.923489 5.330351 4.050275 7.129475 12.49244 100 a
head(v, -1) 3.233204 3.989537 5.661008 4.246967 7.526656 12.98025 100 a
neg(m) 3.507776 4.320612 7.669867 4.652584 8.542794 157.08516 100 a
pos(m) 3.114030 3.685220 6.276472 3.819997 4.896968 156.17061 100 a
[1] 20
Unit: milliseconds
expr min lq mean median uq max neval cld
neg_v(v) 8.089944 12.161850 18.88741 13.185043 14.51066 171.9262 100 a
pos_v(v) 7.918274 8.419380 16.07148 11.899635 13.33752 173.2544 100 a
head(v, -1) 7.938086 8.478848 17.49432 12.432816 13.54419 165.1829 100 a
neg(m) 8.645179 11.077800 20.55285 13.502900 14.43028 171.8824 100 a
pos(m) 7.334725 7.565886 14.32296 8.394255 12.04803 166.6472 100 a
The general case:
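neg_mid <- function(m, i) m[,-i]
# Note: no comma here, so m is indexed as a plain vector; for a one-row
# matrix this selects the same elements as m[, idx].
pos_mid <- function(m, i) m[c(seq_len(i-1), seq.int(i + 1, ncol(m)))]
neg_mid_v <- function(v, i) v[-i]
# Assumes i < length(v): seq.int(i + 1, length(v)) counts down when i == length(v).
pos_mid_v <- function(v, i) v[c(seq_len(i - 1), seq.int(i + 1, length(v)))]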
for (size in 10:20) {
  v <- integer(2^size)
  m <- matrix(v, 1)
  i <- sample(2^size, 1)
  print(size)
  print(microbenchmark(neg_mid(m, i), pos_mid(m, i), neg_mid_v(v, i), pos_mid_v(v, i)))
}
[1] 10
Unit: microseconds
expr min lq mean median uq max neval cld
neg_mid(m, i) 8.256 8.6450 23.24169 8.9035 9.2690 1398.011 100 a
pos_mid(m, i) 11.559 11.8745 42.67081 12.3090 12.5665 2985.724 100 a
neg_mid_v(v, i) 7.262 7.6350 25.55930 7.9745 8.3010 1706.966 100 a
pos_mid_v(v, i) 11.162 11.6665 43.02460 11.9085 12.1270 3076.268 100 a
<snip>
[1] 20
Unit: milliseconds
expr min lq mean median uq max neval cld
neg_mid(m, i) 8.399471 9.239935 17.68447 13.61378 14.67457 183.7487 100 a
pos_mid(m, i) 11.217084 12.215356 19.00080 16.47771 17.76067 186.3762 100 a
neg_mid_v(v, i) 7.832412 10.067575 15.94740 13.14343 14.61393 187.9892 100 a
pos_mid_v(v, i) 10.890024 13.340772 20.99761 16.47525 18.06857 195.5897 100 a