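When an S4 object has a slot that holds an environment, assigning a value inside that environment directly through the slot is slower than first binding the environment to a variable and then assigning. Here is an example with a benchmark: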
library( microbenchmark )

setClass( "test_class", slots = list( a_slot = "environment" ) )
test_object = new( "test_class", a_slot = list2env( list( a = rep(NaN,10000) ) ) )
test_object2 = new( "test_class", a_slot = list2env( list( a = rep(NaN,10) ) ) )

# a1: assign directly through the slot
a1 = function( test_object, i ){
  test_object@a_slot$a[i] = 1
}
# a2: bind the environment to a local variable first
a2 = function( test_object, i ){
  w = test_object@a_slot
  w$a[i] = 1
}
# a3: receive the environment itself as the argument
a3 = function( test_obj_slot, i ){
  test_obj_slot$a[i] = 1
}

microbenchmark(
  test_object_a1 = a1( test_object, 1 ),
  test_object_a2 = a2( test_object, 1 ),
  test_object_a3 = a3( test_object@a_slot, 1 ),
  test_object2_a1 = a1( test_object2, 1 ),
  test_object2_a2 = a2( test_object2, 1 ),
  test_object2_a3 = a3( test_object2@a_slot, 1 )
)
Unit: nanoseconds
            expr   min    lq     mean median      uq   max neval cld
  test_object_a1 15873 16897 20258.31  17409 23680.5 41217   100   c
  test_object_a2  1025  1282  1844.20   1537  1793.0 11265   100   a
  test_object_a3  1024  1281  2289.62   1537  1921.5 20225   100   a
 test_object2_a1  4609  5633  7256.03   6144  6657.0 28161   100   b
 test_object2_a2  1025  1537  1787.85   1793  1793.5 11264   100   a
 test_object2_a3   769  1281  2143.69   1537  1793.0 21248   100   a
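The slowdown of direct assignment (a1()) relative to creating a new variable (a2()) or passing the environment as a function argument (a3()) does not seem to come from the overhead of [<-, because changing the size of the vector in the environment (test_object2 versus test_object) changes the size of the speed gap between a1() and the other variants: the median for a1() drops from about 17.4 µs to about 6.1 µs, while a2() and a3() stay around 1.5 to 1.8 µs in both cases.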
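I suspect that the direct assignment forces R to copy an object, even though environments should not be copied. Does anyone know what is going on here?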
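
One way I could probe the copying hypothesis is sketched below. It is only a diagnostic under stated assumptions, not an explanation: the names probe_class, probe, a1_probe, env_before, same_env and first_value are made up for illustration, and tracemem() only works on R builds with memory profiling enabled, hence the capabilities() guard. If the vector inside the environment is duplicated during the direct assignment, tracemem() should print a message; the identical() check confirms whether the slot still points at the same environment afterwards.

```r
library(methods)

# Hypothetical class and object, mirroring test_class / test_object above.
setClass("probe_class", slots = list(a_slot = "environment"))
probe <- new("probe_class", a_slot = list2env(list(a = rep(NaN, 10000))))

# Same body as a1(), but returning nothing visible.
a1_probe <- function(test_object, i) {
  test_object@a_slot$a[i] <- 1
  invisible(NULL)
}

# Keep a reference to the environment so it can be compared afterwards.
env_before <- probe@a_slot

# Mark the vector stored in the environment; tracemem() reports duplications.
# Assumption: this R build supports memory profiling.
if (capabilities("profmem")) invisible(tracemem(probe@a_slot$a))

a1_probe(probe, 1)

# Is the slot still the very same environment, and did the write reach it?
same_env <- identical(env_before, probe@a_slot)
first_value <- probe@a_slot$a[1]
```

Running the same probe with a2() or a3() in place of a1_probe() and comparing whether tracemem() reports anything would show whether the copy, if there is one, is specific to the direct form.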