I'm currently using S4 classes in a package with Rcpp and noticed a performance loss compared to an older version of my code (less formal and less general) that does the same task. I was wondering whether there is a bottleneck caused by calling the S4.slot() function multiple times, so I created this example as a benchmark.
First, define some S4 classes similar to the ones I'm using: collection is a class with an itemlist and some other meta information; item is a class with subitems (a numeric vector) and some other meta information.
setClass("collection", representation(itemlist = "list",
                                      element1 = "numeric",
                                      element2 = "character"))
setClass("item", representation(subitems = "numeric",
                                subitemnames = "character"))
a_item = new("item", subitems = c(1,2),
             subitemnames = c("A","B"))
a_collection = new("collection", itemlist = replicate(1000000, a_item),
                   element1 = 123, element2 = "something")
I'm trying to operate on a collection object using the items in its itemlist. So I created an Rcpp function that sums the subitems of each item, multiplies each sum by the corresponding element of z, and adds the results together.
cfn_s4 <-
'double operation(S4 collection, NumericVector z){
  double res = 0;
  List all_items = collection.slot("itemlist");
  for(int i = 0; i < z.size(); i++){
    S4 this_item = all_items[i];
    NumericVector to_add = this_item.slot("subitems");
    res = res + sum(to_add)*z[i];
  }
  return(res);
}'
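To sanity-check the result of the Rcpp function, the same computation can be written in plain R. This is only a sketch: operation_r, small_item and small_collection are hypothetical names introduced here, and the tiny three-item collection is an assumption chosen so the result is easy to verify by hand.

```r
library(methods)

# Same class definitions as in the question.
setClass("item", representation(subitems = "numeric",
                                subitemnames = "character"))
setClass("collection", representation(itemlist = "list",
                                      element1 = "numeric",
                                      element2 = "character"))

# Hypothetical pure-R reference: for each i in seq_along(z), sum the
# subitems of the i-th item, weight the sum by z[i], and accumulate.
operation_r <- function(collection, z) {
  res <- 0
  for (i in seq_along(z)) {
    res <- res + sum(collection@itemlist[[i]]@subitems) * z[i]
  }
  res
}

# Small collection (assumption: three identical items).
small_item <- new("item", subitems = c(1, 2), subitemnames = c("A", "B"))
small_collection <- new("collection",
                        itemlist = rep(list(small_item), 3),
                        element1 = 123, element2 = "something")

result <- operation_r(small_collection, c(1, 2, 0.5))
print(result)
```

Like the C++ version, the loop length is taken from z, not from the length of itemlist.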
I also created another version that does the same thing but takes a plain list as its argument (no S4 is used here). Some pre-processing is required and I lose some meta information, but there is no need to call S4.slot() here.
cfn_list <-
'double operation_list(List collection, NumericVector z){
  double res = 0;
  for(int i = 0; i < z.size(); i++){
    NumericVector to_add = collection[i];
    res = res + sum(to_add)*z[i];
  }
  return(res);
}'
Here are the benchmarks:
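library(Rcpp)
library(rbenchmark)
library(purrr)  # provides map(), used for the pre-processing below
fn_s4 <- cppFunction(cfn_s4)
fn_list <- cppFunction(cfn_list)
set.seed(1)
z <- rnorm(1000)
# Pre-processing: extract the subitems slot of every item into a plain list
ready_list <- map(a_collection@itemlist, "subitems")
benchmark(fn_s4(a_collection, z),
          fn_list(ready_list, z))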
                    test replications elapsed relative user.self sys.self user.child sys.child
2 fn_list(ready_list, z)          100    0.00       NA      0.00        0         NA        NA
1 fn_s4(a_collection, z)          100    0.01       NA      0.02        0         NA        NA
Even for a collection object with a list of a million elements, the timings made me think that, even with a million S4.slot() calls, there was no performance hit in the function, so the cost of S4.slot() is negligible. Is this actually the case? I thought that using S4 classes, especially when Rcpp has no idea what they actually are, should make the code slower because of the type conversions that may be required.
Can I trust this benchmark and assume there is no significant performance loss caused by calling S4.slot() even millions of times? Can you see anything wrong with my functions that would lead to an unfair comparison?
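One thing that may be worth checking: both loops run z.size() times, and z has length 1000, so each call performs 1000 slot lookups on the items rather than a million. Below is a minimal sketch of a benchmark that touches every item, assuming one weight is simply drawn per item. The names n_items and z_full, the smaller collection size, the reduced replication count and the lapply-based pre-processing are choices made for this sketch, not part of the code above.

```r
library(methods)
library(Rcpp)
library(rbenchmark)

setClass("item", representation(subitems = "numeric",
                                subitemnames = "character"))
setClass("collection", representation(itemlist = "list",
                                      element1 = "numeric",
                                      element2 = "character"))

# Same C++ bodies as in the question; the loop length is z.size().
fn_s4 <- cppFunction('double operation(S4 collection, NumericVector z){
  double res = 0;
  List all_items = collection.slot("itemlist");
  for(int i = 0; i < z.size(); i++){
    S4 this_item = all_items[i];
    NumericVector to_add = this_item.slot("subitems");
    res = res + sum(to_add)*z[i];
  }
  return(res);
}')

fn_list <- cppFunction('double operation_list(List collection, NumericVector z){
  double res = 0;
  for(int i = 0; i < z.size(); i++){
    NumericVector to_add = collection[i];
    res = res + sum(to_add)*z[i];
  }
  return(res);
}')

# Assumption: a smaller collection than in the question, to keep the run short.
n_items <- 100000
a_item <- new("item", subitems = c(1, 2), subitemnames = c("A", "B"))
a_collection <- new("collection", itemlist = rep(list(a_item), n_items),
                    element1 = 123, element2 = "something")

# One weight per item, so the loops visit every element of itemlist.
set.seed(1)
z_full <- rnorm(n_items)

# Base-R pre-processing: pull the subitems slot out of every item.
ready_list <- lapply(a_collection@itemlist, function(x) x@subitems)

res_s4 <- fn_s4(a_collection, z_full)
res_list <- fn_list(ready_list, z_full)

benchmark(fn_s4(a_collection, z_full),
          fn_list(ready_list, z_full),
          replications = 10)
```

With z_full as long as itemlist, the number of slot() calls on the items per run equals the number of items, which is the quantity the question is about.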