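From "Efficient R programming: the byte compiler" and the R documentation for the byte compiler, I learned that cmpfun can be used to compile a pure R function to bytecode to make it faster, and that enableJIT speeds code up by enabling just-in-time compilation.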
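For reference, here is a minimal sketch of how these two functions from the compiler package are typically called. The function sum_squares is a hypothetical toy example and is not part of my benchmark; the only assumption about the API is that enableJIT returns the previous JIT level, as described in the compiler package documentation.

```r
library(compiler)

# Hypothetical toy function, used only to illustrate the API.
sum_squares <- function(x) {
  total <- 0
  for (v in x) total <- total + v * v
  total
}

# cmpfun() returns a byte-compiled copy; the original closure is left as it was.
sum_squares_cmp <- cmpfun(sum_squares)

# enableJIT() returns the previous JIT level, so the old setting can be restored.
old_level <- enableJIT(0)  # switch the JIT off
enableJIT(old_level)       # restore the earlier level

# Both versions should compute the same value.
x <- c(1, 2, 3)
stopifnot(identical(sum_squares(x), sum_squares_cmp(x)))
```

Calling enableJIT with a negative argument reports the current level without changing it, which is handy for checking what a session is actually using.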
So I decided to run a benchmark, as in the first link, with the following code:
library("compiler")
library("rbenchmark")
enableJIT(3)
my_mean = function(x) {
  total = 0
  n = length(x)
  for (each in x)
    total = total + each
  total / n
}
cmp_mean = cmpfun(my_mean, list(optimize = 3))
## Generate some data
x = rnorm(100000)
benchmark(my_mean(x), cmp_mean(x), mean(x), columns = c("test", "elapsed", "relative"), order = "relative", replications = 5000)
Unfortunately, the results differ from those shown in the first link. my_mean even performs slightly better than cmp_mean:
         test elapsed relative
3     mean(x)   1.468    1.000
1  my_mean(x)  35.402   24.116
2 cmp_mean(x)  36.817   25.080
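I have no idea what is going on.
Edit:
The R version on my computer is 3.5.2.
The operating system is Debian 9.8. All software on my computer is up to date from the Debian stable repositories.
The Linux kernel version is 4.9.0-8-amd64.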
Edit 5:
I rewrote the script to test different combinations of optimize and JIT levels:
#!/usr/bin/env Rscript
library("compiler")
library("microbenchmark")
library("rlist")
my_mean = function(x) {
  total = 0
  n = length(x)
  for (each in x)
    total = total + each
  total / n
}
do_cmpfun = function(f, f_name, optimization_level) {
  cmp_f = cmpfun(f, list(optimize = optimization_level))
  list(cmp_f, f_name, optimize = optimization_level)
}
do_benchmark = function(f, f_name, optimization_level, JIT_level, x) {
  result = summary(microbenchmark(f(x), times = 1000, unit = "us", control = list(warmup = 100)))
  data.frame(fun = f_name, optimize = optimization_level, JIT = JIT_level, mean = result$mean)
}
means = list(list(mean, "mean", optimize = -1), list(my_mean, "my_mean", optimize = -1))
for (optimization_level in 0:3)
  means = list.append(means, do_cmpfun(my_mean, "my_mean", optimization_level))
# Generate some data
x = rnorm(100000)
# Benchmark in different JIT levels
result = c()
for (JIT_level in 0:3) {
  enableJIT(JIT_level)
  for (f in means) {
    result = rbind(result, do_benchmark(f[[1]], f[[2]], f[[3]], JIT_level, x))
  }
}
# Sort result
sorted_result = result[order(result$mean), ]
rownames(sorted_result) = NULL
print("Unit = us, optimize = -1 means it is not processed by cmpfun")
print(sorted_result)
Before running the R script I ran sudo cpupower frequency-set --governor performance, and I got this output:
[1] "Unit = us, optimize = -1 means it is not processed by cmpfun"
fun optimize JIT mean
1 mean -1 2 229.1841
2 mean -1 1 229.3910
3 mean -1 3 236.3680
4 mean -1 0 252.9416
5 my_mean -1 2 5242.0413
6 my_mean 3 0 5279.9710
7 my_mean 2 2 5297.5323
8 my_mean 2 1 5327.0324
9 my_mean -1 1 5333.6941
10 my_mean 3 1 5336.4559
11 my_mean 2 0 5362.6644
12 my_mean 3 3 5410.1963
13 my_mean 2 3 5414.4616
14 my_mean -1 3 5418.3823
15 my_mean 3 2 5437.3233
16 my_mean 1 2 9947.7897
17 my_mean 1 1 10101.6464
18 my_mean 1 3 10204.3253
19 my_mean 1 0 10323.0782
20 my_mean 0 0 26557.3808
21 my_mean 0 2 26728.5222
22 my_mean -1 0 26901.4200
23 my_mean 0 3 26984.5200
24 my_mean 0 1 27060.6188
However, after I used update-alternatives to switch libblas.so.3 and liblapack.so.3 to openblas 0.2.19-3, my_mean with optimize = 3 and JIT = 0 became the best performer (apart from mean):
[1] "Unit = us, optimize = -1 means it is not processed by cmpfun"
fun optimize JIT mean
1 mean -1 0 228.9361
2 mean -1 1 229.1223
3 mean -1 2 233.9757
4 mean -1 3 241.7835
5 my_mean 3 0 5246.8089
6 my_mean -1 1 5261.3951
7 my_mean -1 2 5330.6310
8 my_mean 2 3 5362.2055
9 my_mean 3 1 5400.9983
10 my_mean 2 0 5418.7674
11 my_mean 2 1 5460.8133
12 my_mean 3 3 5464.8280
13 my_mean -1 3 5520.7021
14 my_mean 2 2 5591.7352
15 my_mean 3 2 5610.6446
16 my_mean 1 3 10244.2832
17 my_mean 1 0 10274.7504
18 my_mean 1 1 10311.6423
19 my_mean 1 2 10735.6449
20 my_mean 0 2 26904.1858
21 my_mean -1 0 26961.0536
22 my_mean 0 0 27115.8191
23 my_mean 0 3 27538.7224
24 my_mean 0 1 28133.6159
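Answer 0 (score: 2)
Although I have not yet figured out why JIT compilation did not speed up your code, the same function can be made faster by compiling it with the Rcpp package.
Doing so gives the following results (where mean_cpp is the function written and compiled with Rcpp):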
         test elapsed relative
4 mean_cpp(x)    0.67    1.000
3     mean(x)    1.00    1.493
1  my_mean(x)   14.00   20.896
2 cmp_mean(x)   14.50   21.642
The code that produced these results is as follows.
library("compiler")
library("rbenchmark")
library("Rcpp")
enableJIT(3)
my_mean = function(x) {
  total = 0
  n = length(x)
  for (each in x)
    total = total + each
  total / n
}
cmp_mean = cmpfun(my_mean, list(optimize = 3))
# We can also write the same function in C++ using the Rcpp package.
# cppFunction() compiles and loads the code when it is called.
cppFunction('double mean_cpp(NumericVector x) {
  double total = 0;
  int n = x.size();
  for(int i = 0; i < n; i++) {
    total += x[i];
  }
  return total / n;
}')
# Call it once as a quick smoke test before benchmarking
mean_cpp(c(1))
## Generate some data
x = rnorm(100000)
benchmark(my_mean(x), cmp_mean(x), mean(x), mean_cpp(x),
columns = c("test", "elapsed", "relative"),
order = "relative", replications = 5000)
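One possible explanation for the near-identical timings of my_mean and cmp_mean, offered as an assumption rather than a confirmed diagnosis: with enableJIT(3) in effect (which, according to the compiler package documentation, is also the default level in recent R releases), my_mean is probably byte-compiled by the JIT during its first few calls, so cmpfun has little left to add. The timings in the question are consistent with this, since the uncompiled my_mean is only slow at JIT level 0. The sketch below is one way to check it; is_byte_compiled is a hypothetical helper that simply looks for the <bytecode: ...> tag R prints for compiled closures.

```r
library(compiler)

# Hypothetical helper: TRUE if printing the closure shows a <bytecode: ...> tag.
is_byte_compiled <- function(f) {
  any(grepl("<bytecode", capture.output(print(f)), fixed = TRUE))
}

my_mean <- function(x) {
  total <- 0
  n <- length(x)
  for (each in x)
    total <- total + each
  total / n
}

old_level <- enableJIT(3)  # same JIT level as in the benchmark

x <- rnorm(1000)
before <- is_byte_compiled(my_mean)  # checked before my_mean has ever been called
for (i in 1:5) my_mean(x)            # give the JIT a chance to compile it
after <- is_byte_compiled(my_mean)

enableJIT(old_level)  # restore the previous JIT level

cat("byte-compiled before any call: ", before, "\n")
cat("byte-compiled after a few calls:", after, "\n")
```

If the second line reports TRUE, the plain my_mean in the benchmark was already running as bytecode, and comparing it with cmp_mean mostly measures noise. Setting enableJIT(0) before defining and timing my_mean should then give an uncompiled baseline.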