Question

所以我有一个运行MCMC算法的大功能。我相信大部分昂贵的操作是大型矩阵的乘法，但这个Rprof输出相当令人困惑。

$by.self
                self.time self.pct total.time total.pct
"<Anonymous>"      328.90    81.84     329.34     81.95
"fprod"             46.16    11.49     376.02     93.57
"Dikin_Walk"         7.42     1.85     401.32     99.86
"as.vector"          5.98     1.49      57.56     14.32
".External"          2.54     0.63       2.54      0.63
"-"                  1.84     0.46       1.84      0.46
"H_x"                1.16     0.29     225.82     56.19
"fcrossprod"         1.14     0.28     226.12     56.27

编辑：以下是我在大包装函数中定义的3个函数：

## first, augment A | b 
A_b <- cbind (b, A)

## H(x) is the hessian
H_x <- function(x) {
  D <- as.vector(1/(A_b[,1] - fprod(A_b[,-1], x)))
  D_squared <- fdiag(D^2)
  return(fcrossprod(A, fprod(D_squared, A)))  
}

## D(x) is the diagonalized matrix of the log-barrier function of Ax <= b
D_x <- function(x) {

  D <- as.vector(1/(A_b[,1] - fprod(A_b[,-1], x)))
  return(fdiag(D))
} 

## checks whether a point z is in Ellip(x)
ellipsoid <- function(z, x) {

  ## as.numeric converts the expression into an atom, so we get boolean
  return( as.numeric(fcrossprod(z-x, fprod(H_x(x), (z-x)))) <= r^2)

}

fdiag，fcrossprod和fprod是其RcppArmEigen版本的所有R版本。我使用它们是因为它们要快得多。

主要算法：

> for (i in 1:n) {
>   
>    zeta <- rnorm(length(b), 0, 1)
>    zeta <- r * zeta / sqrt(as.numeric(fcrossprod(zeta,zeta)))
>  
>    rhs <- fcrossprod(A, fprod(D_x(current.point), zeta))
>    
>    ## DONE
>    
>    y <- fprod(fsolve(H_x(current.point)), rhs) 
>    y <- y + current.point
>    
>
>    while(!ellipsoid(current.point, y)) {
>    zeta <- rnorm(length(b), 0, 1)
>      
>     ## normalise to be on the m- unit sphere
>     ## and then compute lhs as a m-vector
>     zeta <- r * zeta / sqrt(sum(zeta * zeta))
>     
>     
>     rhs <- fcrossprod(A, fprod(D_x(current.point), zeta))
>     
>     ## 
>     y <- fprod(fsolve(H_x(current.point)), rhs)  
>     y <- y + current.point
>     
>
>     if(ellipsoid(current.point, y)) {
>      
>       probability <- min(1, sqrt(fdet(fprod(fsolve(H_x(current.point)),H_x(y))      )))
>        
>        
>       bool <- sample(c(TRUE, FALSE), 1, prob = c(probability, 1-?>probability))
>       if(bool) {
>         break
>       } 
>     }
>   }

这是by.total输出：

$by.total
                      total.time total.pct self.time self.pct
"Dikin_Walk"              401.32     99.86      7.42     1.85
"fprod"                   376.02     93.57     46.16    11.49
"<Anonymous>"             329.34     81.95    328.90    81.84
"cbind"                   268.58     66.83      0.04     0.01
"fcrossprod"              226.12     56.27      1.14     0.28
"H_x"                     225.82     56.19      1.16     0.29
"fsolve"                  203.82     50.72      0.14     0.03
"ellipsoid"               126.30     31.43      0.56     0.14
"fdet"                     64.84     16.13      0.02     0.00
"as.vector"                57.56     14.32      5.98     1.49
"fdiag"                    35.68      8.88      0.50     0.12

fprod定义为：

prodCpp <- 'typedef Eigen::Map<Eigen::MatrixXd> MapMatd;
const MapMatd B(as<MapMatd>(BB));
const MapMatd C(as<MapMatd>(CC));
return wrap(B * C);'

fprod <- cxxfunction(signature(BB = "matrix", CC = "matrix"),
                      prodCpp, "RcppEigen")

Answer 1

<Anonymous>指的是匿名（未命名）函数。如果您在循环中运行此类函数，则大多数时间通常都会花在此函数中。

显然A_b是一个矩阵，x是一个向量。使用矩阵代数而不是循环：

A_b <- matrix(1:16, 4)
x <- 1:3
D <- apply(A_b, 1, function(row) {1 / (row[1] - sum(row[-1] * x))})

D1 <- as.vector(1/(A_b[,1] - A_b[,-1] %*% x))

identical (D, D1)
#[1] TRUE

修改

匿名函数位于fprod：
的Rcpp魔法中
B <- matrix(rnorm(1e6),1e3) C <- matrix(rnorm(1e6),1e3) Rprof() for (i in 1:30) BC <- fprod(B, C) Rprof(NULL) summaryRprof() #$by.self # self.time self.pct total.time total.pct #"<Anonymous>" 4.24 100 4.24 100 # #$by.total # total.time total.pct self.time self.pct #"<Anonymous>" 4.24 100 4.24 100 #"fprod" 4.24 100 0.00 0 # #$sample.interval #[1] 0.02 # #$sampling.time #[1] 4.24

你的大部分时间花在矩阵乘法上。您可以从优化的BLAS中受益，例如，您可以尝试OpenBLAS。

Answer 2

首先，忽略＆＃34;自我时间＆＃34;，因为＆＃34;总时间＆＃34;包括加上callees。如果你花费任何时间不需要，你通过调用函数而不是通过运算来更有可能这样做。**

其次，不要看那个。 Rprofile生成堆栈跟踪文件。只看几个随机选择的。如果一个函数负责80％的时间，你将在大约四分之五的堆栈跟踪中看到它。更重要的是，你会看到谁在呼唤它，你会看到它正在呼唤谁，以便花时间。简单的数字不会告诉你。对堆栈跟踪进行排序也不会告诉您。

如果它给出了调用的行号，那就更好了，但事实并非如此。即便如此，只显示这些功能仍然非常有用。

** Profilers仅显示＆＃34;自我时间＆＃34;因为他们总是拥有，而且因为所有其他人都这样做，很少有人意识到这只是一种分心。如果一个函数位于堆栈跟踪的终点，那么它将在＆＃34; self time＆＃34;中。无论哪种方式都包含在＆＃34;包容时间＆＃34;。

解释Rprofile输出：这个<anonymous>函数是什么？

2 个答案: