Question

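I have a matrix with n columns and m rows, and a list of f functions. Each function takes one row of the matrix and returns a single value p.

What is the best way to generate a matrix with m rows and f columns, one column per function?

Currently I am doing this: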

# create a random 5x5 matrix
m <- matrix(rexp(25, rate=.1), ncol=5)

# example functions, in reality more complex but with the same signature
fs <- list(function(xs) { return(mean(xs)) }, function(xs) { return(min(xs)) } )

# create a function which takes a function and applies it to each row of m
g <- function(f) { return(apply(m, 1, f)) }

# use lapply to make a call for each function in fs
# use do.call and cbind to reshape the output from a list of lists to a matrix
do.call("cbind", lapply(fs, g))

Edit for clarification: the code above does work, but I would like to know whether there is a more elegant approach.

Answer 1

Using base R, you can do it in one line:

cbind(apply(m, 1, mean), apply(m, 1, min))

#          [,1]      [,2]
#[1,] 13.287748 5.2172657
#[2,]  5.855862 1.8346868
#[3,]  8.077236 0.4162899
#[4,] 10.422803 1.5899831
#[5,] 10.283001 2.0444687

On this 5x5 example it is slightly faster than the do.call approach:

microbenchmark::microbenchmark(
  do.call("cbind", lapply(fs, g)),
  cbind(apply(m, 1, mean), apply(m, 1, min))
)

which yields:

#Unit: microseconds
#                                       expr    min     lq     mean
#            do.call("cbind", lapply(fs, g)) 66.077 67.210 88.75483
# cbind(apply(m, 1, mean), apply(m, 1, min)) 57.771 58.903 67.70094
# median     uq     max neval
# 67.965 71.741 851.446   100
# 59.658 60.036 125.735   100
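
If the row mean is one of the statistics you need, a further option is to replace that apply() call with the vectorised base R function rowMeans(). The following is only a sketch under the assumption that the functions are fixed in advance, as in the one-liner above; the seed is arbitrary and no timing is claimed here.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
m <- matrix(rexp(25, rate = .1), ncol = 5)

# rowMeans() computes all row means in a single vectorised call
res_fast <- cbind(rowMeans(m), apply(m, 1, min))

# the apply()-only one-liner, for comparison
res_apply <- cbind(apply(m, 1, mean), apply(m, 1, min))

# both versions should agree up to floating point tolerance
stopifnot(isTRUE(all.equal(res_fast, res_apply)))
```

This only helps for statistics that have a dedicated row-wise function; arbitrary functions still need apply().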

Answer 2

This is how I adapted @patL's answer to take a list of functions:

# create a random 5x5 matrix
m <- matrix(rexp(25, rate=.1), ncol=5)

# example functions, in reality more complex but with the same signature
fs <- list(function(xs) { return(mean(xs)) }, function(xs) { return(min(xs)) } )

# create a function which takes a function and applies it to each row of m
g <- function(f) { return(apply(m, 1, f)) }

# use sapply to make a call for each function in fs
# sapply already simplifies the equal-length results into a matrix with one
# column per function, so for a multi-row m the outer cbind returns it unchanged
cbind(sapply(fs, g))

I am using this to score a set of models, for example:

# models is a list of trained models and m is a matrix of input data
g <- function(model) { return(predict(model, m)) }
# produce a matrix of model scores
cbind(sapply(models, g))
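
A possible variation, sketched here as an assumption rather than as part of the original answer: naming the list elements gives the result column names, and vapply() makes the expected shape explicit (one number per row, nrow(m) numbers per function). The helper name apply_rows is hypothetical.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
m <- matrix(rexp(25, rate = .1), ncol = 5)

# a named list: the names become the column names of the result
fs <- list(mean = function(xs) mean(xs), min = function(xs) min(xs))

# hypothetical helper: apply f to each row of m, expecting one number per row
apply_rows <- function(f) {
  vapply(seq_len(nrow(m)), function(i) f(m[i, ]), numeric(1))
}

# one column per function; vapply() errors if a function returns the wrong shape
res <- vapply(fs, apply_rows, numeric(nrow(m)))
print(res)
```

Compared with sapply(), vapply() fails loudly when a function returns something other than a single number, which can be useful when the functions are more complex than mean and min.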

Answer 3

With the data:

set.seed(11235813)

m <- matrix(rexp(25, rate=.1), ncol=5)

fs <- c("mean", "median", "sd", "max", "min", "sum")

You can do:

sapply(fs, mapply, split(m, row(m)), USE.NAMES = TRUE)

which returns:

          mean   median        sd      max       min      sum
[1,]  9.299471 3.531394 10.436391 26.37984 1.7293010 46.49735
[2,]  8.583419 2.904223 11.714482 28.75344 0.7925614 42.91709
[3,]  6.292835 4.578894  6.058633 16.92280 1.8387221 31.46418
[4,] 10.699276 5.688477 15.161685 36.91369 0.1049507 53.49638
[5,]  9.767307 2.748114 10.767438 24.66143 1.5677153 48.83653
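
To make the one-liner easier to follow, here is a rough unrolled equivalent. It is a sketch, not the answer's own code: the names rows, unrolled and one_liner are introduced only for illustration.

```r
set.seed(11235813)
m <- matrix(rexp(25, rate = .1), ncol = 5)
fs <- c("mean", "median", "sd", "max", "min", "sum")

# split(m, row(m)) yields a list with one numeric vector per row of m
rows <- split(m, row(m))

# for each function name, look the function up and call it on every row
unrolled <- sapply(fs, function(fname) {
  f <- match.fun(fname)        # resolve the name to a function
  vapply(rows, f, numeric(1))  # one value per row
})

# the compact form: sapply passes each name to mapply as its FUN argument
one_liner <- sapply(fs, mapply, rows)

# both forms should produce the same matrix
stopifnot(isTRUE(all.equal(unrolled, one_liner)))
```

Because the functions are given as character strings, sapply() uses them as column names without any extra work.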

Note:

Compared with the two approaches proposed above, this one is the slowest.

Apply a list of functions to a matrix and return the result as a matrix in R
