Question

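My S4 class has a method that is called many times. I noticed that it executes much more slowly than a similar function called on its own. So I added a slot of type "function" to my class and used that function instead of the method. The example below shows two ways of doing this, and both run much faster than the corresponding method. The example also suggests that the method is not slower because it has to retrieve data from the class, since the functions are faster even when they do that too.

Of course, this way of doing things is not ideal. I wonder whether there is a way to speed up method dispatch. Any suggestions?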

    setClass(Class = "SpeedTest", 
      representation = representation(
        x = "numeric",
        foo1 = "function",
        foo2 = "function"
      )
    )

    speedTest <- function(n) {
      new("SpeedTest",
        x = rnorm(n),
        foo1 = function(z) sqrt(abs(z)),
        foo2 = function() {}
      )
    }

    setGeneric(
      name = "method.foo",
      def = function(object) {standardGeneric("method.foo")}
    )
    setMethod(
      f = "method.foo", 
      signature = "SpeedTest",
      definition = function(object) {
        sqrt(abs(object@x))
      }
    )

    setGeneric(
      name = "create.foo2",
      def = function(object) {standardGeneric("create.foo2")}
    )
    setMethod(
      f = "create.foo2", 
      signature = "SpeedTest",
      definition = function(object) {
        z <- object@x
        object@foo2 <- function() sqrt(abs(z))

        object
      }
    )

    > st <- speedTest(1000)
    > st <- create.foo2(st)
    > 
    > iters <- 100000
    > 
    > system.time(for (i in seq(iters)) method.foo(st)) # slowest by far
       user  system elapsed 
       3.26    0.00    3.27 

    > # much faster 
    > system.time({foo1 <- st@foo1; x <- st@x; for (i in seq(iters)) foo1(x)}) 
       user  system elapsed 
       1.47    0.00    1.46 

    > # retrieving st@x instead of x does not affect speed
    > system.time({foo1 <- st@foo1; for (i in seq(iters)) foo1(st@x)}) 
       user  system elapsed 
       1.47    0.00    1.49 

    > # same speed as foo1 although no explicit argument
    > system.time({foo2 <- st@foo2; for (i in seq(iters)) foo2()}) 
       user  system elapsed 
       1.44    0.00    1.45 

    > # Cannot increase speed by using a lambda to "eliminate" the argument of method.foo
    > system.time({foo <- function() method.foo(st); for (i in seq(iters)) foo()})
       user  system elapsed 
       3.28    0.00    3.29 

Answer 1

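The cost is in method look-up, which starts from scratch in each iteration of your timing. You can short-circuit this by working out the method dispatch once: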

    METHOD <- selectMethod(method.foo, class(st))
    for (i in seq(iters)) METHOD(st)
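
For readers who want to try this without the question's definitions, a self-contained sketch might look like the following. The class `Pt` and the generic `rootAbs` are hypothetical stand-ins for `SpeedTest` and `method.foo`; no timings are shown because they depend on the machine and the R version.

```r
library(methods)

# Hypothetical stand-ins for the question's class and generic
setClass("Pt", representation(x = "numeric"))
setGeneric("rootAbs", function(object) standardGeneric("rootAbs"))
setMethod("rootAbs", "Pt", function(object) sqrt(abs(object@x)))

p <- new("Pt", x = c(-4, 9, -16))

# Resolve the method once, outside the loop
METHOD <- selectMethod("rootAbs", class(p))

# The pre-selected method returns the same value as the generic
identical(METHOD(p), rootAbs(p))

iters <- 10000
system.time(for (i in seq_len(iters)) rootAbs(p))  # dispatch on every call
system.time(for (i in seq_len(iters)) METHOD(p))   # dispatch resolved once
```

Because `METHOD` bypasses dispatch, this is only safe while every object passed to it has the class the method was selected for.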

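Doing this (better method look-up) would be a very interesting and worthwhile project; valuable lessons have been learned in other dynamic languages, for example the inline caching mentioned on Wikipedia's dynamic dispatch page.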
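
To give a rough feel for the inline-caching idea at user level, here is a minimal sketch. It is not how R implements dispatch internally; `make_cached_caller`, the class `Shape`, and the generic `area` are all hypothetical names invented for illustration.

```r
library(methods)

# Hypothetical class and generic, for illustration only
setClass("Shape", representation(r = "numeric"))
setGeneric("area", function(object) standardGeneric("area"))
setMethod("area", "Shape", function(object) pi * object@r^2)

# Hypothetical helper: builds a caller that remembers which method was
# selected for each class it has already seen (a per-call-site cache).
make_cached_caller <- function(generic_name) {
  cache <- new.env(parent = emptyenv())
  function(object) {
    key <- class(object)[1L]
    method <- cache[[key]]
    if (is.null(method)) {
      # Cache miss: do the full look-up once and remember the result
      method <- selectMethod(generic_name, class(object))
      assign(key, method, envir = cache)
    }
    method(object)
  }
}

area_cached <- make_cached_caller("area")
s <- new("Shape", r = 2)
all.equal(area_cached(s), area(s))
```

The wrapper adds a function call of its own, so it is meant to show the mechanism rather than to promise a speed-up. A real inline cache would also have to be invalidated whenever methods are added or redefined.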

I wonder whether you are making so many method calls because your data representation and methods are not fully vectorized?
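
As an illustration of what fuller vectorization could mean here, the sketch below compares one object per value with a single object holding all values. The class `Reading` and the generic `rootMagnitude` are hypothetical and not part of the question's code.

```r
library(methods)

# Hypothetical class: 'value' may hold one number or many
setClass("Reading", representation(value = "numeric"))
setGeneric("rootMagnitude", function(object) standardGeneric("rootMagnitude"))
setMethod("rootMagnitude", "Reading", function(object) sqrt(abs(object@value)))

vals <- c(-1, 4, -9, 16)

# Element-wise design: one object and one dispatch per value
readings <- lapply(vals, function(v) new("Reading", value = v))
per_element <- vapply(readings, rootMagnitude, numeric(1))

# Vectorized design: one object holding all values, a single dispatch
all_at_once <- rootMagnitude(new("Reading", value = vals))

all.equal(per_element, all_at_once)
```

Both designs give the same numbers, but the second pays the dispatch cost once instead of once per element.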

Answer 2

This does not help directly with your problem, but it is much easier to benchmark this sort of thing with the microbenchmark package:

    f <- function(x) NULL

    s3 <- function(x) UseMethod("s3")
    s3.integer <- function(x) NULL

    A <- setClass("A", representation(a = "list"))
    setGeneric("s4", function(x) standardGeneric("s4"))
    setMethod(s4, "A", function(x) NULL)

    B <- setRefClass("B")
    B$methods(r5 = function(x) NULL)

    a <- A()
    b <- B$new()

    library(microbenchmark)
    options(digits = 3)
    microbenchmark(
      bare = NULL,
      fun = f(),
      s3 = s3(1L),
      s4 = s4(a),
      r5 = b$r5()
    )
    # Unit: nanoseconds
    #  expr   min    lq median    uq   max neval
    #  bare    13    20     22    29    36   100
    #   fun   171   236    270   310   805   100
    #    s3  2025  2478   2651  2869  8603   100
    #    s4 10017 11029  11528 11905 36149   100
    #    r5  9080 10003  10390 10804 61864   100

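On my computer, the bare call takes about 20 ns. Wrapping it in a function adds about an extra 200 ns; this is the cost of creating the environment in which the function executes. S3 method dispatch adds around 3 μs, and S4 / ref classes around 12 μs.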
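
The arithmetic behind these figures can be reproduced from the median column. The sketch below also projects the S4 overhead onto the question's 100000 iterations, assuming (and this is only an assumption) that the per-call overhead is the same on the asker's machine.

```r
# Median timings in nanoseconds, copied from the benchmark output above
med_ns <- c(bare = 22, fun = 270, s3 = 2651, s4 = 11528, r5 = 10390)

# Extra cost of each call style relative to a plain function call
overhead_ns <- med_ns[c("s3", "s4", "r5")] - med_ns[["fun"]]
overhead_ns
# s3 = 2381, s4 = 11258, r5 = 10120

# Assumption: the same per-call overhead applies on the asker's machine.
# Projected extra time for the question's 100000 iterations, in seconds
iters <- 100000
overhead_ns[["s4"]] * iters / 1e9
# about 1.13
```

That projection is the same order of magnitude as the gap in the question's timings (3.27 s for the method against about 1.46 s for the slot function), although the two sets of numbers come from different machines.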

Is S4 method dispatch slow?
