Question

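I have to run some simulations using a physics formula. The formula has many variables, and I want to vary each of them over 100 samples. For every sample, I have to do the calculation with all combinations. A simplified for loop explains better what I want to do: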

b = max()/2.0

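As you can see, as the number of parameters to vary (100 samples each) increases, the number of simulations grows exponentially, and so does the computation time. I have to vary 10-15 parameters and compute the "value" for every combination of them. Is there any way to avoid these loops (entirely)? What are good programming practices when dealing with such large computations?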
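The loop itself is not preserved in the post as shown here. As a rough sketch only, and assuming the formula a + b / c * d - f that both answers use, a nested loop of the kind described might look like the following. The sample size is reduced to 5 per parameter, and the names n, idx and value are illustrative, not taken from the question:

```r
# Hypothetical reconstruction, NOT the asker's original code.
set.seed(1)
n <- 5  # samples per parameter (the question uses 100)
a <- rnorm(n); b <- rnorm(n); c <- rnorm(n); d <- rnorm(n); f <- rnorm(n)

value <- numeric(n^5)  # preallocate one slot per combination
idx <- 0
for (m in seq_len(n)) {            # f varies slowest
  for (l in seq_len(n)) {
    for (k in seq_len(n)) {
      for (j in seq_len(n)) {
        for (i in seq_len(n)) {    # a varies fastest
          idx <- idx + 1
          value[idx] <- a[i] + b[j] / c[k] * d[l] - f[m]
        }
      }
    }
  }
}

length(value)  # n^5 results, one per combination
```

Each additional parameter adds another loop level and multiplies the number of iterations by the sample count.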

Answer 1

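What you are doing is essentially creating a large multidimensional array of permutations with some given operations, and then flattening the result.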

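The permutation array can be created with outer: for example, outer(a, b, `+`) is the array of all pairwise combinations a[i] + b[j]. The whole array is given by (beware operator precedence!):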

array = outer(outer(a, outer(outer(b, c, `/`), d), `+`), f, `-`)

(Incidentally, outer(a, b) is the same as outer(a, b, `*`) and can also be written as a %o% b.)
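As a small illustration of what outer produces, here is a sketch on two short vectors. The names x, y, s and p are made up for this example:

```r
# Illustrative vectors; names are not from the answer.
x <- c(1, 2)
y <- c(10, 20, 30)

s <- outer(x, y, `+`)  # s[i, j] is x[i] + y[j]
dim(s)                 # 2 3: one dimension per input vector
s[2, 3]                # 2 + 30 = 32

p <- outer(x, y)       # no function given, so products are computed
identical(p, x %o% y)  # %o% is shorthand for outer with its default
```

Nesting calls to outer adds one dimension per new vector, which is how the five-dimensional array above is built.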

To flatten the array, use as.vector:

value = as.vector(array)

The result is the same as when using expand.grid. The difference is that the expand.grid version is more readable:

value = with(expand.grid(a = a, b = b, c = c, d = d, f = f), a + b / c * d - f)

...but it is noticeably slower and uses a lot more memory.
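A quick way to check both claims on a small case is sketched below. It assumes 8 samples per parameter to keep it fast; the names arr, grid, value_outer and value_grid are illustrative:

```r
# Small-scale check; sample size and names are assumptions for this sketch.
set.seed(1)
n <- 8
a <- rnorm(n); b <- rnorm(n); c <- rnorm(n); d <- rnorm(n); f <- rnorm(n)

arr  <- outer(outer(a, outer(outer(b, c, `/`), d), `+`), f, `-`)
grid <- expand.grid(a = a, b = b, c = c, d = d, f = f)

value_outer <- as.vector(arr)
value_grid  <- with(grid, a + b / c * d - f)

# Both approaches enumerate the combinations in the same order
# (first vector varies fastest), so the results line up element by element.
all.equal(value_outer, value_grid)

# The grid stores five input columns of n^5 rows each;
# the array stores only the n^5 results.
object.size(grid)
object.size(arr)
```

Wrapping each approach in system.time() at a larger n is one way to compare the running times as well.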

We can improve the readability of the array permutation by creating custom operators:

make_outer = function (f) function (a, b) outer(a, b, f)
`%o+%` = make_outer(`+`)
`%o-%` = make_outer(`-`)
`%o/%` = make_outer(`/`)

value = as.vector(a %o+% ((b %o/% c) %o% d) %o-% f)

Answer 2

I would use expand.grid to define all the combinations before calculating:

set.seed(3)
a = rnorm(10)
b = rnorm(10)
c = rnorm(10)
d = rnorm(10)
f = rnorm(10)

# find all possible combinations
sets <- expand.grid(
  a = a,
  b = b,
  c = c,
  d = d,
  f = f
)

# the calculation is quick and vectorised
value <- sets$a + sets$b / sets$c * sets$d - sets$f
head(value)
#> [1] -0.58891116  0.08049653  0.63181047 -0.77910963  0.56880508  0.40314620
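Before building a grid like this, it may help to estimate how large it will be, since the number of rows is the product of the sample counts. The helper names n_combinations and result_gib below are hypothetical, and the estimate covers only a single double-precision result vector, not the input columns:

```r
# Hypothetical helpers for a back-of-the-envelope size estimate.

# Number of combinations: product of the sample counts per parameter.
n_combinations <- function(samples_per_param) {
  prod(as.numeric(samples_per_param))
}

# Approximate size in GiB of one result vector (8 bytes per double).
result_gib <- function(samples_per_param) {
  n_combinations(samples_per_param) * 8 / 2^30
}

n_combinations(rep(10, 5))    # the example above: 1e+05 rows
n_combinations(rep(100, 10))  # 100 samples of 10 parameters: 1e+20
result_gib(rep(100, 5))       # 100 samples of 5 parameters: about 74.5 GiB
```

At 100 samples each, even five parameters exceed the memory of a typical machine, so the full grid for 10-15 parameters cannot be materialised whether it is built with loops, outer or expand.grid.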

Not necessary, but it looks nicer with the tidyverse:

library(tidyverse)
sets %>% 
  mutate(value = a + b / c * d - f) %>% 
  as_tibble() # just for nicer printing
#> # A tibble: 100,000 x 6
#>          a      b      c     d     f   value
#>      <dbl>  <dbl>  <dbl> <dbl> <dbl>   <dbl>
#>  1 -0.962  -0.745 -0.578 0.901 0.787 -0.589 
#>  2 -0.293  -0.745 -0.578 0.901 0.787  0.0805
#>  3  0.259  -0.745 -0.578 0.901 0.787  0.632 
#>  4 -1.15   -0.745 -0.578 0.901 0.787 -0.779 
#>  5  0.196  -0.745 -0.578 0.901 0.787  0.569 
#>  6  0.0301 -0.745 -0.578 0.901 0.787  0.403 
#>  7  0.0854 -0.745 -0.578 0.901 0.787  0.458 
#>  8  1.12   -0.745 -0.578 0.901 0.787  1.49  
#>  9 -1.22   -0.745 -0.578 0.901 0.787 -0.846 
#> 10  1.27   -0.745 -0.578 0.901 0.787  1.64  
#> # … with 99,990 more rows

Created on 2021-03-29 by the reprex package (v1.0.0)

How can I avoid my very large nested for loops?
