Question

I am trying to apply a piecewise linear transformation to my data. Here is an example table describing the transformation:

dat <- data.frame(x.low = 0:2, x.high = 1:3, y.low=c(0, 2, 3), y.high=c(2, 3, 10))
dat
#   x.low x.high y.low y.high
# 1     0      1     0      2
# 2     1      2     2      3
# 3     2      3     3     10

If I define x <- c(1.75, 2.5), I expect the transformed values to be 2.75 and 6.5 (the two elements match rows 2 and 3 of dat, respectively).
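For reference, the two expected values can be checked by hand with the usual linear interpolation formula, y.low + (x - x.low) / (x.high - x.low) * (y.high - y.low). This is only an arithmetic check with the numbers from rows 2 and 3 typed in directly; the names y1 and y2 are just illustrative:

```r
# Row 2 maps [1, 2] onto [2, 3]; row 3 maps [2, 3] onto [3, 10].
y1 <- 2 + (1.75 - 1) / (2 - 1) * (3 - 2)
y2 <- 3 + (2.5 - 2) / (3 - 2) * (10 - 3)
c(y1, y2)
# [1] 2.75 6.50
```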

I know how to solve this with a for loop that iterates over the rows of dat and transforms the matching values:

pw.lin.trans <- function(x, m) {
  out <- rep(NA, length(x))
  for (i in seq(nrow(m))) {
    matching <- x >= m$x.low[i] & x <= m$x.high[i]
    out[matching] <- m$y.low[i] + (x[matching] - m$x.low[i]) /
      (m$x.high[i] - m$x.low[i]) * (m$y.high[i] - m$y.low[i])
  }
  out
}
pw.lin.trans(x, dat)
# [1] 2.75 6.50

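While this works, I feel there should be a better approach that matches the x values to rows of dat and then performs all of the interpolation in a single computation. Can anyone point me to a non-for-loop solution to this problem?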

Answer 1

Try approx:

(xp <- unique(c(dat$x.low, dat$x.high)))
## [1] 0 1 2 3
(yp <- unique(c(dat$y.low, dat$y.high)))
## [1]  0  2  3 10
x <- c(1.75, 2.5)
approx(xp, yp, x)
## $x
## [1] 1.75 2.50
## 
## $y
## [1] 2.75 6.50

Or approxfun (which returns a new function):

f <- approxfun(xp, yp)
f(x)
## [1] 2.75 6.50
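One caveat: building the breakpoints with unique() relies on all y endpoints being distinct. If a table contained a flat segment (y.low equal to y.high in some row), unique() would drop the repeated value and xp and yp would no longer have the same length. A minimal sketch of an alternative, assuming the rows of dat are sorted by x.low and contiguous (each row's upper endpoint equals the next row's lower endpoint); the names xp2 and yp2 are only illustrative:

```r
dat <- data.frame(x.low = 0:2, x.high = 1:3,
                  y.low = c(0, 2, 3), y.high = c(2, 3, 10))

# Take every lower endpoint plus the final upper endpoint,
# so xp2 and yp2 always have the same length (nrow(dat) + 1).
xp2 <- c(dat$x.low, dat$x.high[nrow(dat)])
yp2 <- c(dat$y.low, dat$y.high[nrow(dat)])

approx(xp2, yp2, c(1.75, 2.5))$y
## [1] 2.75 6.50
```

For the example table this gives the same breakpoints as the unique() version.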

Some benchmarks:

set.seed(123L)
x <- runif(10000, min(xp), max(xp))  # sample within the x range covered by dat
library(microbenchmark)
microbenchmark(
  pw.lin.trans(x, dat),
  approx(xp, yp, x)$y,
  f(x)
)
## Unit: microseconds
##                  expr      min       lq    median        uq      max neval
##  pw.lin.trans(x, dat) 3364.241 3395.244 3614.0375 3641.7365 6170.268   100
##   approx(xp, yp, x)$y  359.080  379.669  424.0895  453.6800  522.756   100
##                  f(x)  202.899  209.168  217.8715  232.3555  293.499   100
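If you would rather keep the row-matching formulation from the question, the lookup itself can also be vectorised. The following is a rough sketch, not a benchmarked alternative: pw.lin.trans2 is a hypothetical helper, and it assumes the rows of the table are sorted by x.low and contiguous. It uses findInterval() to find the row for each element and then interpolates everything in one expression:

```r
dat <- data.frame(x.low = 0:2, x.high = 1:3,
                  y.low = c(0, 2, 3), y.high = c(2, 3, 10))

# Hypothetical helper: look up each x's row with findInterval(),
# then interpolate all elements in a single vectorised expression.
pw.lin.trans2 <- function(x, m) {
  breaks <- c(m$x.low, m$x.high[nrow(m)])
  # rightmost.closed = TRUE assigns x == max(breaks) to the last row
  i <- findInterval(x, breaks, rightmost.closed = TRUE)
  i[i < 1 | i > nrow(m)] <- NA  # values outside the table become NA
  m$y.low[i] + (x - m$x.low[i]) / (m$x.high[i] - m$x.low[i]) *
    (m$y.high[i] - m$y.low[i])
}

pw.lin.trans2(c(1.75, 2.5), dat)
## [1] 2.75 6.50
```

In practice approx and approxfun are likely the simpler choice; the sketch only shows that the matching step does not need a loop.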

Piecewise linear transformation without a for loop or nested ifelse
