Self teaching GPs and Gaussian regression

I stumbled upon Nando's lectures on the subject (part 1 and part 2). Since my knowledge of Python is limited, I tried to rewrite his script in R. However, I get NaNs when producing the standard deviations for the confidence interval. After examining the problem more thoroughly, I found that the main difference lies between np.linalg.solve() and R's solve(). So the question is: which R solver is appropriate for these kinds of operations?
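As the edits below work out, part of the difference is in the Cholesky factor itself: R's chol() returns the upper triangular factor, while NumPy's np.linalg.cholesky() returns the lower one. A minimal illustration, assuming a toy positive-definite matrix A:

A <- matrix(c(4, 2, 2, 3), 2, 2)
U <- chol(A)  # R: upper triangular, with t(U) %*% U == A
L_py <- t(U)  # what np.linalg.cholesky(A) would return instead
all.equal(t(U) %*% U, A)  # TRUE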
EDIT
While trying to attack the problem, I found that when solve(L, ker_x_x.test) is replaced with forwardsolve(L, ker_x_x.test), the results of the two scripts partially match. I still could not reproduce the results of the original script.
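One detail that makes this hard to debug: forwardsolve() reads only the lower triangle of its first argument, so calling it on the upper triangular factor from chol() silently solves a different system. A tiny illustration on a 2x2 example:

U <- chol(matrix(c(4, 2, 2, 3), 2, 2))  # upper triangular
forwardsolve(U, c(1, 1))     # uses only the lower triangle of U (here just its diagonal)
forwardsolve(t(U), c(1, 1))  # solves the intended lower-triangular system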
EDIT 2
I managed to match the results. In the Python version the matrix Lk is computed with np.linalg.solve() on the lower triangular Cholesky factor; since R's chol() returns the upper triangular factor, the R script should use backsolve() with transpose = TRUE. To compute the mu vector, the solve must again be against the lower factor t(L), i.e. forwardsolve() applied to t(L).
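Spelled out on a small system, the correspondence between the NumPy call and R's triangular solvers looks like this (U is the upper factor from chol(), so t(U) plays the role of Python's L):

U <- chol(matrix(c(4, 2, 2, 3), 2, 2))
b <- c(1, 2)
solve(t(U), b)                     # np.linalg.solve(L_py, b), general solver
backsolve(U, b, transpose = TRUE)  # same solve of t(U) %*% x = b, triangular routine
forwardsolve(t(U), b)              # same again, via the lower-triangular routine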
Link to the documentation: see ?backsolve in R (the same help page also documents forwardsolve()).
Here is the code with more comments:
# GPs Nando Style
# Simple GP Regression. Assumes a zero mean GP Prior.
# setwd("your directory")
# rm(list=ls())
# graphics.off()
# cat("\014")
# y <- 0.25*(x^2)
# Squared-exponential (RBF) kernel: sigma_f is the signal sd, l the length-scale
kernel <- function(sigma_f, l, x_i, x_j){
  val <- (sigma_f^2)*exp((-1*(t(x_i - x_j)%*%(x_i - x_j)))/(2*l^2))
  return(val)
}
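# quick sanity check of the kernel: sigma_f^2 at zero distance, decaying on the scale of l
kernel(1, 0.1, 0, 0)    # 1x1 matrix containing sigma_f^2 = 1
kernel(1, 0.1, 0, 0.1)  # exp(-1/2), about 0.61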
N <- 10 # Number of training points
n <- 50 # Number of test points
s <- 10^(-5)
x <- runif(N, -5, 5)
y <- sin(0.9*x) + s*rnorm(N, mean = 0, sd = 1) # f(X) is sin(0.9*x)
ker_x_x <- matrix(0, nrow = length(x), ncol = length(x))
for (i in 1:length(x)) {
  for (j in 1:length(x)) {
    ker_x_x[i, j] <- kernel(1, 0.1, x[i], x[j])
  }
}
L <- chol(ker_x_x + s*diag(N)) # NB: chol() returns the UPPER factor; np.linalg.cholesky() returns the lower one
# points we are going to make predictions on
x_testSet <- seq(-5, 5, length.out = n)
# compute kernel
ker_x_x.test <- matrix(0, nrow = length(x), ncol = length(x_testSet))
for (i in 1:length(x)) {
  for (j in 1:length(x_testSet)) {
    ker_x_x.test[i, j] <- kernel(1, 0.1, x[i], x_testSet[j])
  }
}
# compute the mean at our test points
# Lk <- solve(L, ker_x_x.test) # Issue was HERE!
Lk <- backsolve(L, ker_x_x.test, transpose = TRUE) # solves t(L) %*% Lk = ker_x_x.test, matching np.linalg.solve() on the lower factor
# mu <- t(Lk)%*%solve(L, y) # Issue was here
mu <- t(Lk)%*%forwardsolve(t(L), y) # solve against the lower factor t(L), as in the Python script
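# optional sanity check: the triangular solves should reproduce the direct
# dense solution mu = t(K*) %*% (K + s*I)^(-1) %*% y
mu_direct <- t(ker_x_x.test) %*% solve(ker_x_x + s*diag(N), y)
stopifnot(isTRUE(all.equal(c(mu), c(mu_direct), tolerance = 1e-6)))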
# compute variance at our test points
# compute kernel
ker_x.test_x.test <- matrix(0, nrow = length(x_testSet), ncol = length(x_testSet))
for (i in 1:length(x_testSet)) {
  for (j in 1:length(x_testSet)) {
    ker_x.test_x.test[i, j] <- kernel(1, 0.1, x_testSet[i], x_testSet[j])
  }
}
s2 <- diag(ker_x.test_x.test) - colSums(Lk^2) # pointwise posterior variance
s <- sqrt(s2) # posterior standard deviation (note this overwrites the noise level s)
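For completeness, a minimal sketch of how the result might be plotted with base R graphics, assuming the objects computed above (the band is mu ± 1.96 posterior standard deviations):

plot(x_testSet, mu, type = "l", ylim = range(mu - 2*s, mu + 2*s),
     xlab = "x", ylab = "f(x)", main = "GP posterior mean and 95% band")
lines(x_testSet, mu + 1.96*s, lty = 2)
lines(x_testSet, mu - 1.96*s, lty = 2)
points(x, y, pch = 19)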