Why does gpytorch seem to be less accurate than scikit-learn?

Asked: 2019-01-27 15:49:22

Tags: python machine-learning scikit-learn pytorch

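I recently found gpytorch (https://github.com/cornellius-gp/gpytorch). It seems to be a great package for integrating GPR into pytorch, and my first tests were positive. Compared with other packages such as scikit-learn, gpytorch can use GPU power and intelligent algorithms to improve performance.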

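However, I found it much harder to estimate the required hyperparameters. In scikit-learn this happens in the background and is very robust. I would like to hear from the community why that is, and to discuss whether there is a better way to estimate these parameters than the one shown in the example from the gpytorch documentation.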

For comparison, I took the example code provided on the official gpytorch page (https://github.com/cornellius-gp/gpytorch/blob/master/examples/03_Multitask_GP_Regression/Multitask_GP_Regression.ipynb) and modified it in two ways:

  1. I use a different kernel (gpytorch.kernels.MaternKernel(nu=2.5) instead of gpytorch.kernels.RBFKernel())
  2. I use a different output function

Below, I first give the code using gpytorch. Then I give the code for scikit-learn. Finally, I compare the results.

Imports (for gpytorch and scikit-learn):

import math
import torch
import numpy as np
import gpytorch

Generate data (for gpytorch and scikit-learn):

n = 20
train_x = torch.zeros(pow(n, 2), 2)
for i in range(n):
    for j in range(n):
        # Each coordinate varies from 0 to 1 in n=20 steps
        train_x[i * n + j][0] = float(i) / (n-1)
        train_x[i * n + j][1] = float(j) / (n-1)

train_y_1 = (torch.sin(train_x[:, 0]) + torch.cos(train_x[:, 1]) * (2 * math.pi) + torch.randn_like(train_x[:, 0]).mul(0.01))/4
train_y_2 = torch.sin(train_x[:, 0]) + torch.cos(train_x[:, 1]) * (2 * math.pi) + torch.randn_like(train_x[:, 0]).mul(0.01)

train_y = torch.stack([train_y_1, train_y_2], -1)

test_x = torch.rand((n, train_x.shape[1]))  # n random test points with the same input dimension as train_x
test_y_1 = (torch.sin(test_x[:, 0]) + torch.cos(test_x[:, 1]) * (2 * math.pi) + torch.randn_like(test_x[:, 0]).mul(0.01))/4
test_y_2 = torch.sin(test_x[:, 0]) + torch.cos(test_x[:, 1]) * (2 * math.pi) + torch.randn_like(test_x[:, 0]).mul(0.01)
test_y = torch.stack([test_y_1, test_y_2], -1)

Now the estimation, following the description in the example from the cited documentation:

torch.manual_seed(2) # For a more robust comparison
class MultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=2
        )
        self.covar_module = gpytorch.kernels.MultitaskKernel(
            gpytorch.kernels.MaternKernel(nu=2.5), num_tasks=2, rank=1
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)


likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=2)
model = MultitaskGPModel(train_x, train_y, likelihood)

# Find optimal model hyperparameters
model.train()
likelihood.train()

# Use the adam optimizer
optimizer = torch.optim.Adam([
    {'params': model.parameters()},  # Includes GaussianLikelihood parameters
], lr=0.1)

# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

n_iter = 50
for i in range(n_iter):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    # print('Iter %d/%d - Loss: %.3f' % (i + 1, n_iter, loss.item()))
    optimizer.step()

# Set into eval mode
model.eval()
likelihood.eval()

# Make predictions
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    predictions = likelihood(model(test_x))
    mean = predictions.mean
    lower, upper = predictions.confidence_region()

# Median signed relative error per task (converted to NumPy before calling np.median)
test_results_gpytorch = np.median(((test_y - mean) / test_y).numpy(), axis=0)

Below is the code for scikit-learn, which is somewhat more convenient ^^:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, Matern
kernel = 1.0 * Matern(length_scale=0.1, length_scale_bounds=(1e-5, 1e5), nu=2.5) \
         + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, alpha=0.0).fit(train_x.numpy(),
                                                            train_y.numpy())
# x_interpolation = test_x.detach().numpy()[np.newaxis, :].transpose()
y_mean_interpol, y_std_norm = gp.predict(test_x.numpy(), return_std=True)

test_results_scitlearn = np.median((test_y.numpy() - y_mean_interpol) / test_y.numpy(), axis=0)

Finally, I compare the results:

comparisson = (test_results_scitlearn - test_results_gpytorch)/test_results_scitlearn
print('Variable 1: scitkit learn is more accurate my factor: ' + str(abs(comparisson[0])))
print('Variable 2: scitkit learn is more accurate my factor: ' + str(comparisson[1]))

Unfortunately, I did not find an easy way to fix the seed for scikit-learn. The last time I ran the code, it returned:

Variable 1: scitkit learn is more accurate my factor: 11.362540360431087

Variable 2: scitkit learn is more accurate my factor: 29.64760087022618
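A note on reproducibility: as far as I can tell, the run-to-run variation in the snippets above comes from the data rather than from scikit-learn, because torch.manual_seed(2) is only called after train_y and test_x have been drawn. GaussianProcessRegressor also accepts a random_state argument, which controls the extra optimizer starts used when n_restarts_optimizer > 0. The following is a minimal sketch under stated assumptions: NumPy data stands in for the torch tensors, the grid is smaller to keep it fast, and make_data and fit_predict are hypothetical helpers that are not part of the original post.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, Matern


def make_data(seed, n=8):
    """Hypothetical helper: same target function as above, with seeded NumPy noise."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n)
    x = np.array([[a, b] for a in grid for b in grid])
    noise = 0.01 * rng.standard_normal(len(x))
    y = np.sin(x[:, 0]) + np.cos(x[:, 1]) * (2 * np.pi) + noise
    return x, y


def fit_predict(seed):
    """Hypothetical helper: fit on seeded data and predict at seeded test points."""
    x, y = make_data(seed)
    kernel = 1.0 * Matern(length_scale=0.1, nu=2.5) + WhiteKernel()
    # random_state only matters for the additional starts (n_restarts_optimizer > 0)
    gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=2, random_state=0)
    gp.fit(x, y)
    x_test = np.random.default_rng(seed + 1).random((5, 2))
    return gp.predict(x_test)


first = fit_predict(seed=0)
second = fit_predict(seed=0)
print(np.allclose(first, second))
```

In the original code, moving torch.manual_seed(2) above the data generation should have the same effect on the torch side.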

In the case of gpytorch, I assume the optimizer ends up in some local optimum. But I cannot think of a more robust optimization algorithm that still uses pytorch.
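One option that stays inside pytorch is torch.optim.LBFGS. scikit-learn's GaussianProcessRegressor uses L-BFGS-B by default (optimizer="fmin_l_bfgs_b"), whereas the gpytorch snippet above takes 50 Adam steps. The sketch below only shows the closure pattern that LBFGS requires. It is an illustration under assumptions, not a tested gpytorch recipe: the hand-written RBF objective neg_mll and the toy 1-D data stand in for -mll(model(train_x), train_y).

```python
import math
import torch

torch.manual_seed(0)

# Toy 1-D data, assumed only for illustration
x = torch.linspace(0, 1, 30, dtype=torch.float64).unsqueeze(-1)
y = torch.sin(2 * math.pi * x).squeeze(-1) + 0.05 * torch.randn(30, dtype=torch.float64)

# Log-parameterised hyperparameters: lengthscale, signal variance, noise variance
raw = torch.zeros(3, dtype=torch.float64, requires_grad=True)


def neg_mll(raw_params):
    """Negative log marginal likelihood of a zero-mean GP with an RBF kernel."""
    lengthscale, outputscale, noise = raw_params.exp()
    sq_dist = (x - x.T) ** 2
    k = outputscale * torch.exp(-0.5 * sq_dist / lengthscale**2)
    k = k + (noise + 1e-4) * torch.eye(len(x), dtype=x.dtype)  # small jitter for stability
    dist = torch.distributions.MultivariateNormal(
        torch.zeros(len(x), dtype=x.dtype), covariance_matrix=k
    )
    return -dist.log_prob(y)


optimizer = torch.optim.LBFGS([raw], lr=1.0, max_iter=100, line_search_fn="strong_wolfe")


def closure():
    # LBFGS re-evaluates the objective several times per step, hence the closure
    optimizer.zero_grad()
    loss = neg_mll(raw)
    loss.backward()
    return loss


initial_loss = neg_mll(raw).item()
optimizer.step(closure)
final_loss = neg_mll(raw).item()
print(initial_loss, final_loss)
```

With gpytorch, the closure body would presumably compute loss = -mll(model(train_x), train_y) instead. Whether this closes the gap to scikit-learn in this particular comparison is not something the sketch demonstrates.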

I am looking forward to suggestions!

Lazloo

1 Answer:

Answer 0 (score: 2)

(I am also answering your question on the GitHub issue you created for this here)

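This is primarily because you are using different models in sklearn and gpytorch. In particular, sklearn by default learns independent GPs in the multi-output setting (see, for example, the discussion here). In GPyTorch, you used the multitask GP method introduced in Bonilla et al., 2008. Correcting for this difference yields: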

test_results_gpytorch = [5.207913e-04 -8.469360e-05]

test_results_scitlearn = [3.65288816e-04 4.79017145e-05]