Question

I am using the Accord .NET framework for PrincipalComponentAnalysis.

I have a precomputed distance matrix, and I apply KPCA to it:

Dim pca = new KernelPrincipalComponentAnalysis()
pca.Learn(distances)
pca.NumberOfOutputs = 2
Dim actual()() As double = pca.Transform(distances)

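This works fine. But if the matrix is large (e.g. 2000x2000), pca.Learn takes several minutes. Is there a way to learn on a sampled matrix (e.g. 500x500) and then use it to transform the large matrix? I tried: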

pca.Learn(sampling)
pca.Transform(distances)

But I get an error because the matrix does not have the right size.

Best regards, Jean-Michel

Answer 1

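If the matrix you call a distance matrix is actually a kernel matrix, then you should be able to do this. Since you are using KPCA, I will assume that this may be the case, so below I show how to use KPCA's Learn and Transform methods with kernel matrices of different sizes.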

// Accord.NET namespaces used by the snippets below
using Accord.Math;
using Accord.Statistics;
using Accord.Statistics.Analysis;
using Accord.Statistics.Kernels;

// Let's say those were our original data points
double[][] data =
{
    new double[] { 2.5,  2.4 },
    new double[] { 0.5,  0.7 },
    new double[] { 2.2,  2.9 },
    new double[] { 1.9,  2.2 },
    new double[] { 3.1,  3.0 },
    new double[] { 2.3,  2.7 },
    new double[] { 2.0,  1.6 },
    new double[] { 1.0,  1.1 },
    new double[] { 1.5,  1.6 },
    new double[] { 1.1,  0.9 }
};

Now, let's say we have somehow been given their kernel matrix K. Note: K would be computed along these lines:

// Center the data, then evaluate the kernel between every pair of points
double[] mean = data.Mean(dimension: 0);
double[][] x = data.Subtract(mean, dimension: 0);
Linear kernel = new Linear();
double[][] K = kernel.ToJagged(x);

Now, assuming K is already available, we can create a KPCA using

var pca = new KernelPrincipalComponentAnalysis(kernel, PrincipalComponentMethod.KernelMatrix);

and then learn it using

pca.Learn(K); // note: we pass the kernel matrix instead of the data points

For the example above, we have

// Those are the expected eigenvalues, in descending order:
double[] eigenvalues = pca.Eigenvalues.Divide(data.Length - 1); //  { 1.28, 0.049 };

// And this will be their proportion:
double[] proportions = pca.ComponentProportions; // { 0.96, 0.03 };

// We can transform the inputs using
double[][] actual = pca.Transform(K);

// The output should be similar to
double[,] expected = new double[,]
{
    {  0.827970186, -0.175115307 },
    { -1.77758033,   0.142857227 },
    {  0.992197494,  0.384374989 },
    {  0.274210416,  0.130417207 },
    {  1.67580142,  -0.209498461 },
    {  0.912949103,  0.175282444 },
    { -0.099109437, -0.349824698 },
    { -1.14457216,   0.046417258 },
    { -0.438046137,  0.017764629 },
    { -1.22382056,  -0.162675287 },
}.Multiply(-1);

Finally, to answer the question of how to transform a kernel matrix of a different size with this KPCA: pass Transform a kernel matrix whose rows correspond to the new points and whose columns correspond to the points used in Learn. For a model learned on a 500x500 sample, that means a 2000x500 matrix of kernel values between all 2000 points and the 500 sampled ones, not the full 2000x2000 matrix.
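
To illustrate why the matrix sizes matter when learning on a sample and transforming a larger set, here is a minimal NumPy sketch of the underlying math. It is not Accord's implementation, and the helper names (linear_kernel, kpca_learn, kpca_transform) are hypothetical. It learns on a 5-point sample of the data above and then projects all 10 points through a 10x5 kernel matrix (all points against the sampled points).

```python
import numpy as np

def linear_kernel(a, b):
    """Kernel matrix with K[i, j] = <a[i], b[j]>."""
    return a @ b.T

def kpca_learn(K, n_components):
    """Fit KPCA on a square training kernel matrix K (n x n)."""
    col_mean = K.mean(axis=0)
    total_mean = K.mean()
    # Double-center K, which centers the data in feature space.
    Kc = K - col_mean[None, :] - K.mean(axis=1)[:, None] + total_mean
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale eigenvectors so the matching feature-space directions have unit norm.
    alphas = eigvecs / np.sqrt(eigvals)
    return {"alphas": alphas, "col_mean": col_mean, "total_mean": total_mean}

def kpca_transform(model, K_new):
    """Project m points given K_new (m x n): new points against the n training points."""
    row_mean = K_new.mean(axis=1)[:, None]
    Kc = K_new - model["col_mean"][None, :] - row_mean + model["total_mean"]
    return Kc @ model["alphas"]

data = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
])

sample = data[:5]                          # learn on a subset only
K_sample = linear_kernel(sample, sample)   # 5 x 5, passed to the learning step
model = kpca_learn(K_sample, n_components=2)

K_all = linear_kernel(data, sample)        # 10 x 5: all points vs. sampled points
projected = kpca_transform(model, K_all)   # 10 x 2
```

The learning step only ever sees the small square matrix, while the transform step takes a rectangular matrix with one column per sampled point. Passing the full square matrix over all points to the transform step is what triggers a size mismatch.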
