Question

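I have 10 images (18x18). I store them in an array named images[324][10], where 324 is the number of pixels per image and 10 is the total number of images.

I want to use these images as input to a neural network, but 324 is a large number of inputs, so I want to reduce it while retaining as much information as possible.

I have heard that this can be done with the princomp function, which implements PCA.

The problem is that I have not found any example of how to use this function, especially for my case.

If I run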

[COEFF, SCORE, latent] = princomp(images);

it runs fine, but how do I then obtain the array newimages[number_of_desired_features][10]?

Answer 1

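PCA could be a right choice here (but not the only one). You should be aware, though, that PCA does not automatically reduce the number of features of your input data. I recommend reading this tutorial: http://arxiv.org/pdf/1404.1100v1.pdf - it is the one I used to understand PCA, and it is really good for beginners.

Getting back to your question. An image is a vector in a 324-dimensional space. In this space, the first basis vector is the one with a white pixel in the top-left corner and all other pixels black, the next basis vector has the next pixel white and all others black, and so on. This is probably not the best set of basis vectors for representing this image data. PCA computes new basis vectors (the COEFF matrix: the new vectors expressed as values in the old vector space) and new image vector values (the SCORE matrix). At this point you have not lost any data at all (the number of features has not been reduced). However, you can stop using some of the new basis vectors, because they are probably connected with noise rather than with the data itself. All of this is described in detail in the tutorial.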

images = rand(10,324);  % 10 images (rows) x 324 pixels (columns)
[COEFF, SCORE] = princomp(images);
% COEFF is orthogonal, so SCORE / COEFF is the same as SCORE * COEFF'
reconstructed_images = SCORE / COEFF + repmat(mean(images,1), 10, 1);
images - reconstructed_images
% as you can see, there are almost only zeros; the non-zero values are the effect of small numerical errors
% this is possible because you are only switching between the sets of basis vectors used to represent the data
SCORE(:,100:324) = 0;
% we remove features 100 to 324, leaving only the first 99
% (note: with only 10 images at most 9 components carry any variance,
% so in this example these columns are already zero)
% obviously, you could take only the non-zero part of the matrix and use it
% somewhere else, e.g. for your neural network
reconstructed_images_with_reduced_features = SCORE / COEFF + repmat(mean(images,1), 10, 1);
images - reconstructed_images_with_reduced_features
% there are fewer features, but the reconstruction is still pretty good
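
To get the newimages[number_of_desired_features][10] array asked about in the question, one possible approach is sketched below. It assumes the data is stored as 324 x 10 (as in the question), so it is transposed first, because princomp expects one observation per row. The random data, the 95% variance threshold, and the names explained, k, mu and newfeature are illustrative assumptions, not part of the original code. In newer MATLAB releases pca is the recommended replacement for princomp.

```matlab
% Hypothetical stand-in for the asker's data: 324 pixels x 10 images.
images = rand(324, 10);

% princomp expects observations in rows, so pass the 10 x 324 transpose.
% 'econ' returns only the components whose variance is not necessarily
% zero (at most n-1 = 9 of them when there are 10 images).
[COEFF, SCORE, latent] = princomp(images', 'econ');

% Cumulative fraction of the total variance explained by the components.
explained = cumsum(latent) / sum(latent);

% Smallest number of components reaching an arbitrary 95% threshold.
k = find(explained >= 0.95, 1);

% Reduced representation, one column per image: k x 10.
newimages = SCORE(:, 1:k)';

% The same projection applied by hand to a single image (here the first
% one): subtract the mean image, then project onto the first k basis vectors.
mu = mean(images, 2);
newfeature = COEFF(:, 1:k)' * (images(:, 1) - mu);
```

Here newfeature should match newimages(:,1) up to rounding, which is also how an image that was not part of the original set could be mapped into the same reduced space before being fed to the network.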
