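For example, I have 9 variables and 362 cases. I ran PCA and found that the first 3 principal components are enough for my purposes.
Now I have a new point in the original 9-dimensional space, and I want to project it onto the principal-component coordinate system. How do I get its new coordinates?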
%# here is data (362x9)
load SomeData
[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);
%# orthonormal coefficient matrix
W = diag(std(data))\W;
% Getting mean and weights of data (for future data)
[data, mu, sigma] = zscore(data);
sigma(sigma==0) = 1;
%# New point in original 9dim system
%# For example, it is the first point of our input data
x = data(1,:);
x = bsxfun(@minus,x, mu);
x = bsxfun(@rdivide, x, sigma);
%# New coordinates as principal components
y0 = Y(1,:); %# point we should get in result
y = (W*x')'; %# our result
%# error
sum(abs(y0 - y)) %# 142 => they are not the same point
%# plot
figure()
plot(y0,'g'); hold on;
plot(y,'r');
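How do I get the coordinates of a new point projected onto the new principal-component basis?
Answer 0 (score: 7)
The main mistake was in the operation that converts the point to the new basis: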
y = (W*x')';
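Wikipedia says:
The projected vectors are the columns of the matrix
Y = W*·Z,
where Y is L×N, W is M×L, and Z is M×N (one observation per column),
but pca() works with one observation per row: the data is N×M, W is M×L (one component per column), and Y is N×L.
So the correct equation in MATLAB is: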
y = x*W
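To see why the order of the product flips, the Wikipedia formula can be transposed. This is a short derivation, assuming real-valued data (so that W* is just the transpose of W) and writing X = Zᵀ for the row-wise N×M data matrix; the symbols X and Y_wiki are introduced here only for illustration:

```latex
Y_{\text{wiki}} = W^{\top} Z
\quad\Longrightarrow\quad
Y_{\text{wiki}}^{\top} = \left(W^{\top} Z\right)^{\top} = Z^{\top} W = X\,W ,
\qquad
\underbrace{X}_{N \times M}\;\underbrace{W}_{M \times L} \;\to\; N \times L
```

The N×L result has the same layout as the score matrix Y returned by pca(), which is why a single row vector x is multiplied from the left.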
Here is the corrected code:
[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);
W = diag(std(data))\W;
%# Getting mean and weights of data (for future data)
[~, mu, we] = zscore(data);
we(we==0) = 1;
%# New point in original 9dim system
%# For example, it is the first point of our input data
x = data(1,:);
x = bsxfun(@minus,x, mu);
x = bsxfun(@rdivide, x, we);
%# New coordinates as principal components
y = x*W;
y0 = Y(1,:);
sum(abs(y0 - y)) %# 4.1883e-14 ~= 0
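
Since the question only needs the first 3 components, the same projection can be restricted to the leading columns of W. The following is a minimal sketch, not a definitive implementation: it assumes synthetic random data in place of SomeData, and the names `project`, `k`, `xNew`, and `maxErr` are hypothetical, introduced here for illustration.

```matlab
%# Synthetic stand-in for the 362x9 data set (assumption: SomeData is not available)
rng(0);
data = randn(362, 9) * diag(1:9) + 5;

%# Weighted PCA as in the answer, then convert to orthonormal coefficients
[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);
W = diag(std(data)) \ W;

%# Standardization parameters learned from the training data
[~, mu, sigma] = zscore(data);
sigma(sigma == 0) = 1;

%# Hypothetical helper: standardize the rows of Xnew, then project onto the first k components
k = 3;
project = @(Xnew) bsxfun(@rdivide, bsxfun(@minus, Xnew, mu), sigma) * W(:, 1:k);

%# Sanity check on the training points: should agree with the first k score columns
maxErr = max(max(abs(project(data) - Y(:, 1:k))));

%# A made-up new 9-dimensional point and its 1x3 principal-component coordinates
xNew = mean(data) + 0.5 * std(data);
yNew = project(xNew);
```

Because mu, sigma, and W are captured when `project` is created, any later point is standardized with the training statistics rather than its own, which is what keeps it in the same coordinate system as Y.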