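I have a problem: I am trying to compute pairwise distances between vectors. Let me first explain the problem. I have two sets of vectors, X and Y. X has three vectors x1, x2 and x3. Y has three vectors y1, y2 and y3. The vectors in X and the vectors in Y may have different lengths (n1 and n2 in the code below, where m is the number of vectors in each set). Let the dataset be represented as in this image:
[image: the dataset]
I am trying to compute a similarity matrix such as:
[image: the similarity matrix]
Now to explain the color-coded parts: the cells marked 0 need not be computed. I have purposely set them to 100 (it can be any value). The grey cells have to be computed. The similarity score is computed as the L2 norm of (xi-xj) + the L2 norm of (yi-yj).
This means that each entry is defined as
M((x_i,y_j), (x_k,y_l)) := norm(x_i-x_k,2) + norm(y_j-y_l,2)
I wrote basic code to do this: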
clc;clear all;close all;
%% randomly generate data
m=3; n1=4; n2=6;
train_a_mean = rand(m,n1);
train_b_mean = rand(m,n2);
p = size(train_a_mean,1)*size(train_b_mean,1);
score_mean_ab = zeros(p,p);
%% This is to store the index variables
%% This is required for futu
idx1 = score_mean_ab;
idx2 = idx1; idx3 = idx1; idx4 = idx1;
a=1; b=1;
for i=1:size(score_mean_ab,1)
    c = 1; d = 1;
    for j=1:size(score_mean_ab,2)
        if (a==c)
            score_mean_ab(i,j) = 100;
        else
            %% computing distances between the different modalities and
            %% summing them up
            score_mean_ab(i,j) = norm(train_a_mean(a,:)-train_a_mean(c,:),2) ...
                + norm(train_b_mean(b,:)-train_b_mean(d,:),2);
        end
        %% saving the indices
        idx1(i,j)=a; idx2(i,j)=b; idx3(i,j)=c; idx4(i,j)=d;
        %% updating the values of c and d
        if mod(d,size(train_a_mean,1))==0
            c = c + 1;
            d = 1;
        else
            d = d+1;
        end
    end
    %% updating the values of a and b
    if mod(b,size(train_a_mean,1))==0
        a = a + 1;
        b = 1;
    else
        b = b+1;
    end
end
For a dry sample run of the matrix, I got these results:
score_mean_ab =
100.0000 100.0000 100.0000 0.6700 1.6548 1.5725 0.8154 1.8002 1.7179
100.0000 100.0000 100.0000 1.6548 0.6700 1.5000 1.8002 0.8154 1.6454
100.0000 100.0000 100.0000 1.5725 1.5000 0.6700 1.7179 1.6454 0.8154
0.6700 1.6548 1.5725 100.0000 100.0000 100.0000 1.3174 2.3022 2.2200
1.6548 0.6700 1.5000 100.0000 100.0000 100.0000 2.3022 1.3174 2.1475
1.5725 1.5000 0.6700 100.0000 100.0000 100.0000 2.2200 2.1475 1.3174
0.8154 1.8002 1.7179 1.3174 2.3022 2.2200 100.0000 100.0000 100.0000
1.8002 0.8154 1.6454 2.3022 1.3174 2.1475 100.0000 100.0000 100.0000
1.7179 1.6454 0.8154 2.2200 2.1475 1.3174 100.0000 100.0000 100.0000
However, my code is very slow. I took a few sample runs and got these timings:
m=3; n1=3; n2=3;
Elapsed time is 0.000363 seconds.
m=10; n1=3; n2=3;
Elapsed time is 0.042015 seconds.
m=10; n1=1800; n2=1800;
Elapsed time is 0.230046 seconds.
m=20; n1=1800; n2=1800;
Elapsed time is 4.309134 seconds.
m=30; n1=1800; n2=1800;
Elapsed time is 23.058106 seconds.
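In my actual problem, m~100, n1~2000 and n2~2000. My own code breaks down at this point. Is there an optimized way to do this? Note: the vectors are stored as row vectors, and the values of n1 and n2 may not be equal.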
Answer 0 (score: 3)
Here is one approach. It computes all entries.
m = 3; %// number of (row) vectors in X and in Y
n1 = 3; %// length of vectors in X
n2 = 3; %// length of vectors in Y
X = rand(m, n1); %// random data: X
Y = rand(m, n2); %// random data: Y
[ii, jj] = ndgrid(1:m);
U = reshape(sqrt(sum((X(ii,:)-X(jj,:)).^2, 2)), m, m);
V = reshape(sqrt(sum((Y(ii,:)-Y(jj,:)).^2, 2)), m, m);
result = U(ceil(1/m:1/m:m), ceil(1/m:1/m:m)) + repmat(V, m, m);
Or you can use bsxfun instead of ndgrid:
U = sqrt(sum(bsxfun(@minus, permute(X, [1 3 2]), permute(X, [3 1 2])).^2, 3));
V = sqrt(sum(bsxfun(@minus, permute(Y, [1 3 2]), permute(Y, [3 1 2])).^2, 3));
result = U(ceil(1/m:1/m:m), ceil(1/m:1/m:m)) + repmat(V, m, m);
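As a sanity check on the block layout, the following sketch compares the vectorized result against a brute-force evaluation of the definition from the question. It assumes rows are ordered as (a-1)*m+b and columns as (c-1)*m+d, as in the question's loop, and it swaps in kron(U, ones(m)) for U(ceil(1/m:1/m:m), ceil(1/m:1/m:m)); the two produce the same block expansion, and kron avoids floating-point index arithmetic. The names ref and max_err are illustrative only.

```matlab
m = 3; n1 = 4; n2 = 6;          % small sizes so the brute-force loop is cheap
X = rand(m, n1);                % m row vectors of length n1
Y = rand(m, n2);                % m row vectors of length n2

% Vectorized computation (same idea as the answer above)
[ii, jj] = ndgrid(1:m);
U = reshape(sqrt(sum((X(ii,:)-X(jj,:)).^2, 2)), m, m);  % distances within X
V = reshape(sqrt(sum((Y(ii,:)-Y(jj,:)).^2, 2)), m, m);  % distances within Y
result = kron(U, ones(m)) + repmat(V, m, m);

% Brute-force reference straight from the definition
ref = zeros(m^2);
for a = 1:m
    for b = 1:m
        for c = 1:m
            for d = 1:m
                ref((a-1)*m+b, (c-1)*m+d) = ...
                    norm(X(a,:)-X(c,:), 2) + norm(Y(b,:)-Y(d,:), 2);
            end
        end
    end
end

max_err = max(abs(result(:) - ref(:)));  % should be at rounding level
```

The diagonal blocks are included in this comparison because the answer computes every entry.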
Answer 1 (score: 2)
You can achieve this as follows:
m = 3; % Number of vectors in X/Y (must have same number of vectors)
XD = squareform(pdist(X)); %// == pdist2(X,X) but faster
YD = squareform(pdist(Y)); %// == pdist2(Y,Y) but faster
M = kron(XD,ones(m,m)) + repmat(YD,m,m);
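Note that for pdist to work, X and Y must store the vectors as rows. Also, this does not special-case the diagonal blocks; they are computed like every other entry.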
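If the diagonal blocks should carry the placeholder value from the question, one option is a block-diagonal logical mask. The sketch below is only an illustration: block_mask is a made-up name, and the distance matrices are built with bsxfun so the snippet runs without the Statistics Toolbox (squareform(pdist(X)) should give the same XD up to rounding).

```matlab
m = 3;                            % number of vectors in each set
X = rand(m, 4);  Y = rand(m, 6);  % example data, one vector per row

% Pairwise Euclidean distances within X and within Y (m-by-m each)
XD = sqrt(sum(bsxfun(@minus, permute(X, [1 3 2]), permute(X, [3 1 2])).^2, 3));
YD = sqrt(sum(bsxfun(@minus, permute(Y, [1 3 2]), permute(Y, [3 1 2])).^2, 3));

% Full m^2-by-m^2 score matrix, as in the answer
M = kron(XD, ones(m)) + repmat(YD, m, m);

% True on the m-by-m blocks along the diagonal, false elsewhere
block_mask = logical(kron(eye(m), ones(m)));
M(block_mask) = 100;              % placeholder value used in the question
```

Using Inf instead of 100 would keep those entries out of any later min search.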
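Answer 2 (score: 2)
Assuming A is train_a_mean and B is train_b_mean for easier access in the code, here are two approaches to reach the final destination, which is the row-wise minimum index of the output score_mean_ab.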
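Approach #1
This approach uses bsxfun to get the norms and their summations, and linear indexing to set the "diagonal block" elements to Inf, as the question requires. Here is the implementation -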
%// Parameter
M = m^2;
%// Get pairwise norms
nm1 = sqrt(sum(bsxfun(@minus,A,permute(A,[3 2 1])).^2,2));
nm2 = sqrt(sum(bsxfun(@minus,B,permute(B,[3 2 1])).^2,2));
%// Get sum of norms and the final values
norm_sum = bsxfun(@plus,nm1,permute(nm2,[2 1 4 3]));
%// Get "diagonal block" elements and set them to all Infs
ind1 = bsxfun(@plus,[1:m:M]',[0:m-1]*(M+1)); %//'
ind2 = bsxfun(@plus,ind1(:),[0:m-1]*m^3);
norm_sum(ind2) = Inf;
[~,min_idx] = min(reshape(norm_sum,m,M,[]),[],2);
min_idx = reshape(reshape(min_idx,m,[])',[],1);
Approach #2
This approach (ab)uses matrix-multiplication-based distance matrix calculation for a possibly faster solution. The code is listed next -
%// Parameters
nA = size(A,2);
nB = size(B,2);
M = m^2;
%// Get the pairwise norms for both A and B
A_t = A'; %//'
norm_a = sqrt([-2*A A.^2 ones(m,nA)]*[A_t ; ones(nA,m) ; A_t.^2]);
norm_a(1:m+1:end) = 0;
B_t = B'; %//'
norm_b = sqrt([-2*B B.^2 ones(m,nB)]*[B_t ; ones(nB,m) ; B_t.^2]);
norm_b(1:m+1:end) = 0;
%// Norm sums
norm_sum = reshape(bsxfun(@plus,norm_a(:).',norm_b(:)),m,m,[]); %//'
%// Set the "diagonal blocks" as all Infs
norm_sum(:,:,1:m+1:M) = Inf;
%// Re-arrange into the desired 2D output and get the minimum indices
out = reshape(permute(reshape(permute(norm_sum,[1 3 2]),M,m,[]),[1 3 2]),M,M);
[~,min_idx] = min(out,[],2);
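For sizes like m~100, the full m^2-by-m^2 matrix holds 10^8 doubles, roughly 800 MB. If only the row-wise minimum index is needed, one possible shortcut relies on the score being a sum of two independent terms. This is a sketch, not part of the original answer. For row (a,b), the minimum over columns (c,d) with c ~= a splits into the minimum over c of the A-distance plus the minimum over d of the B-distance, and the latter is 0 at d = b. The sketch assumes m >= 2 and that no two vectors in B coincide (otherwise tie-breaking may differ); dA, c_best and min_idx_fast are illustrative names.

```matlab
m = 4;
A = rand(m, 5);   % stands in for train_a_mean
B = rand(m, 7);   % stands in for train_b_mean (only its row count matters here)

% m-by-m pairwise Euclidean distances within A
dA = sqrt(sum(bsxfun(@minus, permute(A, [1 3 2]), permute(A, [3 1 2])).^2, 3));
dA(1:m+1:end) = Inf;            % exclude c == a, i.e. the "diagonal blocks"
[~, c_best] = min(dA, [], 2);   % nearest other vector in A for each a

% Row (a,b) has index (a-1)*m + b; its minimum sits at column (c_best(a)-1)*m + b
[b_idx, a_idx] = ndgrid(1:m, 1:m);   % b varies fastest, matching the row order
min_idx_fast = (c_best(a_idx(:)) - 1) * m + b_idx(:);
```

This needs only an m-by-m distance matrix, so memory stays small even when the full score matrix would not fit.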