Question

The following is Octave code (part of k-means):

centroidSum = zeros(K, 1);  % K-by-1 counts; zeros(K) would allocate a K-by-K matrix
valueSum = zeros(K, n);
for i = 1 : m  
  for j = 1 : K    
    if(idx(i) == j)
      centroidSum(j) = centroidSum(j) + 1;
      valueSum(j, :) = valueSum(j, :) + X(i, :);
    end
  end
end

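The code works, but is it possible to vectorize it? Without the if statement the code would be easy to vectorize, but how can we vectorize it with the if statement?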

Answer 1

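I assume the purpose of the code is to compute the centroids of subsets of a set of m data points in an n-dimensional space, where the points are stored in the matrix X (points x coordinates) and the vector idx specifies, for each data point, the subset (1 ... K) that the point belongs to. A partial vectorization is then:

centroid = zeros(K, n);
for j = 1 : K
  centroid(j, :) = mean(X(idx == j, :), 1);  % average over rows, also when only one row matches
end

The if is eliminated by indexing, specifically logical indexing: idx == j gives a boolean array indicating which data points belong to subset j.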
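As a small illustration of the logical indexing used here, the sketch below picks out the members of one cluster and averages them. The data values are made up for the example and are not from the question.

```octave
% Toy data (hypothetical): 5 points in 2-D, assigned to 2 clusters.
X   = [1 2; 3 4; 5 6; 7 8; 9 10];
idx = [1; 2; 1; 2; 2];

mask    = (idx == 1);        % logical column vector: [1; 0; 1; 0; 0]
members = X(mask, :);        % keeps rows 1 and 3 of X
c1      = mean(members, 1);  % mean along dimension 1 gives [3 4]
```

Passing the dimension to mean explicitly keeps the result a 1-by-n row even when a cluster has a single member.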

I think it might also be possible to get rid of the remaining for loop, but that would lead to very convoluted, hard-to-understand code.

Answer 2

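Introduction and solution code

This could be one fully vectorized approach, based on:

accumarray: for accumulating the summations needed to calculate valueSum. This also introduces a technique for using accumarray on a 2D matrix along a certain direction, which is not possible with it in a straightforward manner.
bsxfun: for calculating linear indices across all columns that match the row indices from idx.

Here's the implementation: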

%// Store no. of columns in X for frequent usage later on
ncols = size(X,2); 

%// Find indices in idx that are within [1:k] range, call them as labels
%// Also, find their locations in that range array, call those as pos
[pos,id] = ismember(idx,1:K);
labels = id(pos);
%// OR with bsxfun: [pos,labels] = find(bsxfun(@eq,idx(:),1:K));

%// Find all labels, i.e. across all columns of X
all_labels = bsxfun(@plus,labels(:),[0:ncols-1]*K);

%// Get truncated X corresponding to all indices matches across all columns
X_cut = X(pos,:);

%// Accumulate summations within each column based on the labels.
%// Note that accumarray doesn't accept matrices, so we were required
%// to create all_labels that had same labels within each column and
%// offsetted at constant intervals from consecutive columns
acc1 = accumarray(all_labels(:),X_cut(:));

%// Regularise accumulated array and reshape back to a 2D array version
acc1_reg2D = [acc1 ; zeros(K*ncols - numel(acc1),1)]; 
valueSum = reshape(acc1_reg2D,[],ncols);
centroidSum = histc(labels,1:K); %// Get labels counts as centroid sums
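The column-offset trick is easier to follow on a tiny input. The sketch below uses made-up data and assumes every entry of idx already lies in 1..K, so the ismember filtering step is skipped. It also passes the output size to accumarray, which is a variation on the code above that zero-fills empty clusters instead of padding by hand.

```octave
% Toy data (hypothetical): 4 points in 2-D, K = 3 clusters, cluster 3 is empty.
X     = [1 10; 2 20; 3 30; 4 40];
idx   = [1; 2; 1; 2];
K     = 3;
ncols = size(X, 2);

% Shift the labels by K for each successive column, so every
% (cluster, column) pair maps to its own linear index in a K-by-ncols array.
all_labels = bsxfun(@plus, idx(:), (0:ncols-1) * K);   % [1 4; 2 5; 1 4; 2 5]

% Giving the output size makes accumarray return zeros for unused indices.
acc      = accumarray(all_labels(:), X(:), [K*ncols, 1]);
valueSum = reshape(acc, K, ncols);                     % [4 40; 6 60; 0 0]
counts   = accumarray(idx(:), 1, [K, 1]);              % [2; 2; 0]
```

Each column of X is summed per cluster in a single accumarray call, because the offsets keep the columns from mixing.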

Benchmarking code

%// Datasize parameters
K = 5000;
n = 5000;
m = 5000;

idx = randi(9,1,m);
X = rand(m,n);

disp('----------------------------- With Original Approach')
tic
centroidSum1 = zeros(K,1);
valueSum1 = zeros(K, n);
for i = 1 : m  
  for j = 1 : K    
    if(idx(i) == j)
      centroidSum1(j) = centroidSum1(j) + 1;
      valueSum1(j, :) = valueSum1(j, :) + X(i, :);
    end
  end
end
toc, clear valueSum1 centroidSum1

disp('----------------------------- With Proposed Approach')
tic
%// ... Code from the earlier section
toc

Runtime results

----------------------------- With Original Approach
Elapsed time is 1.235412 seconds.
----------------------------- With Proposed Approach
Elapsed time is 0.379133 seconds.

Answer 3

Not sure about its runtime performance, but here's a non-convoluted vectorized implementation:

b = idx == 1:K;
centroids = (b' * X) ./ sum(b)';
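A small worked case may help show what the membership matrix b looks like. This is only a sketch with made-up data: it assumes implicit expansion is available (Octave, or MATLAB R2016b and later), and it uses idx(:) so the comparison yields an m-by-K matrix even when idx is stored as a row vector.

```octave
% Toy data (hypothetical): 4 points in 2-D, K = 2 clusters.
X   = [1 2; 3 4; 5 6; 7 8];
idx = [1 2 1 2];                 % stored as a row vector on purpose
K   = 2;

% b(i, j) is 1 when point i belongs to cluster j (m-by-K).
b = double(idx(:) == 1:K);       % [1 0; 0 1; 1 0; 0 1]

% b' * X sums the rows of X per cluster; sum(b, 1)' holds the cluster sizes.
centroids = (b' * X) ./ sum(b, 1)';   % [3 4; 5 6]
```

The matrix product performs all per-cluster sums at once, at the cost of building an m-by-K matrix in memory.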

Answer 4

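Vectorizing the computation can make a huge difference in performance. Benchmarking

the original code,
A. Donda's partial vectorization, and
Tom's full vectorization,

gave me the following results: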

Original Code: Elapsed time is 1.327877 seconds.

Partial Vectorization: Elapsed time is 0.630767 seconds.

Full Vectorization: Elapsed time is 0.021129 seconds.

The benchmark code is here:

%// Datasize parameters
K = 5000;
n = 5000;
m = 5000;

idx = randi(9,1,m);
X = rand(m,n);

fprintf('\nOriginal Code: ')
tic
centroidSum1 = zeros(K,1);
valueSum1 = zeros(K, n);
for i = 1 : m  
  for j = 1 : K    
    if(idx(i) == j)
      centroidSum1(j) = centroidSum1(j) + 1;
      valueSum1(j, :) = valueSum1(j, :) + X(i, :);
    end
  end
end
centroids = valueSum1 ./ centroidSum1;
toc, clear valueSum1 centroidSum1 centroids

fprintf('\nPartial Vectorization: ')
tic
centroids = zeros(K,n);
for k = 1:K
    centroids(k,:) = mean( X(idx == k, :) );
end
toc, clear centroids

fprintf('\nFull Vectorization: ')
tic
centroids = zeros(K,n);
b = idx(:) == 1:K;  %// idx(:) forces a column so the comparison expands to an m-by-K matrix
centroids = (b' * X) ./ sum(b, 1)';
toc

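Note that I added an extra line to the original code that divides valueSum1 element-wise by centroidSum1, so that every version of the code produces the same output.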
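One way to confirm that two versions really agree is to compare them on a small input. The sketch below uses made-up data and bsxfun so that it does not depend on implicit expansion. Its loop indexes by the label directly instead of scanning j, which is a simplification of the original loop rather than the exact benchmarked code.

```octave
% Toy problem (hypothetical): 6 points in 2-D, K = 3 clusters.
X   = [1 2; 3 4; 5 6; 7 8; 9 10; 11 12];
idx = [1; 2; 3; 1; 2; 3];
K   = 3;
[m, n] = size(X);

% Loop version: accumulate counts and sums, then divide.
centroidSum = zeros(K, 1);
valueSum    = zeros(K, n);
for i = 1 : m
  j = idx(i);                                   % cluster label of point i
  centroidSum(j) = centroidSum(j) + 1;
  valueSum(j, :) = valueSum(j, :) + X(i, :);
end
loopCentroids = bsxfun(@rdivide, valueSum, centroidSum);

% Matrix-product version.
b = double(bsxfun(@eq, idx(:), 1:K));           % m-by-K membership matrix
vecCentroids = bsxfun(@rdivide, b' * X, sum(b, 1)');

% Largest absolute difference between the two results.
maxDiff = max(abs(loopCentroids(:) - vecCentroids(:)));
```

A maxDiff at or near zero indicates the two versions compute the same centroids for this input.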

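Finally, I know this isn't strictly an answer, but I don't have enough reputation to add a comment, and I thought the benchmark figures would be useful to anyone who is learning MATLAB (as I am) and needs some extra motivation to master vectorization.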

Vectorize Octave / MATLAB code
