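Speeding up code: memory vs. speed

Time: 2017-06-23 17:44:10

Tags: matlab performance memory-management

I have to create and store several matrices in MATLAB. If I use multidimensional arrays, I run into memory problems. If I use a cell array instead, the code is very slow. How can I improve both computation speed and memory usage?

Here is a simplified version of my code: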

%%%%% MAIN FILE %%%%%%
rng default;  %for reproducibility

%% Parameters
N=20; 
M=400;
R=M/2; 
B=M/2;

%% Generate the matrix data of dimension NM x(1+1+N +(N-1)+(N-1))
data=[kron((1:1:M)', ones(N,1)) repmat((1:1:N)', M,1) randn(M*N, N +(N-1)+(N-1))]; 

%% Generate the matrix unob of dimension NMx(1+1+N-1)xR
unob=[repmat(data(:,1:2),1,1,R)  randn(M*N,N-1,R)]; 

%% Option 1: MEMORY PROBLEM
%bootdata1 and bootunob1 have respectively dimension NMx(1+1+N +(N-1)+(N-1))xB and NMx(1+1+N-1)xRxB
[bootdata1, bootunob1]=boot1(N,M,B,R,data, unob);

%% Option 2: SLOW, NEVER ENDING
%bootdata is a matrix of dimension NMx(1+1+N +(N-1)+(N-1))xB
%bootunob is a cell of dimension Bx1 with bootunob{b} that is an array of dimension NMx(1+1+N-1)xR
[bootdata, bootunob]=boot(N,M,B,R,data, unob) ; 

Function boot1

function [bootdata, bootunob]=boot1(N,M,B,R,data, unob)    

         %Allocate space
         bootdata=zeros(N*M,1+1+N+(N-1)+(N-1), B);
         bootunob=zeros(N*M,(1+1+N-1),R,B); 

         for b=1:B   

             %Draw uniformly at random with replacement M integers from {1,...,M} and 
             %store them into the column vector networkindices of dimension Mx1
             networkindices =randi([1 M],M,1); 

             %Fill bootdata(:,:,b) and bootunob(:,:,:,b)
             for m=1:M
                 bootdata((m-1)*N+1:m*N, :,b)=data(data(:,1)==networkindices(m),:);
                 bootdata((m-1)*N+1:m*N,1,b)=m*(ones(N,1)); %change indices

                 for r=1:R
                     bootunob((m-1)*N+1:m*N, :,r,b)=unob(unob(:,1,r)==networkindices(m),:,r); 
                     bootunob((m-1)*N+1:m*N,1,r,b)=m*(ones(N,1));
                 end
             end


         end
end

Function boot

function [bootdata, bootunob]=boot(N,M,B,R,data, unob)    

         %Allocate space
         bootdata=zeros(N*M,1+1+N+(N-1)+(N-1), B);
         bootunob=cell(B,1);

         for b=1:B
             bootunob{b}=zeros(N*M,(1+1+N-1),R) ;

             %Draw uniformly at random with replacement M integers from {1,...,M} and 
             %store them into the column vector networkindices of dimension Mx1
             networkindices =randi([1 M],M,1); 

             %Fill bootdata(:,:,b) and bootunob{b}
             for m=1:M
                 bootdata((m-1)*N+1:m*N, :,b)=data(data(:,1)==networkindices(m),:);
                 bootdata((m-1)*N+1:m*N,1,b)=m*(ones(N,1)); %change indices

                 for r=1:R
                     bootunob{b}((m-1)*N+1:m*N, :,r)=unob(unob(:,1,r)==networkindices(m),:,r); 
                     bootunob{b}((m-1)*N+1:m*N,1,r)=m*(ones(N,1));
                 end
             end
         end

end

1 Answer:

Answer 0: (score: 1)

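New method

Significantly faster: on my computer the write speed is about 100-300 MB/s. The bottleneck is now a part of the code itself; see the note at the end of this section.

The new bootFaster method: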

function [bootDataFiles, bootUnobFiles] = bootFaster(N, M, B, R, data, unob)
    % Temporary Variables
    % Based on some simple math, these slices should be OKish. This will
    % crash and burn if you try to use the save with files greater than
    % 2^31 bytes, so just be careful with that.
    bootDataTemp = zeros(N*M, 1+1+N+(N-1)+(N-1), 1);
    bootUnobTemp = zeros(N*M, 1+1+(N-1), R, 1);

    % Matrices containing the file names for the .mat files. There are B
    % rows.
    bootDataFiles = zeros(B, 17);
    bootUnobFiles = zeros(B, 17);

    for b = 1:B
        networkIndices = randi([1 M], M, 1);

        for m = 1:M
            bootDataTemp((m-1)*N+1:m*N, :) = data(data(:, 1) == networkIndices(m), :);
            bootDataTemp((m-1)*N+1:m*N, 1) = m * (ones(N, 1)); %change indices

            for r = 1:R
                bootUnobTemp((m-1)*N+1:m*N, :, r) = unob(unob(:, 1, r) == networkIndices(m), :, r); 
                bootUnobTemp((m-1)*N+1:m*N, 1, r) = m * (ones(N, 1));
            end
        end

        % Creates the file name for the bth matrix.
        % NOTE: if you change the 5 it will change the length of each file
        % name, you will have to change the number of columns in
        % bootDataFiles and bootUnobFiles accordingly.
        bootDataFileB = sprintf('bootData%05d.mat', b);  % zero-padded, no spaces in the name
        bootUnobFileB = sprintf('bootUnob%05d.mat', b);

        % Writes the contents of the temporary variables to the file
        save(bootDataFileB, 'bootDataTemp', '-v6');
        save(bootUnobFileB, 'bootUnobTemp', '-v6');

        % Storing the file names.
        bootDataFiles(b, :) = bootDataFileB;
        bootUnobFiles(b, :) = bootUnobFileB;
    end

    % Convert the values back to chars (each row will now be a "string")
    bootDataFiles = char(bootDataFiles);
    bootUnobFiles = char(bootUnobFiles);
end
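
The "simple math" mentioned in the comments of bootFaster can be sketched as follows. This is only a back-of-the-envelope estimate using the parameter values from the question's main file and assuming 8 bytes per double; the variable names are made up for illustration. It shows why holding all B slices at once (as boot1 does) is impractical, while a single slice stays well under the 2^31-byte limit mentioned above.

```matlab
% Parameter values taken from the question's main file.
N = 20; M = 400; R = M/2; B = M/2;
bytesPerDouble = 8;   % assumption: all arrays are of class double

% One bootstrap slice of each array (what bootFaster keeps in memory at a time).
dataSliceBytes = N*M * (1+1+N+(N-1)+(N-1)) * bytesPerDouble;
unobSliceBytes = N*M * (1+1+(N-1)) * R * bytesPerDouble;

% All B slices held at once (what boot1 tries to allocate).
dataTotalBytes = dataSliceBytes * B;
unobTotalBytes = unobSliceBytes * B;

fprintf('bootdata: %.2f MB per slice, %.2f GB in total\n', ...
        dataSliceBytes/1e6, dataTotalBytes/1e9);
fprintf('bootunob: %.2f MB per slice, %.2f GB in total\n', ...
        unobSliceBytes/1e6, unobTotalBytes/1e9);
fprintf('2^31 bytes = %.2f MB\n', 2^31/1e6);
```

With these values a single bootunob slice is roughly 269 MB, while all B slices together come to about 54 GB, which explains the memory problem in Option 1.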

To use this data, you can now use the matfile function, as shown below.

[bootDataFiles, bootUnobFiles] = bootFaster(N,M,B,R,data, unob);

b = 1;  % index of the bootstrap replicate to load (any value in 1..B)
bootDataAtbFile = matfile(bootDataFiles(b, :));
% Note the use of "bootDataTemp" to access the data, you have to use
% the name of the temporary variable that you stored the data in inside
% bootFaster. E.g. to access the bootUnob data you would have to use
% bootUnobTemp, or whatever you choose to rename them to.
bootDataAtb = bootDataAtbFile.bootDataTemp;
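
Since each file holds one complete slice, the replicates can also be processed one at a time so that only a single slice is in memory at once. The sketch below is a hypothetical usage pattern, not part of the original answer: it writes two tiny placeholder files to a temporary folder so that it runs on its own, uses plain load instead of matfile, and computes column means as a stand-in for the real per-replicate computation.

```matlab
% Hypothetical stand-in for the output of bootFaster: two tiny .mat files
% written to a temporary folder so that the sketch is self-contained.
B = 2;
outDir = tempname;
mkdir(outDir);
names = cell(B, 1);
for b = 1:B
    bootDataTemp = b * ones(4, 3);                       % placeholder data
    names{b} = fullfile(outDir, sprintf('bootData%05d.mat', b));
    save(names{b}, 'bootDataTemp', '-v6');
end
bootDataFiles = char(names);   % one file name per row, as bootFaster returns

% Process one replicate at a time; only one slice is in memory at once.
colMeans = zeros(B, 3);
for b = 1:B
    S = load(bootDataFiles(b, :), 'bootDataTemp');       % load just this variable
    colMeans(b, :) = mean(S.bootDataTemp, 1);            % placeholder computation
end

rmdir(outDir, 's');            % clean up the temporary files
```

As far as I know, matfile only avoids reading the whole variable when the file was saved with -v7.3, so for these -v6 files load should do the same job when the whole slice is needed anyway.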

A final note. This line:

bootUnobTemp((m-1)*N+1:m*N, :, r) = unob(unob(:, 1, r) == networkIndices(m), :, r);

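is very slow. For example, with N = 30 and M = 200, this line takes up about 2/3 of the execution time. I would highly recommend checking whether you can refactor it.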
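
One possible refactoring, offered as a sketch rather than a tested drop-in replacement: if the rows of data and unob are grouped by network id in ascending order with exactly N rows per network (which is how the question's main file builds them, but is an assumption about the real data), the logical comparison can be replaced by directly computed row numbers, and the loops over m and r disappear. The names rowIdx, newIds, refData and refUnob are introduced here for illustration only.

```matlab
rng default;
N = 3; M = 5; R = 2;   % small sizes so the comparison runs quickly

% Same layout as in the question: N consecutive rows per network id.
data = [kron((1:M)', ones(N,1)) repmat((1:N)', M, 1) randn(M*N, N+(N-1)+(N-1))];
unob = [repmat(data(:,1:2), 1, 1, R) randn(M*N, N-1, R)];

networkIndices = randi([1 M], M, 1);

% Row numbers of all drawn networks at once: column m holds the N rows of
% network networkIndices(m). bsxfun keeps this usable on older releases.
rowIdx = bsxfun(@plus, (1:N)', (networkIndices' - 1) * N);
rowIdx = rowIdx(:);
newIds = kron((1:M)', ones(N,1));          % relabelled network ids 1..M

bootDataTemp = data(rowIdx, :);
bootDataTemp(:, 1) = newIds;               % change indices
bootUnobTemp = unob(rowIdx, :, :);
bootUnobTemp(:, 1, :) = repmat(newIds, 1, 1, R);

% Reference result from the original nested loops, for comparison.
refData = zeros(size(bootDataTemp));
refUnob = zeros(size(bootUnobTemp));
for m = 1:M
    rows = (m-1)*N+1 : m*N;
    refData(rows, :) = data(data(:,1) == networkIndices(m), :);
    refData(rows, 1) = m;
    for r = 1:R
        refUnob(rows, :, r) = unob(unob(:,1,r) == networkIndices(m), :, r);
        refUnob(rows, 1, r) = m;
    end
end
assert(isequal(bootDataTemp, refData) && isequal(bootUnobTemp, refUnob));
```

If the real data is not sorted this way, sorting it once by the first column before the bootstrap loop would restore the assumption.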

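Old method - too slow in 2015b because of compression

With the matfile function, which lets you write directly to a file, this becomes very easy to implement. You don't have to change your code much. See the code below for how it works.

In your main file, you would do the following: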

file = bootFast(N,M,B,R,data, unob);

bootdata1F = file.bootdata;
bootunob1F = file.bootunob;

bootFast

function file = bootFast(N, M, B, R, data, unob)
% Opens the file, or creates it if it doesn't exist
file = matfile('output', 'Writable', true);

% Sets the sizes of the variables
file.bootdata(N*M, 1+1+N+(N-1)+(N-1), B) = 0;
file.bootunob(N*M, 1+1+(N-1), R, B) = 0;

bootdataTemp = file.bootdata(:, :, 1);
bootunobTemp = file.bootunob(:, :, :, 1);

for b = 1:B
    networkindices = randi([1 M], M, 1);

    for m = 1:M
        bootdataTemp((m-1)*N+1:m*N, :) = data(data(:, 1) == networkindices(m), :);
        bootdataTemp((m-1)*N+1:m*N, 1) = m * (ones(N, 1)); %change indices

        for r = 1:R
            bootunobTemp((m-1)*N+1:m*N, :, r) = unob(unob(:, 1, r) == networkindices(m), :, r); 
            bootunobTemp((m-1)*N+1:m*N, 1, r) = m * (ones(N, 1));
        end
    end
    file.bootdata(:, :, b) = bootdataTemp;
    file.bootunob(:, :, :, b) = bootunobTemp;
end
end

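I limited the disk writes to once per iteration of b in an attempt to speed things up, but you may need to adjust exactly where the writes happen depending on how much memory you have.

P.S. I hope you have an SSD.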