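Best way to append text frequently in MATLAB

Time: 2013-06-28 19:31:49

Tags: matlab

I am writing a MATLAB script that will eventually output hundreds of lines of text to a file. Right now I just keep appending text like this: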

Output = [];
Output = [Output NewText];

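But I think this is inefficient, because it has to create a new matrix every time. What would be a better way?

I can't open the file until I am ready to write all of the text, so I can't just keep calling fprintf on the output file.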

2 answers:

Answer 0: (score: 5)

At least to me, there is no single obvious best answer. Some options are:

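  1. Exactly what you are doing: incrementally appending to the string on each iteration.
  2. Growing your accumulation string intelligently to reduce the number of reallocations (the core of @macduff's answer).
  3. Using a cell array of strings and growing that intelligently. (I'm pretty sure) this only forces a reallocation of the pointers, not a full reallocation of the string contents.
  4. Using some Java magic to handle the string accumulation. The Java libraries have a lot of useful features (such as the StringBuilder class), but the MATLAB-Java interface is slow.
  5. Writing directly to the file incrementally. (I know you ruled this out in the question, but it is still a useful baseline.)

My intuition suggests that the performance order would be:

  • Best: (2 or 3)
  • Medium: (4 or 5)
  • Worst: (1)

But that is not obvious.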

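Fortunately, it is easy to test. Implementations of all 5 options (plus some test wrappers) are included in the large test block below. The results on my computer (a machine with an SSD, so your results may vary) are shown here, with spaces added to the code output for formatting: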

    -------------Start of file write speed tests.  (nLines = 1)------------
    Time for BaseLine operation:                0.001540 sec
    Time for AutoAllocate operation:            0.001264 sec
    Time for AutoAllocateCell operation:        0.003492 sec
    Time for JavaStringBuilder operation:       0.001395 sec
    Time for IncrementalWriteToFile operation:  0.001057 sec
    -------------Start of file write speed tests.  (nLines = 100)------------
    Time for BaseLine operation:                0.011909 sec
    Time for AutoAllocate operation:            0.014067 sec
    Time for AutoAllocateCell operation:        0.011517 sec
    Time for JavaStringBuilder operation:       0.021291 sec
    Time for IncrementalWriteToFile operation:  0.016213 sec
    -------------Start of file write speed tests.  (nLines = 10000)------------
    Time for BaseLine operation:                3.778957 sec
    Time for AutoAllocate operation:            1.048480 sec
    Time for AutoAllocateCell operation:        0.856269 sec
    Time for JavaStringBuilder operation:       1.657038 sec
    Time for IncrementalWriteToFile operation:  1.254080 sec
    -------------Start of file write speed tests.  (nLines = 100000)------------
    Time for BaseLine operation:              358.312820 sec
    Time for AutoAllocate operation:           10.349529 sec
    Time for AutoAllocateCell operation:        8.539117 sec
    Time for JavaStringBuilder operation:      16.520797 sec
    Time for IncrementalWriteToFile operation: 12.259307 sec
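One way to read the BaseLine numbers: if every append copies the whole string built so far (an assumption about how the concatenation behaves, in line with the suspicion in the question), the total amount of copying grows with the square of the line count. A rough sketch of that arithmetic follows; avgLen is an illustrative guess, not a measured value, and it cancels out of the ratio anyway.

```matlab
% Rough cost model (an assumption, not a measurement): each append copies
% the entire string accumulated so far.
avgLen = 58;                 % illustrative average line length in characters
nLines = [10000 100000];     % the two largest test sizes

% Characters copied in total: avgLen * (1 + 2 + ... + n)
copied = avgLen .* nLines .* (nLines + 1) ./ 2;

% Predicted slowdown when the line count grows by a factor of 10
ratio = copied(2) / copied(1);
fprintf('Predicted slowdown for 10x more lines: about %.0fx\n', ratio);
```

In this model, going from 10,000 to 100,000 lines costs roughly 100 times more, which is in the same range as the measured BaseLine jump from about 3.8 sec to about 358 sec.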
    

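So, if you are dealing with "hundreds" of lines, it probably doesn't matter; do whatever works. If you know that performance matters, then I would use the "AutoAllocateCell" option. It is pretty simple code (see below). If you don't have enough memory to hold the entire file in memory at once, I would use the "AutoAllocateCell" option and flush to the file periodically.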
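A minimal standalone sketch of the cell-array option, adapted to the question's constraint that the file cannot be opened until the end: the lines are collected in a cell array and joined into a single string only once. The names nLines, makeLine, parts, and nUsed are illustrative placeholders, not names from the question.

```matlab
% Sketch of the cell-array approach; makeLine stands in for whatever
% produces each line of text.
nLines   = 500;
makeLine = @(n) sprintf('line %d\n', n);

parts = cell(256, 1);        % preallocate slots for 256 lines
nUsed = 0;                   % number of slots filled so far
for ix = 1:nLines
    if nUsed == numel(parts)
        parts{2 * numel(parts)} = [];   % double the number of slots
    end
    nUsed = nUsed + 1;
    parts{nUsed} = makeLine(ix);
end

% One concatenation at the end instead of one per line
Output = [parts{1:nUsed}];
```

Output is then an ordinary char row vector and can be written with a single fprintf(fid, '%s', Output) once the file can be opened.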


Test code:

    %Setup
    cd(tempdir);
    createLineLine = @(n, s) sprintf('[%04d]  %s\n', n, s);
    createRandomLine = @(n) createLineLine(n, char(randi([65 122],[1, round(rand*100)])));
    
    for nLines = [1 100 10000 100000]        
        fprintf(1, ['-------------Start of file write speed tests.  (nLines = ' num2str(nLines) ')------------\n']);
    
        %% Baseline -----------------------------
        strName = 'BaseLine';
        rng(28375213)
        tic;
    
        str = [];
        for ix = 1:nLines;
            str = [str createRandomLine(ix)];
        end
    
        fid = fopen(['WriteTest_' strName],'w');
        fprintf(fid, '%s', str);
        fclose(fid);
    
        fprintf(1, 'Time for %s operation: %f sec\n', strName, toc);
    
        %% AutoAllocated string -----------------------------
        strName = 'AutoAllocate';
        rng(28375213)
        tic;
    
        str = blanks(256);
        ixLastValid = 0;
        for ix = 1:nLines;
            strNewLine = createRandomLine(ix);
            while (ixLastValid+length(strNewLine)) > length(str)
                str(end*2) = ' ';  %Doubles length of string
            end
            str(ixLastValid + (1:length(strNewLine))) = strNewLine;
            ixLastValid = ixLastValid+length(strNewLine);
        end
    
        fid = fopen(['WriteTest_' strName],'w');
        fprintf(fid, '%s', str(1:ixLastValid));
        fclose(fid);
    
        fprintf(1, 'Time for %s operation: %f sec\n', strName, toc);
    
        %% AutoAllocated cell array -----------------------------
        strName = 'AutoAllocateCell';
        rng(28375213)
        tic;
    
        strs = cell(256,1);
        ixLastValid = 0;
        for ix = 1:nLines;
            if ix>length(strs);
                strs{end*2} = {};  %Doubles cell array size;
            end
            strs{ix} = createRandomLine(ix);
            ixLastValid = ixLastValid + 1;
        end
    
        fid = fopen(['WriteTest_' strName],'w');
        fprintf(fid, '%s', strs{1:ixLastValid});
        fclose(fid);
    
        fprintf(1, 'Time for %s operation: %f sec\n', strName, toc);
    
        %% Java string builder -----------------------------
        strName = 'JavaStringBuilder';
        rng(28375213)
        tic;
    
        sBuilder = java.lang.StringBuilder;
        for ix = 1:nLines;
            sBuilder.append(createRandomLine(ix));
        end
    
        fid = fopen(['WriteTest_' strName],'w');
        fprintf(fid, '%s', char(sBuilder.toString()));
        fclose(fid);
    
        fprintf(1, 'Time for %s operation: %f sec\n', strName, toc);
    
        %% Incremental write to file -----------------------------
        strName = 'IncrementalWriteToFile';
        rng(28375213)
        tic;
    
        fid = fopen(['WriteTest_' strName],'w');
        for ix = 1:nLines;
            fprintf(fid, '%s', createRandomLine(ix));
        end
        fclose(fid);
    
        fprintf(1, 'Time for %s operation: %f sec\n', strName, toc);
    end
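For the low-memory case mentioned above (cell array plus periodic flushes), a sketch along these lines may help. It only applies once the output file can actually be opened. The file name, bufSize, and makeLine are illustrative assumptions.

```matlab
% Sketch: buffer lines in a fixed-size cell array and write the buffer to
% disk whenever it fills up, so memory use stays bounded.
outFile  = fullfile(tempdir, 'flush_sketch.txt');   % hypothetical file name
nLines   = 1000;
bufSize  = 256;                                     % lines held in memory
makeLine = @(n) sprintf('line %d\n', n);

fid = fopen(outFile, 'w');
if fid == -1
    error('Could not open %s for writing.', outFile);
end

buf   = cell(bufSize, 1);
nUsed = 0;
for ix = 1:nLines
    nUsed = nUsed + 1;
    buf{nUsed} = makeLine(ix);
    if nUsed == bufSize
        fprintf(fid, '%s', buf{1:nUsed});   % flush the full buffer
        nUsed = 0;                          % reuse the same slots
    end
end
if nUsed > 0
    fprintf(fid, '%s', buf{1:nUsed});       % flush the remaining lines
end
fclose(fid);
```

The buffer never grows here, so there is no reallocation at all; the trade-off is one fprintf call per bufSize lines instead of a single call at the end.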
    

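Answer 1: (score: 1)

As Oli said in the comments on this question, a character string is a row vector, so any technique that works for a row vector will work for a string. Start by allocating what you think is reasonable, say 1000 characters; then, if you exceed the limit, double the size or use a growth rule of your own.

Here is a trite example: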

testStrings = {['Is this a dagger which I see before me,' sprintf('\n') ],...
['The handle toward my hand? Come, let me clutch thee.' sprintf('\n') ],...
['I have thee not, and yet I see thee still.' sprintf('\n') ],...
['Art thou not, fatal vision, sensible' sprintf('\n') ],...
['To feeling as to sight? or art thou but' sprintf('\n') ],...
['A dagger of the mind, a false creation,' sprintf('\n') ],...
['Proceeding from the heat-oppressed brain?' sprintf('\n') ],...
['I see thee yet, in form as palpable' sprintf('\n') ],...
['As this which now I draw.' sprintf('\n') ],...
['Thou marshall''st me the way that I was going;' sprintf('\n') ],...
['And such an instrument I was to use.' sprintf('\n') ],...
['Mine eyes are made the fools o'' the other senses,' sprintf('\n') ],...
['Or else worth all the rest; I see thee still,' sprintf('\n') ],...
['And on thy blade and dudgeon gouts of blood,' sprintf('\n') ],...
['Which was not so before. There''s no such thing:' sprintf('\n') ],...
['It is the bloody business which informs' sprintf('\n') ],...
['Thus to mine eyes. Now o''er the one halfworld' sprintf('\n') ],...
['Nature seems dead, and wicked dreams abuse' sprintf('\n') ],...
['The curtain''d sleep; witchcraft celebrates' sprintf('\n') ],...
['Pale Hecate''s offerings, and wither''d murder,' sprintf('\n') ],...
['Alarum''d by his sentinel, the wolf,' sprintf('\n') ],...
['Whose howl''s his watch, thus with his stealthy pace.' sprintf('\n') ],...
['With Tarquin''s ravishing strides, towards his design' sprintf('\n') ],...
['Moves like a ghost. Thou sure and firm-set earth,' sprintf('\n') ],...
['Hear not my steps, which way they walk, for fear' sprintf('\n') ],...
'Thy very stones prate of my whereabout,'};

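A = zeros(1,1000);   % preallocated buffer of character codes

idx = 1;             % next free position in A
for ii=1:length(testStrings)
  str = testStrings{ii};
  N = length(str);
  eIdx = idx+N-1;
  % Grow until the new text fits (a loop, in case one doubling is not enough)
  while( eIdx > length(A) )
    A = [ A zeros(1,length(A)) ];   % double the buffer size
  end
  A( idx:eIdx ) = str;
  idx = idx + N;
end
% Print only the filled part; the unused tail is still zeros
fprintf('%s',char(A(1:idx-1)))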