Matlab: how to optimize code with a large number of expressions

Time: 2015-11-12 10:36:53

Tags: matlab optimization

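I have an array of strings, TrajCompact (106x1). For each string, I want to find '00', '11', '22', '33', '44', '12', '21', and so on, and count them. I wrote code that gives me the correct answer, but it is not efficient: it is long and not clever. My code is: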

for k=1:106
%% Determine the diagonal elements of the matrix
matchstarts_00(k) = regexp(TrajCompact(k,1), '0.+?0'); %for '00'
matchcounts_00(k) = cellfun(@numel, matchstarts_00(k));

matchstarts_11(k) = regexp(TrajCompact(k,1), '1.+?1'); %for '11'
matchcounts_11(k) = cellfun(@numel, matchstarts_11(k));

matchstarts_22(k) = regexp(TrajCompact(k,1), '2.+?2'); %for '22'
matchcounts_22(k) = cellfun(@numel, matchstarts_22(k));

matchstarts_33(k) = regexp(TrajCompact(k,1), '3.+?3'); %for '33'
matchcounts_33(k) = cellfun(@numel, matchstarts_33(k));

matchstarts_44(k) = regexp(TrajCompact(k,1), '4.+?4'); %for '44'
matchcounts_44(k) = cellfun(@numel, matchstarts_44(k));
end

count_00=sum(matchcounts_00);    
count_11=sum(matchcounts_11);
count_22=sum(matchcounts_22);
count_33=sum(matchcounts_33);
count_44=sum(matchcounts_44);

Can you give me suggestions for optimizing it? Thanks.

2 answers:

Answer 0 (score: 1)

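You can use findstr instead of a regex for each pattern. findstr does a plain substring search rather than regex matching, and it can be applied to every cell with cellfun or to one concatenated char array, which removes the need for the explicit for loop. (Newer MATLAB releases mark findstr as not recommended; strfind(text, pattern) is the documented replacement.) There are a few ways to go.

Say your cell array is: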

Test = {'020103101adfiohjas','020123101a01dfasad'}

Method 1:

tic
A = cellfun(@(x) size(findstr(x,'01'),2), Test, 'UniformOutput', 0)
total = sum(cell2mat(A))
toc

Elapsed time is 0.015609 seconds.

Method 2:

T = char([Test{:}])
tic
total = size(findstr(T,'01'),2)
toc

Elapsed time is 0.007199 seconds.

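Method 2 requires you to collapse all your cells into one long char array, and then uses findstr to quickly count the occurrences of '01' in it. Note that concatenation can create a spurious match where one string ends with '0' and the next begins with '1'.

Method 1 is more flexible: you can use it to obtain the indices of '01' in each string, and so on. Judging by the tic/toc timings, though, method 2 is faster.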
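As a rough sketch of how method 1 might extend to several patterns at once, the snippet below loops over a list of substrings and uses strfind, which accepts a cell array of character vectors directly. The sample data and the `patterns` list are illustrative assumptions, not part of the original question.

```matlab
% Hypothetical sample data standing in for TrajCompact
Test = {'020103101adfiohjas'; '020123101a01dfasad'};
patterns = {'01', '10', '02'};            % illustrative list of substrings

totals = zeros(size(patterns));
for p = 1:numel(patterns)
    % strfind on a cell array returns one vector of start indices per string
    idx = strfind(Test, patterns{p});
    % count the hits in every string and add them up
    totals(p) = sum(cellfun(@numel, idx));
end
disp(totals)
```

Because each string is searched separately, this avoids matches that straddle the boundary between two strings.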

Answer 1 (score: 1)

regexp accepts multiple strings to match against if you specify them as a cell array. In fact, you're already passing it a one-element cell array by passing it TrajCompact(k,1) rather than TrajCompact{k,1}. If you had used the latter, you would have been able to just use numel rather than cellfun to get the counts. But since you actually do have multiple strings to match against you can do them all in one go:

matchstarts_00 = regexp(TrajCompact, '0.+?0');
matchcounts_00 = cellfun(@numel, matchstarts_00);
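To illustrate the difference between parenthesis and brace indexing described above, here is a minimal sketch on made-up data (the two sample strings are assumptions chosen only for illustration):

```matlab
% Hypothetical sample data standing in for TrajCompact
TrajCompact = {'0120'; '0102010'};

% Parenthesis indexing passes a 1x1 cell, so regexp returns a 1x1 cell
s1 = regexp(TrajCompact(1,1), '0.+?0');

% Brace indexing passes the char vector, so regexp returns start indices
s2 = regexp(TrajCompact{1,1}, '0.+?0');
n  = numel(s2);                       % count for the first string

% Passing the whole cell array handles every string in one call
counts = cellfun(@numel, regexp(TrajCompact, '0.+?0'));
```

Here `counts` holds one entry per string, so `sum(counts)` plays the role of `count_00` in the question.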

regexp also accepts a cell array of expressions, but it then pairs the expressions with the strings element by element. Because you want a result for every combination of an element of TrajCompact and a given regular expression, you'll need a loop over the expressions.

Your examples only create counts for pairs of indices on the diagonal of a matrix, but your question suggests you want all pairs of indices for a matrix. Either way, it's tidy to set up arrays with one element for each regular expression, and one array for each digit, stored as a cell array. For all combinations (including '12' '21', etc):

[digits{1:2}] = ndgrid(0:4);

or for diagonals:

[digits{1:2}] = deal(0:4);

You can then loop over that array to assemble the regular expression for each combination and get the counts. I'm assuming the "starts" array doesn't actually need to be recalled, so rather than storing that in a cell array I'm overwriting the same variable for each iteration, but it should be clear how you would store that in a cell array as well if need be.

matchcounts = cell(size(digits{1}));
for k = 1:numel(matchcounts)
    matchstarts = regexp(TrajCompact, sprintf('%d.+?%d', digits{1}(k), digits{2}(k)));
    matchcounts{k} = cellfun(@numel,matchstarts);
end
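If only the totals are needed (the equivalent of count_00, count_11, and so on in the question), the per-string counts can be summed afterwards. The following is a self-contained sketch on made-up data; the sample strings and the `totals` variable are assumptions added for illustration.

```matlab
% Hypothetical sample data standing in for TrajCompact
TrajCompact = {'0120'; '0102010'};

digits = cell(1, 2);
[digits{1:2}] = ndgrid(0:4);          % all combinations of first/second digit

matchcounts = cell(size(digits{1}));
for k = 1:numel(matchcounts)
    expr = sprintf('%d.+?%d', digits{1}(k), digits{2}(k));
    matchcounts{k} = cellfun(@numel, regexp(TrajCompact, expr));
end

% totals(i+1, j+1) is the summed count for the pattern 'i.+?j'
totals = cellfun(@sum, matchcounts);
```

With this layout, `totals(1,1)` corresponds to count_00 and the diagonal of `totals` to the five counts computed in the question.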