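Finding close (or similar) values across four matrices in MATLAB

Asked: 2013-08-20 14:22:58

Tags: matlab fuzzy-search

I am updating some inherited legacy code and have found an interesting "approach" to a problem that I have not been able to optimize.

There is a set of data that can be simulated with the following: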

approximateEntries = 5e6;
maxEntryValue = 5e12;
maxExtraEvents = 10e3;

event1 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event1 = sort(event1);
event2 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event2 = sort(event2);
event3 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event3 = sort(event3);
event4 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event4 = sort(event4);

data = [[ones(length(event1),1); 2*ones(length(event2),1); 3*ones(length(event3),1); 4*ones(length(event4),1)], ...
       [event1'; event2'; event3'; event4']];
data = sortrows(data,2);

clear approximateEntries;
clear maxEntryValue;
clear maxExtraEvents;

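That is, because of the way the code was written previously, I have access to four arrays of roughly five million elements each, plus one large array that is the four arrays concatenated, sorted, and tagged in another column with the array each element originally came from. If I can avoid using the single large array while solving this problem, it can be removed from the code, since it is not needed afterwards.

We want to find different characteristics of the approximate (to within +/- 1000) matches between values in the four "event" matrices. The values will not necessarily be at the same indices, but each "event" matrix will be sorted in ascending order. We are looking for:

  • The total number of times a value in each "event" matrix is unique to that matrix within a range of +/- 1000.
  • The total number of times values occur within +/- 1000 of each other in only two matrices, with a separate total for each pair of matrices.
  • The total number of times values occur within +/- 1000 of each other in only three matrices, with a separate total for each triplet of matrices.
  • The total number of times a value occurs within +/- 1000 in all four matrices.

The legacy code attempts to do this in the following incredible way (I know it is not entirely correct, which is why it is being updated):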

onlyEvent1 = 0;
onlyEvent2 = 0;
onlyEvent3 = 0;
onlyEvent4 = 0;
onlyEvent1Event2 = 0;
onlyEvent1Event3 = 0;
onlyEvent1Event4 = 0;
onlyEvent2Event3 = 0;
onlyEvent2Event4 = 0;
onlyEvent3Event4 = 0;
onlyEvent1Event2Event3 = 0;
onlyEvent1Event2Event4 = 0;
onlyEvent1Event3Event4 = 0;
onlyEvent2Event3Event4 = 0;
allEvents = 0;

maxIndex = length(data);
workingIndex = 1;

fullGate = 2000;

while workingIndex < maxIndex 
    subIndex = 0;

    withinRange = true;

    % Prepare a buffer
    buffer = zeros(200,2);

    while ((workingIndex + subIndex) < maxIndex) && (withinRange == true)

        if subIndex > 0

            timeDifference = data(workingIndex + subIndex,2) - buffer(subIndex,2);

            if timeDifference <= fullGate
                buffer(subIndex + 1,:) = data(workingIndex + subIndex,:);
                subIndex = subIndex + 1; 
            else
                withinRange = false;
            end % if

        else

            buffer(subIndex + 1,:) = data(workingIndex + subIndex,:);
            subIndex = subIndex + 1; 

        end % if      

    end % while

        event1 = false;
        event2 = false;
        event3 = false;
        event4 = false;

        compareIndex = 1;

        while (buffer(compareIndex) ~= 0) && (compareIndex < length(buffer))
            if buffer(compareIndex,1) == 1
                event1 = true; 
            else
                if buffer(compareIndex,1) == 2
                    event2 = true; 
                else
                    if buffer(compareIndex,1) == 3
                        event3 = true; 
                    else
                        % Should really only be four
                       event4 = true; 
                    end % if is 3
                end % if is 2
            end % if is 1
            compareIndex = compareIndex + 1;
        end % while buffer

        if (event1 == true) && (event2 == true) && (event3 == true) && (event4 == true)
            allEvents = allEvents + 1;
        else
            if (event1 == true) && (event2 == true) && (event3 == true) && (event4 == false)
                onlyEvent1Event2Event3 = onlyEvent1Event2Event3 + 1;
            else
                if (event1 == true) && (event2 == true) && (event3 == false) && (event4 == true)
                    onlyEvent1Event2Event4 = onlyEvent1Event2Event4 + 1;
                else
                    if (event1 == true) && (event2 == false) && (event3 == true) && (event4 == true)
                        onlyEvent1Event3Event4 = onlyEvent1Event3Event4 + 1;
                    else
                        if (event1 == false) && (event2 == true) && (event3 == true) && (event4 == true)
                            onlyEvent2Event3Event4 = onlyEvent2Event3Event4 + 1;
                        else
                            if (event1 == true) && (event2 == true) && (event3 == false) && (event4 == false)
                                onlyEvent1Event2 = onlyEvent1Event2 + 1;
                            else
                                if (event1 == true) && (event2 == false) && (event3 == true) && (event4 == false)
                                    onlyEvent1Event3 = onlyEvent1Event3 + 1;
                                else
                                    if (event1 == true) && (event2 == false) && (event3 == false) && (event4 == true)
                                        onlyEvent1Event4 = onlyEvent1Event4 + 1;
                                    else
                                        if (event1 == false) && (event2 == true) && (event3 == true) && (event4 == false)
                                            onlyEvent2Event3 = onlyEvent2Event3 + 1;
                                        else
                                            if (event1 == false) && (event2 == true) && (event3 == false) && (event4 == true)
                                                onlyEvent2Event4 = onlyEvent2Event4 + 1;
                                            else
                                                if (event1 == false) && (event2 == false) && (event3 == true) && (event4 == true)
                                                    onlyEvent3Event4 = onlyEvent3Event4 + 1;
                                                else
                                                    if (event1 == true) && (event2 == false) && (event3 == false) && (event4 == false)
                                                        onlyEvent1 = onlyEvent1 + 1;
                                                    else
                                                        if (event1 == false) && (event2 == true) && (event3 == false) && (event4 == false)
                                                            onlyEvent2 = onlyEvent2 + 1;
                                                        else
                                                            if (event1 == false) && (event2 == false) && (event3 == true) && (event4 == false)
                                                                onlyEvent3 = onlyEvent3 + 1;
                                                            else
                                                                onlyEvent4 = onlyEvent4 + 1;
                                                            end % if 3
                                                        end % if 2
                                                    end % if 1
                                                end % if 3&4
                                            end % if 2&4
                                        end % if 2&3
                                    end % if 1&4
                                end % if 1&3
                            end % if 1&2
                        end % if 2,3&4
                    end % if 1,3&4
                end % if 1,2&4
            end % if 1,2&3
        end % if all events

        workingIndex = workingIndex + subIndex;

end % while
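
For readers following the loop above: it chains consecutive rows of the sorted data whose gap is at most fullGate, records which sources appear in each chain, and increments one of fifteen counters. A minimal vectorized sketch of that idea follows. The small data array is a hypothetical stand-in, and the names groupId, present, mask and counts are illustrative, not part of the legacy code. It is not a drop-in replacement: for instance, the legacy loop never reads the last row of data because of its `< maxIndex` conditions, and that behavior is not reproduced here.

```matlab
% Hypothetical stand-in for the merged, time-sorted "data" array:
% column 1 = source label (1..4), column 2 = event time.
data = [1 100; 2 200; 3 5000; 3 5100; 4 5200; 1 20000];
fullGate = 2000;

labels = data(:,1);
times  = data(:,2);

% A new group starts wherever the gap to the previous event exceeds the gate.
groupId = cumsum([1; diff(times) > fullGate]);
nGroups = groupId(end);

% present(g,k) is true when group g contains at least one event from source k.
present = accumarray([groupId, labels], 1, [nGroups, 4]) > 0;

% Encode each group's combination as a bitmask: 1, 2, 4, 8 for events 1..4.
mask = double(present) * [1; 2; 4; 8];

% counts(m) = number of groups whose combination has bitmask m (1..15).
counts = accumarray(mask, 1, [15, 1]);
```

Under this encoding counts(1) plays the role of onlyEvent1, counts(3) of onlyEvent1Event2, and counts(15) of allEvents, so the nested if chain becomes a single lookup by bitmask. The sketch still relies on the merged array, so it does not address the wish to drop that array.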

I have come up with the basis of an approach:

temp=bsxfun(@plus, event1', -1000:1000);
matchOneTwo=sum(ismember(event2,temp));
matchOneThree=sum(ismember(event3,temp));
matchOneFour=sum(ismember(event4,temp));
temp=bsxfun(@plus, event2', -1000:1000);

...and so on. However, this fails because there is not enough memory to generate "temp". Can anyone suggest another approach?

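[Edit] Here is a small-scale example, as requested by Dennis Jaheruddin. It is hand-made and is not real data. The periodicity is only there to help the reader see which events are used in the comparisons.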

event1=[125, 1500, 5000, 15000, 22349,25000, 35000, 45000, 55000, 60325, 65000, 75000, 85000, 91117, 95000];
event2=[1750, 7000, 17000, 21562, 27000, 37000, 47000, 57000, 60256, 67000, 77000, 87000, 97000];
event3=[1126, 9000, 19000, 29000, 30130, 39000, 49000, 59000, 69000, 79000, 89000, 91560, 99000, 120000];
event4=[1, 1975, 11000, 21000, 31000, 31159, 41000, 51000, 60112, 61000, 71000, 81000, 91000, 91001, 101000, 130000];

Then:

  • Match of 1, 2, 3 & 4 = 1
  • Match of 1 & 2 = 1
  • Match of 3 & 4 = 1
  • Match of 1, 2 & 4 = 1
  • Match of 1, 3 & 4 = 1
  • All the rest have no matches.

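1 Answer:

Answer 0 (score: 0)

Based on the question and the comments, I think the key is the ismemberf File Exchange submission.

It lets you check which values in one set occur in another set within a given tolerance. For example: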

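% Requires ismemberf from the MATLAB File Exchange (it is not a built-in function).
% How many elements of event1 are within 1000 of some element of event2?
% 3 elements
sum(ismemberf(event1,event2,'tol',1000))
% At how many positions i are event1(i) and event2(i) both within 1000 of
% some element of event4? (event1 is truncated to 13 elements to match event2.)
% At exactly 1 position
sum(ismemberf(event1(1:13),event4,'tol',1000)&ismemberf(event2,event4,'tol',1000))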

This may not be everything you need, but it should be easy to build up from here.
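
If installing the File Exchange submission is not an option, a similar tolerance check can be sketched with built-in functions only. The sketch below assumes the second array is sorted and contains no repeated values (true for the sorted randperm output in the question) and uses interp1 with the 'nearest' method to find, for each element of one array, the closest element of the other. The helper name hasNeighbour is hypothetical. Memory use stays proportional to the array length, whereas the bsxfun expansion in the question would need about 5e6 x 2001 x 8 bytes, roughly 80 GB, for "temp".

```matlab
% Values copied from the small hand-made example in the question.
event1 = [125, 1500, 5000, 15000, 22349, 25000, 35000, 45000, 55000, ...
          60325, 65000, 75000, 85000, 91117, 95000];
event2 = [1750, 7000, 17000, 21562, 27000, 37000, 47000, 57000, ...
          60256, 67000, 77000, 87000, 97000];
tol = 1000;

% Assumption: b is sorted and has no repeated values, because interp1
% needs distinct sample points. For every element of a, look up the
% nearest element of b and test whether it lies within the tolerance.
hasNeighbour = @(a, b, tol) abs(a - interp1(b, b, a, 'nearest', 'extrap')) <= tol;

closeTo2 = hasNeighbour(event1, event2, tol);   % logical, same size as event1
nClose = sum(closeTo2);                         % 3 for these inputs
```

For these inputs nClose is 3, which agrees with the count reported for the ismemberf example above. Logical vectors obtained this way for each pair of arrays can be combined with & and ~ to count elements matched in some arrays but not in others.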