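I am updating some inherited legacy code and have found an interesting "approach" to a problem that I have not been able to optimize.
The data set can be simulated as follows: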
approximateEntries = 5e6;
maxEntryValue = 5e12;
maxExtraEvents = 10e3;
event1 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event1 = sort(event1);
event2 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event2 = sort(event2);
event3 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event3 = sort(event3);
event4 = randperm( maxEntryValue, approximateEntries + randi( maxExtraEvents ) );
event4 = sort(event4);
data = [[ones(length(event1),1); 2*ones(length(event2),1); 3*ones(length(event3),1); 4*ones(length(event4),1)], ...
[event1'; event2'; event3'; event4']];
data = sortrows(data,2);
clear approximateEntries;
clear maxEntryValue;
clear maxExtraEvents;
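That is, because of how the code was written previously, I have access to four arrays of roughly five million elements each, plus one large array that concatenates the four, sorted by value, with another column marking which array each element originally came from. If I can solve this problem without the single large array, it can be removed from the code, since it is not needed later.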
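We want to characterize the approximate matches (to within +/- 1000) between values in the four "event" arrays. Matching values will not necessarily sit at the same index, but each "event" array is sorted in ascending order. We are looking for the number of matches for each combination of arrays: values found in only one array, in exactly two, in exactly three, or in all four.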
The legacy code attempts this in the following unbelievable way (I know it is not entirely correct, which is why it is being updated):
onlyEvent1 = 0;
onlyEvent2 = 0;
onlyEvent3 = 0;
onlyEvent4 = 0;
onlyEvent1Event2 = 0;
onlyEvent1Event3 = 0;
onlyEvent1Event4 = 0;
onlyEvent2Event3 = 0;
onlyEvent2Event4 = 0;
onlyEvent3Event4 = 0;
onlyEvent1Event2Event3 = 0;
onlyEvent1Event2Event4 = 0;
onlyEvent1Event3Event4 = 0;
onlyEvent2Event3Event4 = 0;
allEvents = 0;
maxIndex = length(data);
workingIndex = 1;
fullGate = 2000;
while workingIndex < maxIndex
subIndex = 0;
withinRange = true;
% Prepare a buffer
buffer = zeros(200,2);
while ((workingIndex + subIndex) < maxIndex) && (withinRange == true)
if subIndex > 0
timeDifference = data(workingIndex + subIndex,2) - buffer(subIndex,2);
if timeDifference <= fullGate
buffer(subIndex + 1,:) = data(workingIndex + subIndex,:);
subIndex = subIndex + 1;
else
withinRange = false;
end % if
else
buffer(subIndex + 1,:) = data(workingIndex + subIndex,:);
subIndex = subIndex + 1;
end % if
end % while
event1 = false;
event2 = false;
event3 = false;
event4 = false;
compareIndex = 1;
while (buffer(compareIndex) ~= 0) && (compareIndex < length(buffer))
if buffer(compareIndex,1) == 1
event1 = true;
else
if buffer(compareIndex,1) == 2
event2 = true;
else
if buffer(compareIndex,1) == 3
event3 = true;
else
% Should really only be four
event4 = true;
end % if is 3
end % if is 2
end % if is 1
compareIndex = compareIndex + 1;
end % while buffer
if (event1 == true) && (event2 == true) && (event3 == true) && (event4 == true)
allEvents = allEvents + 1;
else
if (event1 == true) && (event2 == true) && (event3 == true) && (event4 == false)
onlyEvent1Event2Event3 = onlyEvent1Event2Event3 + 1;
else
if (event1 == true) && (event2 == true) && (event3 == false) && (event4 == true)
onlyEvent1Event2Event4 = onlyEvent1Event2Event4 + 1;
else
if (event1 == true) && (event2 == false) && (event3 == true) && (event4 == true)
onlyEvent1Event3Event4 = onlyEvent1Event3Event4 + 1;
else
if (event1 == false) && (event2 == true) && (event3 == true) && (event4 == true)
onlyEvent2Event3Event4 = onlyEvent2Event3Event4 + 1;
else
if (event1 == true) && (event2 == true) && (event3 == false) && (event4 == false)
onlyEvent1Event2 = onlyEvent1Event2 + 1;
else
if (event1 == true) && (event2 == false) && (event3 == true) && (event4 == false)
onlyEvent1Event3 = onlyEvent1Event3 + 1;
else
if (event1 == true) && (event2 == false) && (event3 == false) && (event4 == true)
onlyEvent1Event4 = onlyEvent1Event4 + 1;
else
if (event1 == false) && (event2 == true) && (event3 == true) && (event4 == false)
onlyEvent2Event3 = onlyEvent2Event3 + 1;
else
if (event1 == false) && (event2 == true) && (event3 == false) && (event4 == true)
onlyEvent2Event4 = onlyEvent2Event4 + 1;
else
if (event1 == false) && (event2 == false) && (event3 == true) && (event4 == true)
onlyEvent3Event4 = onlyEvent3Event4 + 1;
else
if (event1 == true) && (event2 == false) && (event3 == false) && (event4 == false)
onlyEvent1 = onlyEvent1 + 1;
else
if (event1 == false) && (event2 == true) && (event3 == false) && (event4 == false)
onlyEvent2 = onlyEvent2 + 1;
else
if (event1 == false) && (event2 == false) && (event3 == true) && (event4 == false)
onlyEvent3 = onlyEvent3 + 1;
else
onlyEvent4 = onlyEvent4 + 1;
end % if 3
end % if 2
end % if 1
end % if 3&4
end % if 2&4
end % if 2&3
end % if 1&4
end % if 1&3
end % if 1&2
end % if 2,3&4
end % if 1,3&4
end % if 1,2&4
end % if 1,2&3
end % if all events
workingIndex = workingIndex + subIndex;
end % while
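For readers skimming the loop above: as far as it can be read, it chains consecutive rows of data into one group while the gap to the previous value is at most fullGate, then records which of the four arrays contributed to that group. The following is a minimal vectorized sketch of that reading, not the original author's method. The tiny data matrix is hypothetical, and the sketch ignores the 200-row buffer cap and the fact that the loop never reaches the last row.

```matlab
% Tiny hypothetical input: column 1 = source array (1..4), column 2 = value,
% already sorted by value (same layout as data above).
data = [1 100; 2 900; 3 5000; 1 9000; 4 9500; 2 10500; 4 20000];
fullGate = 2000;

% A new group starts wherever the gap to the previous value exceeds the gate.
groupId = cumsum([1; diff(data(:,2)) > fullGate]);
numGroups = groupId(end);

% present(g,k) is true when group g contains at least one value from array k.
present = accumarray([groupId, data(:,1)], 1, [numGroups 4]) > 0;

% Encode each group's combination as a bitmask:
% event1 = 1, event2 = 2, event3 = 4, event4 = 8.
mask = double(present) * [1; 2; 4; 8];

% counts(m) = number of groups whose combination has bitmask m (1..15).
counts = accumarray(mask, 1, [15 1]);
```

Under this encoding counts(15) plays the role of allEvents, counts(1) of onlyEvent1, counts(3) of onlyEvent1Event2, and so on, which replaces the fifteen separate counters and the nested if chain.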
I have come up with the basis of an approach:
temp=bsxfun(@plus, event1', -1000:1000);
matchOneTwo=sum(ismember(event2,temp));
matchOneThree=sum(ismember(event3,temp));
matchOneFour=sum(ismember(event4,temp));
temp=bsxfun(@plus, event2', -1000:1000);
...and so on, but this fails because there is not enough memory to generate "temp". Can anyone help with another approach?
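A rough back-of-the-envelope check shows why "temp" cannot be built. The sketch below assumes double precision (8 bytes per element) and about 5e6 entries in event1, as in the simulation above; the variable names are illustrative only.

```matlab
% Rough memory estimate for temp = bsxfun(@plus, event1', -1000:1000)
numRows = 5e6;                    % approximate length of event1 (assumption)
numCols = numel(-1000:1000);      % one column per offset in the +/- 1000 window
bytesPerDouble = 8;               % double precision

tempGB = numRows * numCols * bytesPerDouble / 1e9;
fprintf('temp would need roughly %.1f GB\n', tempGB);
```

With these assumptions the matrix alone comes to about 80 GB, before ismember makes any working copies, so any workable approach has to avoid expanding the tolerance window explicitly.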
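[Edit] Providing a small-scale example, as requested by Dennis Jaheruddin. This is hand-crafted, not real data. The periodicity is only there to help the reader see which events are involved in the comparisons.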
event1=[125, 1500, 5000, 15000, 22349,25000, 35000, 45000, 55000, 60325, 65000, 75000, 85000, 91117, 95000];
event2=[1750, 7000, 17000, 21562, 27000, 37000, 47000, 57000, 60256, 67000, 77000, 87000, 97000];
event3=[1126, 9000, 19000, 29000, 30130, 39000, 49000, 59000, 69000, 79000, 89000, 91560, 99000, 120000];
event4=[1, 1975, 11000, 21000, 31000, 31159, 41000, 51000, 60112, 61000, 71000, 81000, 91000, 91001, 101000, 130000];
Then:
Answer 0 (score: 0)
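Based on the question and the comments, I think the key is ismemberf, a File Exchange submission.
It lets you check which values of one set occur in another set within a given tolerance. For example: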
% How many elements from event1 are close to an element of event2?
% 3 elements
sum(ismemberf(event1,event2,'tol',1000))
% At how many positions are both event1 and event2 close to an element of event4?
% At exactly 1 position
sum(ismemberf(event1(1:13),event4,'tol',1000)&ismemberf(event2,event4,'tol',1000))
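If installing a File Exchange function is not an option, the same two checks can be sketched with the built-in ismembertol (available from R2015a). This is an alternative to ismemberf, not what the answer itself uses. Passing 'DataScale', 1 makes the tolerance absolute rather than relative to the data magnitude; how well it scales to arrays of five million elements is untested here.

```matlab
% Small-scale example arrays from the question.
event1 = [125, 1500, 5000, 15000, 22349, 25000, 35000, 45000, 55000, 60325, 65000, 75000, 85000, 91117, 95000];
event2 = [1750, 7000, 17000, 21562, 27000, 37000, 47000, 57000, 60256, 67000, 77000, 87000, 97000];
event4 = [1, 1975, 11000, 21000, 31000, 31159, 41000, 51000, 60112, 61000, 71000, 81000, 91000, 91001, 101000, 130000];

tol = 1000;  % absolute tolerance, because DataScale is set to 1 below

% How many elements of event1 are within tol of some element of event2?
count12 = sum(ismembertol(event1, event2, tol, 'DataScale', 1));

% At how many positions are both event1 and event2 within tol of an element of event4?
near14 = ismembertol(event1(1:13), event4, tol, 'DataScale', 1);
near24 = ismembertol(event2, event4, tol, 'DataScale', 1);
countBoth = sum(near14 & near24);
```

On this sample data count12 and countBoth should agree with the 3 and 1 reported in the comments above.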
This is probably not everything you need, but it should be easy to build up from here.
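One possible way to build on this is sketched below: for every element of event1, record which of the other arrays has a value within the tolerance, pack that into a small bit code, and count the codes. It uses the built-in ismembertol in place of ismemberf, and the bit encoding is an assumption made for illustration. It only describes matches as seen from event1 and does not check that the matched values in the other arrays are also within tolerance of each other, so it may not match the exact definition the question is after.

```matlab
% Small-scale example arrays from the question.
event1 = [125, 1500, 5000, 15000, 22349, 25000, 35000, 45000, 55000, 60325, 65000, 75000, 85000, 91117, 95000];
event2 = [1750, 7000, 17000, 21562, 27000, 37000, 47000, 57000, 60256, 67000, 77000, 87000, 97000];
event3 = [1126, 9000, 19000, 29000, 30130, 39000, 49000, 59000, 69000, 79000, 89000, 91560, 99000, 120000];
event4 = [1, 1975, 11000, 21000, 31000, 31159, 41000, 51000, 60112, 61000, 71000, 81000, 91000, 91001, 101000, 130000];

tol = 1000;  % absolute tolerance via 'DataScale', 1

% For each element of event1: is there a value within tol in the other array?
m2 = ismembertol(event1, event2, tol, 'DataScale', 1);
m3 = ismembertol(event1, event3, tol, 'DataScale', 1);
m4 = ismembertol(event1, event4, tol, 'DataScale', 1);

% Pack the three flags into one code per element (0..7):
% bit 1 = match in event2, bit 2 = match in event3, bit 4 = match in event4.
code = m2 + 2*m3 + 4*m4;

% counts(c+1) = number of event1 elements with code c.
counts = accumarray(code(:) + 1, 1, [8 1]);
```

Here counts(1) is the number of event1 values with no partner anywhere, and counts(8) the number with a partner in all three other arrays. Repeating the same pattern from the point of view of event2, event3 and event4 would cover the remaining combinations.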