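Suppose I have a list, [1,3,1,3,1,3,3,2,2,1], and N=3 (the number of distinct numbers) as input. What I want is to find the size of the smallest sublist of this list that contains all N numbers. In this example the size is 4, for either [1,3,3,2] or [3,2,2,1].
What I have done so far, for K=[1,2,3] and List=[1,3,1,3,1,3,3,2,2,1]: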
append(Subl,_,List),
subset(K,Subl),!,
append(_,Subl2,Subl),
length(Subl2,L),
subset(K,Subl2).
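The first append, together with the first subset, finds the first sublist that contains all K elements, in this case [1,3,1,3,1,3,3,2]. After that, we try to shrink this list as much as possible without it ceasing to contain all K distinct numbers. In this case the final result would be [1,3,3,2] and L = 4.
My problem is this: starting from [1,3,1,3,1,3,3,2], L goes from 8 down to 4. How can I update this value (maybe store it somehow?) each time a smaller sublist containing the K numbers is found? And from [1,3,3,2] (I can use \+ subset(K,Subl2), and the result would be [3,3,2]), how should I add the remaining elements of List to [3,3,2] (those elements would be [2,2,1]) and start the whole process again?
PS: I found other solutions in earlier S/O posts, but they enumerate all possible sublists from size 3 to size 10 and check whether each one contains all K elements. I think we have to use a sliding-window approach here?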
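For comparison, the exhaustive approach mentioned in the PS can be sketched as follows. This is only a minimal sketch assuming SWI-Prolog; the name brute_shortest_length/3 is hypothetical, and K is taken to be the list of required numbers, as in the query above. It also shows one possible way to keep only the smallest length, by letting aggregate_all/3 collect the minimum over all candidate sublists instead of updating a stored value by hand.

```prolog
:- use_module(library(lists)).
:- use_module(library(aggregate)).

% brute_shortest_length(+List, +K, -Len)  (hypothetical name)
% Len is the length of the shortest contiguous sublist of List that
% contains every element of the list K. Fails if no sublist qualifies.
% Every contiguous sublist is tried, so this is only a baseline for
% small inputs.
brute_shortest_length(List, K, Len) :-
    aggregate_all(min(L),
                  (   append(_, Tail, List),   % drop any prefix
                      append(Sub, _, Tail),    % take any prefix of the rest
                      subset(K, Sub),          % all wanted numbers present?
                      length(Sub, L)
                  ),
                  Len).
```

It could be called as brute_shortest_length([1,3,1,3,1,3,3,2,2,1], [1,2,3], Len). Its cost grows roughly with the cube of the list length, which is why a sliding window is preferable for long lists.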
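Answer 0 (score: 0)
I think the following code may be useful to you. I have not proved that it is correct, but it seems to work properly (even for very large inputs). In any case, you can use it as a starting point for a correct implementation.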
shortest_length(List, K, ShortestLength) :-
    shortest_sublist(List, K, Sublist),
    length(Sublist, ShortestLength).

shortest_sublist(List, K, Sublist) :-
    split(List, K, Prefix, Suffix),      % find minimum prefix with all K items
    shrink(Prefix, [First|Rest]),        % shrink that prefix to get a sublist
    append(Rest, Suffix, NewList),
    (   shortest_sublist(NewList, K, NewSublist)   % find a new sublist in the rest of the list
    ->  length([First|Rest], Len1),
        length(NewSublist, Len2),
        (   Len1 < Len2
        ->  Sublist = [First|Rest]       % previous sublist is strictly shorter: keep it
        ;   Sublist = NewSublist         % new sublist is at least as short: take it
        )
    ;   Sublist = [First|Rest]           % a new sublist was not found
    ).

% split(+List, +K, -Prefix, -Suffix): Prefix is the shortest prefix of List
% that contains all the items 1..K
split(List, K, Prefix, Suffix) :-
    append(Prefix, Suffix, List),
    has(Prefix, K), !.

% has(+List, +K): every integer in 1..K occurs in List
has(List, K) :-
    forall( between(1, K, Item),
            memberchk(Item, List) ).

% shrink(+List, -ShrunkList): drop leading elements that occur again later
shrink([First|Rest], ShrunkList) :-
    (   memberchk(First, Rest)
    ->  shrink(Rest, ShrunkList)
    ;   ShrunkList = [First|Rest]
    ).
Results for some small inputs:
?- shortest_length([1,3,1,3,1,3,3,2,2,1], 3, N).
N = 4.
?- shortest_sublist([1,3,1,3,1,3,3,2,2,1], 3, S).
S = [3, 2, 2, 1].
?- shortest_sublist([1,3,1,3,1,3,3,2,2,1,3,3], 3, S).
S = [2, 1, 3].
Results for some larger inputs:
?- length(L, 500000), maplist(random(1,5),L), time(shortest_sublist(L, 4, S)).
% 11,153,796 inferences, 1.766 CPU in -712561273706905600.000 seconds (?% CPU, 6317194 Lips)
L = [2, 1, 3, 4, 2, 2, 4, 4, 3|...],
S = [4, 1, 2, 3].
?- length(L, 1000000), maplist(random(1,5),L), time(shortest_sublist(L, 4, S)).
% 22,349,463 inferences, 3.672 CPU in -657663431226163200.000 seconds (?% CPU, 6086662 Lips)
L = [2, 2, 4, 3, 2, 2, 3, 1, 2|...],
S = [2, 1, 4, 3].
?- length(L, 2000000), maplist(random(1,5),L), time(shortest_sublist(L, 4, S)).
% 44,655,878 inferences, 6.844 CPU in 919641833393356800.000 seconds (0% CPU, 6525060 Lips)
L = [4, 1, 3, 3, 4, 3, 3, 3, 2|...],
S = [2, 1, 3, 4].
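For small values of K, the algorithm seems to run in time proportional to the length n of the list, i.e. O(n). Note that when the length of the list doubles, the execution time roughly doubles as well (500000 → ~1.8 s, 1000000 → ~3.7 s, 2000000 → ~6.8 s).
I think the bottleneck is in the predicate has/2. So, for a more efficient implementation (for larger values of K), you need a more efficient strategy for checking list membership.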
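One possible way to avoid rescanning the window for every membership check is to keep a count per item and track how many items are still missing. The sketch below is not the code above but an alternative under stated assumptions: it assumes SWI-Prolog with library(assoc), all predicate names (window_min_length/3, grow/7, win_shrink/6, better/3) are hypothetical, and every element of the list must be an integer in 1..K (otherwise the predicate simply fails). It returns only the length, not the sublist.

```prolog
:- use_module(library(assoc)).
:- use_module(library(lists)).

% window_min_length(+List, +K, -Min)  (hypothetical name)
% Min is the length of the shortest contiguous sublist of List that
% contains every integer in 1..K. Fails if there is no such sublist.
window_min_length(List, K, Min) :-
    K >= 1,
    numlist(1, K, Items),
    findall(I-0, member(I, Items), Pairs),
    list_to_assoc(Pairs, Counts0),          % every item starts with count 0
    grow(List, List, Counts0, K, 0, none, Best),
    Best \== none,
    Min = Best.

% grow(+Right, +Left, +Counts, +Missing, +Len, +Best0, -Best)
% Right: elements not yet added to the window.
% Left:  suffix of the input that starts at the window's first element.
% Missing: how many items of 1..K have count 0 in the window.
grow([], _, _, _, _, Best, Best).
grow([X|Right], Left0, Counts0, Missing0, Len0, Best0, Best) :-
    get_assoc(X, Counts0, C0),
    C1 is C0 + 1,
    put_assoc(X, Counts0, C1, Counts1),
    (   C0 =:= 0 -> Missing1 is Missing0 - 1 ; Missing1 = Missing0 ),
    Len1 is Len0 + 1,
    (   Missing1 =:= 0
    ->  win_shrink(Left0, Counts1, Len1, Left, Counts, Len),
        better(Best0, Len, Best1)
    ;   Left = Left0, Counts = Counts1, Len = Len1, Best1 = Best0
    ),
    grow(Right, Left, Counts, Missing1, Len, Best1, Best).

% win_shrink/6: drop leading elements that occur more than once in the window.
win_shrink([Y|Left0], Counts0, Len0, Left, Counts, Len) :-
    get_assoc(Y, Counts0, C),
    C > 1,
    !,
    C1 is C - 1,
    put_assoc(Y, Counts0, C1, Counts1),
    Len1 is Len0 - 1,
    win_shrink(Left0, Counts1, Len1, Left, Counts, Len).
win_shrink(Left, Counts, Len, Left, Counts, Len).

% better(+Best0, +Len, -Best): keep the smaller length seen so far.
better(none, Len, Len) :- !.
better(Best0, Len, Best) :- Best is min(Best0, Len).
```

Each element enters and leaves the window at most once, and each step costs one lookup and one update in the association list, so the work is on the order of n log K rather than depending on the window size. A call such as window_min_length([1,3,1,3,1,3,3,2,2,1], 3, N) should give N = 4, in line with shortest_length/3 above.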
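Answer 1 (score: 0)
I tried another approach that can handle large lists.
I keep a list of the searched numbers, each paired with the index of its last occurrence in the input list.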
The results I get are about the same as Simvio Lago's (the timings follow the code).
getValue(In, Ind, V) :-
    nth0(Ind, In, V).

% create the list of the numbers with the index of their last appearance
% (-1 if not seen yet)
% the list is sorted in decreasing order of the index of the numbers
make_indice(U, -1 - U).

minSubList(Min, Max, In, Out) :-
    numlist(Min, Max, NL),
    maplist(make_indice, NL, Il),
    % at the beginning, the best length is one more than the length of the
    % input, so that a sublist covering the whole input is still accepted
    length(In, Len),
    Len0 is Len + 1,
    % main predicate of the process
    walk(In, 0, Len0, Il, 0, Len, VMin, VMax),
    % now we get the result
    numlist(VMin, VMax, NL1),
    maplist(getValue(In), NL1, Out).

% if the list is empty the process is finished
walk([], _, _, _IL, Min, Max, Min, Max).

% @arg1 current input to process
% @arg2 index of the head of the input in the initial input
% @arg3 current length of the sublist containing all of the numbers
% @arg4 current list of the numbers with their index in the initial list
% @arg5 current first index where we find all the numbers
% @arg6 current last index where we find all the numbers
% @arg7 final first index where we find all the numbers
% @arg8 final last index where we find all the numbers
walk([H|T], N, Len, Il, CurMin, CurMax, Min, Max) :-
    % we remove the element of the index list concerning H
    select(_-H, Il, IlTemp),
    % we build the new index list; H is the first element of the list
    % because it is the last one seen !
    LstInd = [N-H | IlTemp],
    % we need to know the index of the number seen longest ago,
    % it is the last of the list
    last(LstInd, V - _),
    N1 is N+1,
    (   V = -1
    ->  % at least one number is not seen yet
        % we keep going
        walk(T, N1, Len, LstInd, CurMin, CurMax, Min, Max)
    ;   % all the numbers are seen
        % we must update the length of the sublist
        Len1 is N-V+1,
        (   Len1 < Len
        ->  NewLen = Len1,
            NewMin = V,
            NewMax = N
        ;   NewLen = Len,
            NewMin = CurMin,
            NewMax = CurMax
        ),
        walk(T, N1, NewLen, LstInd, NewMin, NewMax, Min, Max)
    ).
Instead of using nth0 to get the result list, I also tried append/2, with the alternative definition of minSubList given at the end.
?- length(L, 500000), maplist(random(1,5),L), time(minSubList(1,4,L, S)).
% 8,750,508 inferences, 0.922 CPU in 0.922 seconds (100% CPU, 9486338 Lips)
L = [2, 2, 2, 2, 1, 1, 4, 2, 4|...],
S = [3, 2, 1, 4] .
?- length(L, 1000000), maplist(random(1,5),L), time(minSubList(1,4,L, S)).
% 17,502,632 inferences, 3.017 CPU in 3.017 seconds (100% CPU, 5800726 Lips)
L = [4, 3, 3, 4, 1, 1, 4, 4, 3|...],
S = [4, 3, 2, 1] .
?- length(L, 2000000), maplist(random(1,5),L), time(minSubList(1,4,L, S)).
% 34,999,875 inferences, 6.836 CPU in 6.836 seconds (100% CPU, 5119639 Lips)
L = [2, 1, 2, 1, 3, 3, 3, 1, 3|...],
S = [4, 3, 2, 1] .
With append/2 the run takes longer. The modified predicate:
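minSubList(Min, Max, In, Out) :-
    numlist(Min, Max, NL),
    maplist(make_indice, NL, Il),
    length(In, Len),
    % start from Len + 1 so that a sublist covering the whole input is accepted
    Len0 is Len + 1,
    walk(In, 0, Len0, Il, 0, Len, VMin, VMax),
    Len2 is VMax - VMin + 1,
    % W is the skipped prefix, Out is the sublist itself
    length(W, VMin),
    length(Out, Len2),
    append([W, Out, _], In).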