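If I have a string like abcabcabc, then clearly abc is a pattern. I want to find the pattern using C/C++.
I don't want an implementation. Pseudocode or an algorithm is fine.
How should I go about it?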
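Answer 0 (score: 1)
Use Floyd's cycle-finding algorithm. It uses a slow pointer and a fast pointer (the tortoise and the hare) to find a cycle. Python source from Wikipedia: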
def floyd(f, x0):
    # Main phase of algorithm: finding a repetition x_i = x_2i
    # The hare moves twice as quickly as the tortoise and
    # the distance between them increases by 1 at each step.
    # Eventually they will both be inside the cycle and then,
    # at some point, the distance between them will be
    # divisible by the period λ.
    tortoise = f(x0)  # f(x0) is the element/node next to x0.
    hare = f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    # At this point the tortoise position, ν, which is also equal
    # to the distance between hare and tortoise, is divisible by
    # the period λ. So hare moving in circle one step at a time,
    # and tortoise (reset to x0) moving towards the circle, will
    # intersect at the beginning of the circle. Because the
    # distance between them is constant at 2ν, a multiple of λ,
    # they will agree as soon as the tortoise reaches index μ.

    # Find the position μ of first repetition.
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)  # Hare and tortoise move at same speed
        mu += 1

    # Find the length of the shortest cycle starting from x_μ
    # The hare moves one step at a time while tortoise is still.
    # lam is incremented until λ is found.
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1

    return lam, mu
The time complexity of this solution is O(λ + μ), and the auxiliary space is O(1).
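To connect this to the string in the question, here is a minimal sketch of one way to feed a string into tortoise-and-hare. It rests on an assumption that is not part of the answer: each character must have exactly one possible successor, with the last character wrapping around to the first, so that "next character" is a genuine function. The names `successor_map` and `floyd_condensed` are made up for this illustration; `floyd_condensed` is only a compact restatement of the same three phases so the snippet runs on its own.

```python
def successor_map(s):
    """Map each character to the character that follows it.

    Assumption for this sketch: the last character wraps to the first.
    Raises ValueError when a character has two different successors,
    because then the sequence is not generated by a function f.
    """
    nxt = {}
    for cur, following in zip(s, s[1:] + s[:1]):
        if nxt.setdefault(cur, following) != following:
            raise ValueError("character %r has more than one successor" % cur)
    return nxt


def floyd_condensed(f, x0):
    """Return (lam, mu) for the sequence x0, f(x0), f(f(x0)), ..."""
    # Phase 1: the hare moves two steps per tortoise step until they meet.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(f(hare))
    # Phase 2: find mu, the index where the cycle starts.
    mu, tortoise = 0, x0
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(hare)
        mu += 1
    # Phase 3: find lam, the cycle length.
    lam, hare = 1, f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return lam, mu


s = "abcabcabc"
nxt = successor_map(s)
lam, mu = floyd_condensed(nxt.__getitem__, s[0])
pattern = s[mu:mu + lam]
print(lam, mu, pattern)
```

A string such as "aab" is rejected by `successor_map`, since "a" is followed by both "a" and "b"; for inputs like that, a period-based approach is a better fit than cycle detection on characters.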
Answer 1 (score: 0)
Try looking up: http://en.wikipedia.org/wiki/Cycle_detection Don't think of it as a string; instead, look for a period. Whether or not it is a string doesn't matter.
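As a rough illustration of "look for a period", the brute-force sketch below tries each candidate period p in increasing order and returns the first one for which every element equals the element p positions later. The function name `smallest_period` is invented here, and this is a deliberately simple version that takes quadratic time in the worst case, not an optimized method.

```python
def smallest_period(seq):
    """Return the smallest p >= 1 with seq[i] == seq[i + p] for all valid i.

    Works on any indexable sequence (string, list, tuple).
    Returns 0 for an empty sequence.
    """
    n = len(seq)
    for p in range(1, n + 1):
        # p is a period if shifting the sequence by p keeps it aligned.
        if all(seq[i] == seq[i + p] for i in range(n - p)):
            return p
    return 0  # only reached when the sequence is empty


print(smallest_period("abcabcabc"))
print(smallest_period([1, 2, 1, 2]))
```

Note that the period found this way need not divide the length: for "abcab" it is 3, so a caller who wants a whole number of repetitions still has to check `len(seq) % p == 0`.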
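Answer 2 (score: 0)
One way to find a pattern is to use the precomputation step of Knuth-Morris-Pratt's algorithm, which runs in O(P.length) time, where P is the given string. It computes the lookup table 'PI', which stores, for each prefix of P ("a", "ab", "abc", ...), the length of the longest proper prefix of P that is also a suffix of that prefix.
The pseudocode is taken from Introduction to Algorithms (CLRS). Also, Linux has a nice implementation of the above algorithm.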
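Hence, P.length - PI[P.length] = k = the length of the smallest repeating pattern. Remember that for a non-empty string k always lies in the range [1, P.length], because PI[P.length] is at most P.length - 1.
For example, for "abcabcabc", PI = [0, 0, 0, 1, 2, 3, 4, 5, 6]. Here, the length of the smallest repeating pattern is 9 - 6 = 3. But does k divide the string evenly?
So, if P.length mod k == 0, then P[1..k] is your repeating pattern. Otherwise, the string is not a whole number of repetitions of any shorter pattern.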
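The steps above can be sketched in Python as follows. This is a sketch following the standard prefix-function computation, not a copy of the CLRS pseudocode or the Linux implementation; it uses 0-based indexing, so PI[P.length] becomes `pi[-1]` and P[1..k] becomes `p[:k]`. The function names are chosen for this example, and returning the whole string when k does not divide the length is a design choice made here.

```python
def prefix_function(p):
    """Return pi, where pi[q] is the length of the longest proper prefix
    of p that is also a suffix of p[:q + 1]."""
    n = len(p)
    pi = [0] * n
    k = 0  # length of the currently matched prefix
    for q in range(1, n):
        # Fall back to shorter borders until the next character matches.
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


def smallest_repeating_pattern(p):
    """Return the shortest block that, repeated a whole number of times,
    gives p. If no shorter block exists, p itself is returned."""
    n = len(p)
    if n == 0:
        return p
    k = n - prefix_function(p)[-1]  # candidate pattern length
    return p[:k] if n % k == 0 else p


print(prefix_function("abcabcabc"))
print(smallest_repeating_pattern("abcabcabc"))
```

For "abcab" the candidate length is 5 - 2 = 3, which does not divide 5, so the function returns "abcab" unchanged.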