How do I merge overlapping strings with unicode?

Asked: 2018-10-08 22:24:10

Tags: python overlap

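I am an MCS student learning Python and I am stuck on a problem. I am trying to merge all overlapping strings.

I am using the following algorithm, but the output is not what I expect:

(1) Find the maximum overlap between every possible pair of strings.
(2) Store all overlaps in a dictionary, with the overlap amount as the key and (start, string a, string b) as the value.
(3) Pick the largest overlap and merge those two strings.

I have implemented the algorithm with the code below, but it does not produce the expected output.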

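from collections import defaultdict
from random import choice
import urllib.parse


def overlap(a, b):
    """Return (i, j) for the longest prefix b[:i + 1] that is a suffix of a.

    j is the index in a where that suffix starts; (0, -1) means no overlap.
    """
    overlaps = []

    for i in range(len(b)):
        for j in range(len(a)):
            # True when a[j:] ends with the first i + 1 characters of b
            if a.endswith(b[:i + 1], j):
                overlaps.append((i, j))

    return max(overlaps) if overlaps else (0, -1)


def get_merged_string(lst):
    overlaps = defaultdict(list)
    while len(lst) > 1:
        overlaps.clear()

        # collect the best overlap for every ordered pair (a, b)
        for a in lst:
            for b in lst:
                if a == b:
                    continue

                amount, start = overlap(a, b)
                overlaps[amount].append((start, a, b))

        maximum = max(overlaps)

        # amount 0 covers both "no overlap" and a one-character overlap
        if maximum == 0:
            break

        start, a, b = choice(overlaps[maximum])  # pick one among equals

        lst.remove(a)
        lst.remove(b)
        lst.append(a[:start] + b)
    str1 = ''.join(lst)
    # the merged text is URL-decoded twice here
    return urllib.parse.unquote_plus(urllib.parse.unquote_plus(str1))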
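One detail in the last line can be checked in isolation: the merged text is passed through `unquote_plus` twice. The first pass turns `%2B` into `+`, and the second pass then treats that `+` as an encoded space. A minimal sketch using one fragment copied from the input (the variable names are only for illustration):

```python
from urllib.parse import unquote_plus

# One fragment copied verbatim from the input list.
fragment = "+%28n+%2B+1%29%29%0A%0Aprli"

once = unquote_plus(fragment)   # '+' -> ' ', '%2B' -> '+'
twice = unquote_plus(once)      # the decoded '+' becomes ' ' as well

print(repr(once))   # ' (n + 1))\n\nprli'
print(repr(twice))  # ' (n   1))\n\nprli'
```

This would account for a literal `+` from the source file showing up as a space after decoding, independently of how the fragments are merged.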

Input:

%23%21%2Fusr%2Fbin%2Fpyth
n%2Fpython3%0A%0A%23%0A%23+
%0A%0A%23%0A%23+Python+fu
+Python+functio
unctions+start+
+start+with+def
th+def.++They+t
hey+take+parame
parameters%2C+whi
+which+are%0A%23+un
are%0A%23+un-typed%2C
n-typed%2C+as+oth
+as+other+varia
her+variables.%0A
es.%0A%0A%23+The+stri
string+at+the+s
the+start+of+th
rt+of+the+funct
function+is+for
n+is+for+docume
documentation.%0A
tation.%0Adef+prh
f+prhello%28%29%3A%0A++
%28%29%3A%0A++++%22Print+
+%22Print+hello%22%0A
hello%22%0A++++prin
+print%28%22Hello%2C+
llo%2C+World%21%22%29%0A%0A
World%21%22%29%0A%0Aprhel
%29%0A%0Aprhello%28%29%0A%0A%23
%28%29%0A%0A%23%0A%23%0Adef+prl
f+prlines%28str%2C+
ines%28str%2C+num%29%3A
num%29%3A%0A++++%22Prin
++++%22Print+num+
nt+num+lines+co
ines+consisting
onsisting+of+st
ing+of+str%2C+rep
+str%2C+repeating
epeating+str+on
r+once+more+on+
+on+each+line.%22
ine.%22%0A++++for+n
+for+n+in+range
in+range%280%2Cnum%29
num%29%3A%0A++++++++p
++++print%28str+%2A
%28str+%2A+%28n+%2B+1%29%29
+%28n+%2B+1%29%29%0A%0Aprli
+1%29%29%0A%0Aprlines%28%27
rlines%28%27z%27%2C+5%29%0A
%2C+5%29%0Aprint%28%29%0Apr
print%28%29%0Aprlines
rlines%28%27fred+%27%2C
red+%27%2C+4%29%0A
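It may help to decode each fragment before looking for overlaps, so that an overlap is measured in characters of the original text rather than in percent-escapes. The sketch below decodes the first three fragments; it assumes the fragments were cut from the plain text and URL-encoded afterwards, which the question does not state explicitly.

```python
from urllib.parse import unquote_plus

# First three fragments, copied verbatim from the input list.
encoded = [
    "%23%21%2Fusr%2Fbin%2Fpyth",
    "n%2Fpython3%0A%0A%23%0A%23+",
    "%0A%0A%23%0A%23+Python+fu",
]

# Decode once, before any overlap is computed.
decoded = [unquote_plus(s) for s in encoded]

for s in decoded:
    print(len(s), repr(s))
# 15 '#!/usr/bin/pyth'
# 15 'n/python3\n\n#\n# '
# 15 '\n\n#\n# Python fu'
```

These three decode to 15 characters each, which is consistent with the assumption above. The first two then share the six characters `n/pyth`.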

My output:

hello()

#
#
def prlines(str, num):
    "Print hello"
    print("Hello, World!")

prhellhe string at the start of the functions start with def.  They take parameters, which are
# un-typed, as other variables.

# The s#!/usr/bin/python3

#
# Python function is for documentation.
def prhello():
    "Print num lines consisting of str, repeating str once more on each line."
    for n in range(0,num):
        print(str * (n   1))

prlines('z', 5)
print()
prlines('fred ', 4)

Expected output (after merging the overlapping strings):

#!/usr/bin/python3

#
# Python functions start with def.  They take parameters, which are
# un-typed, as other variables.

# The string at the start of the function is for documentation.
def prhello():
    "Print hello"
    print("Hello, World!")

prhello()

#
#
def prlines(str, num):
    "Print num lines consisting of str, repeating str once more on each line."
    for n in range(0,num):
        print(str * (n + 1))

prlines('z', 5)
print()
prlines('fred ', 4)
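One of the wrong merges can be reproduced with three fragments from the input. In the expected output, `def prhello():` is followed by `"Print hello"`, but the fragment ending that `def` line shares a longer overlap with the `"Print num ..."` fragment from `prlines`. The helper `overlap_len` below is a hypothetical name used only for this sketch; it is not the `overlap` function from the question.

```python
from urllib.parse import unquote_plus


def overlap_len(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# Three fragments copied from the input, decoded once.
a = unquote_plus("%28%29%3A%0A++++%22Print+")   # '():\n    "Print '
right = unquote_plus("+%22Print+hello%22%0A")    # ' "Print hello"\n'
wrong = unquote_plus("++++%22Print+num+")        # '    "Print num '

print(overlap_len(a, right))  # 8
print(overlap_len(a, wrong))  # 11
```

So picking the largest overlap first joins `def prhello():` to the docstring of `prlines`, which matches the swapped docstrings in the output above. This happens even without ties, so the random `choice` is not the only source of trouble.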

The problem above is caused by ambiguity in the overlaps. How can I resolve it?
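One possible direction, offered as a sketch and not as a verified solution for the full input: instead of committing to the largest overlap, search for an ordering that uses every fragment exactly once, and backtrack when a choice leads to a dead end. The names `overlap_len`, `assemble` and `min_overlap` are hypothetical, and the search is exponential in the worst case. The toy fragments are made up so that the greedy choice is wrong.

```python
def overlap_len(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments, min_overlap=1):
    """Chain every fragment exactly once; return the text, or None."""
    n = len(fragments)
    # succ[i]: (overlap, j) for fragments j that may follow i, best first
    succ = []
    for i, a in enumerate(fragments):
        cands = [(overlap_len(a, b), j)
                 for j, b in enumerate(fragments) if j != i]
        succ.append(sorted((c for c in cands if c[0] >= min_overlap),
                           reverse=True))

    def extend(last, used, text):
        if len(used) == n:
            return text
        for k, j in succ[last]:
            if j not in used:
                used.add(j)
                result = extend(j, used, text + fragments[j][k:])
                if result is not None:
                    return result
                used.remove(j)  # dead end: undo and try the next candidate
        return None

    for start in range(n):
        result = extend(start, {start}, fragments[start])
        if result is not None:
            return result
    return None


# Toy data: "1abc" overlaps "abce3" by 3 but must be followed by "bcd2a".
print(assemble(["abce3", "1abc", "bcd2a"]))  # 1abcd2abce3
```

For the input in the question, each fragment would first be decoded once with `urllib.parse.unquote_plus` and the decoded list passed to `assemble`. Whether that reconstructs the expected file has not been checked here.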

0 answers:

No answers.