I don't know whether this is the right place to ask this question; if it isn't, I apologize.
I have a sequence ALPHA, for example:
A B D Z A B X
I am given a list of (contiguous) subsequences of ALPHA, for example:
A B D
B D
A B
D Z
A
B
D
Z
X
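I am looking for an algorithm that finds the shortest list of disjoint subsequences that, concatenated in order, reconstructs ALPHA. In our example: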
{A B D} {Z} {A B} {X}
Any ideas? My guess is that such an algorithm already exists.
Answer 0 (score: 1)
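You can transform this problem into finding a shortest path in a graph.
The nodes correspond to the prefixes of the string, including the empty prefix. There is an edge from node A to node B if there is an allowed subsequence that, when appended to prefix A, yields prefix B.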
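The problem now becomes finding a shortest path in this graph that starts at the node for the empty string and ends at the node for the entire input string.
You can apply, for example, BFS (since all edges have the same cost) or Dijkstra's algorithm to find this path.
The following Python code is an implementation based on these principles: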
def reconstruct(seq, subseqs):
    n = len(seq)
    # set of allowed pieces, for fast membership tests
    d = set(subseqs)
    # in this solution, the node with value v corresponds
    # to the prefix seq[0:v]. Thus node 0 corresponds to the empty string
    # and node n corresponds to the entire string
    # this keeps track of the predecessor of each node
    predecessors = [-1] * (n + 1)
    reached = [False] * (n + 1)
    reached[0] = True
    # initialize the queue and add the first node
    # (the node corresponding to the empty string)
    q = []
    qstart = 0
    q.append(0)
    while True:
        # test if we already found a solution
        if reached[n]:
            break
        # test if the queue is empty (qstart == len(q) means nothing is left)
        if qstart >= len(q):
            break
        # poll the first value from the queue
        v = q[qstart]
        qstart += 1
        # try appending a piece of length n2 to the current prefix
        for n2 in range(1, n - v + 1):
            # the destination node was already added to the queue
            if reached[v + n2]:
                continue
            if seq[v:v + n2] in d:
                q.append(v + n2)
                predecessors[v + n2] = v
                reached[v + n2] = True
    if not reached[n]:
        # the string cannot be built from the given pieces
        return []
    # reconstruct the path, starting from the last node
    pos = n
    solution = []
    while pos > 0:
        solution.append(seq[predecessors[pos]:pos])
        pos = predecessors[pos]
    solution.reverse()
    return solution

print(reconstruct("ABDZABX", ["ABD", "BD", "AB", "DZ", "A", "B", "D", "Z", "X"]))
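I don't have much experience with Python, which is the main reason I preferred to stick to the basics (for example, implementing the queue as a list plus an index to its start).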
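For readers who prefer the standard library, the list-plus-index queue can be swapped for `collections.deque`. The following is only a sketch of the same BFS under that assumption; the function name `reconstruct_deque` is hypothetical and not part of the answer's code.

```python
from collections import deque


def reconstruct_deque(seq, pieces):
    """BFS over prefix lengths; returns a shortest list of pieces, or []."""
    allowed = set(pieces)
    n = len(seq)
    predecessors = [-1] * (n + 1)
    reached = [False] * (n + 1)
    reached[0] = True
    queue = deque([0])  # start from the empty prefix

    while queue and not reached[n]:
        v = queue.popleft()
        for end in range(v + 1, n + 1):
            if not reached[end] and seq[v:end] in allowed:
                reached[end] = True
                predecessors[end] = v
                queue.append(end)

    if not reached[n]:
        return []  # seq cannot be built from the given pieces

    # walk the predecessor links back from the full string
    solution = []
    pos = n
    while pos > 0:
        solution.append(seq[predecessors[pos]:pos])
        pos = predecessors[pos]
    solution.reverse()
    return solution


result = reconstruct_deque("ABDZABX", ["ABD", "BD", "AB", "DZ", "A", "B", "D", "Z", "X"])
print(result)
```

On the example from the question this yields four pieces. The split may differ from the one shown in the question (BFS returns whichever shortest path it reaches first), but the number of pieces is the same.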