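I have implemented a simple graph data structure in Python, structured as shown below. The code is included only to clarify what the functions and variables mean; they are fairly self-explanatory, so feel free to skip it.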
# Node data structure
class Node:
    def __init__(self, label):
        self.out_edges = []
        self.label = label
        self.is_goal = False

    def add_edge(self, node, weight = 0):
        self.out_edges.append(Edge(node, weight))

# Edge data structure
class Edge:
    def __init__(self, node, weight = 0):
        self.node = node
        self.weight = weight

    def to(self):
        return self.node

# Graph data structure, utilises classes Node and Edge
class Graph:
    def __init__(self):
        self.nodes = []

    # some other functions here populate the graph, and randomly select three goal nodes.
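Now I am trying to implement uniform-cost search (i.e. a BFS with a priority queue, which guarantees the shortest path) starting from a given node v, returning the shortest path (as a list) to one of the three goal nodes. By a goal node, I mean a node whose is_goal attribute is set to True.
This is my implementation: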
import queue

def ucs(G, v):
    visited = set()                  # set of visited nodes
    visited.add(v)                   # mark the starting vertex as visited
    q = queue.PriorityQueue()        # we store vertices in the (priority) queue as tuples with cumulative cost
    q.put((0, v))                    # add the starting node, this has zero *cumulative* cost
    goal_node = None                 # this will be set as the goal node if one is found
    parents = {v:None}               # this dictionary contains the parent of each node, necessary for path construction
    while not q.empty():             # while the queue is nonempty
        dequeued_item = q.get()
        current_node = dequeued_item[1]             # get node at top of queue
        current_node_priority = dequeued_item[0]    # get the cumulative priority for later
        if current_node.is_goal:                    # if the current node is the goal
            path_to_goal = [current_node]           # the path to the goal ends with the current node (obviously)
            prev_node = current_node                # set the previous node to be the current node (this will change with each iteration)
            while prev_node != v:                   # go back up the path using parents, and add to path
                parent = parents[prev_node]
                path_to_goal.append(parent)
                prev_node = parent
            path_to_goal.reverse()                  # reverse the path
            return path_to_goal                     # return it
        else:
            for edge in current_node.out_edges:     # otherwise, for each adjacent node
                child = edge.to()                   # (avoid calling .to() in future)
                if child not in visited:            # if it is not visited
                    visited.add(child)              # mark it as visited
                    parents[child] = current_node   # set the current node as the parent of child
                    q.put((current_node_priority + edge.weight, child)) # and enqueue it with *cumulative* priority
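Now, after a lot of testing and comparison against other algorithms, this implementation seemed to work quite well, until I tried it on this graph:
For whatever reason, ucs(G, v) returned the path H -> I, which costs 0.87, instead of the path H -> F -> I, which costs 0.71 (this path was obtained by running a DFS). The following graph also gave an incorrect path:
The algorithm returned G -> F instead of G -> E -> F, which was again obtained with the DFS. The only pattern I can observe among these rare cases is that the chosen goal node always has a loop. I cannot figure out what is going wrong. Any hints would be much appreciated.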
Answer 0 (score: 1)
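For searches, I usually prefer to keep the path to a node as part of its queue entry. This is not very memory-efficient, but it is cheaper to implement.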
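As a rough illustration of the path-in-the-queue idea, here is a minimal sketch rather than a drop-in replacement. It reuses the question's `Node` and `Edge` shapes; the function name `ucs_with_paths`, the `itertools.count()` tie-breaker, and the individual edge weights are my own assumptions (the question only states the totals 0.87 and 0.71).

```python
import heapq
import itertools


class Edge:
    def __init__(self, node, weight=0):
        self.node = node
        self.weight = weight


class Node:
    def __init__(self, label):
        self.label = label
        self.out_edges = []
        self.is_goal = False

    def add_edge(self, node, weight=0):
        self.out_edges.append(Edge(node, weight))


def ucs_with_paths(start):
    """Return (cost, path) for the cheapest reachable goal, or None."""
    tie = itertools.count()  # tie-breaker so equal costs never compare paths
    frontier = [(0, next(tie), [start])]  # each entry carries its whole path
    expanded = set()
    while frontier:
        cost, _, path = heapq.heappop(frontier)
        node = path[-1]
        if node in expanded:
            continue  # a cheaper entry for this node was already handled
        expanded.add(node)  # marked only when dequeued, not when enqueued
        if node.is_goal:
            return cost, path
        for edge in node.out_edges:
            if edge.node not in expanded:
                new_path = path + [edge.node]
                heapq.heappush(frontier, (cost + edge.weight, next(tie), new_path))
    return None


# Hypothetical weights: only the totals (0.87 direct, 0.71 via F) come
# from the question, the split 0.40 + 0.31 is assumed for illustration.
h, f, i = Node("H"), Node("F"), Node("I")
i.is_goal = True
h.add_edge(i, 0.87)
h.add_edge(f, 0.40)
f.add_edge(i, 0.31)

cost, path = ucs_with_paths(h)
print(round(cost, 2), [n.label for n in path])
```

Because a node may sit in the queue several times with different costs, nothing is fixed until an entry is dequeued, which is what lets the cheaper H -> F -> I entry win over the direct edge.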
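If you want to keep the parent map, remember that it is only safe to update it when the child is at the top of the queue. Only then has the algorithm determined the shortest path to the current node.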
Note: I have not tested this yet; feel free to comment if it does not work right away.
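The parent-map advice above could look roughly like the following sketch. It assumes each queue entry also carries the node it was reached from, so the parent is recorded only when the entry is dequeued. The name `ucs_parent_map`, the tie-breaking counter, and all edge weights in the example are hypothetical; the question gives no costs for the G/E/F graph.

```python
import heapq
import itertools


class Edge:
    def __init__(self, node, weight=0):
        self.node = node
        self.weight = weight


class Node:
    def __init__(self, label):
        self.label = label
        self.out_edges = []
        self.is_goal = False

    def add_edge(self, node, weight=0):
        self.out_edges.append(Edge(node, weight))


def ucs_parent_map(start):
    """Return the cheapest path (list of nodes) to a goal, or [] if none."""
    tie = itertools.count()  # avoids comparing Node objects on equal costs
    # Entries are (cumulative cost, tie-breaker, node, node it was reached from).
    frontier = [(0, next(tie), start, None)]
    parents = {}  # filled in only when a node is dequeued
    while frontier:
        cost, _, node, parent = heapq.heappop(frontier)
        if node in parents:
            continue  # already finalized through a cheaper entry
        parents[node] = parent  # safe: this is the cheapest way to reach node
        if node.is_goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = parents[node]
            path.reverse()
            return path
        for edge in node.out_edges:
            if edge.node not in parents:
                heapq.heappush(frontier, (cost + edge.weight, next(tie), edge.node, node))
    return []


# Hypothetical weights, chosen so the direct edge G -> F is more expensive
# than the detour through E; F -> G closes a cycle through the goal.
g, e, f = Node("G"), Node("E"), Node("F")
f.is_goal = True
g.add_edge(f, 1.0)
g.add_edge(e, 0.3)
e.add_edge(f, 0.4)
f.add_edge(g, 0.5)

print([n.label for n in ucs_parent_map(g)])
```

The difference from the question's code is timing: there, `visited` and `parents` are set when a child is first enqueued, so a later and cheaper route to the same child is silently discarded.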
Answer 1 (score: 0)
A simple check before expanding a node can save you the duplicate visits.
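One way to read this is sketched below, under my own assumptions: the graph is a plain dict of dicts instead of the question's classes, the function name `cheapest_cost` is made up, and the weights are illustrative. Duplicate queue entries are allowed, and the check happens when an entry is dequeued, just before its neighbours are pushed.

```python
import heapq


def cheapest_cost(graph, start, goals):
    """graph maps node -> {neighbour: weight}; returns the cheapest cost or None."""
    frontier = [(0, start)]
    expanded = set()
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node in expanded:
            continue  # the check: this node was already expanded more cheaply
        expanded.add(node)
        if node in goals:
            return cost
        for neighbour, weight in graph.get(node, {}).items():
            heapq.heappush(frontier, (cost + weight, neighbour))
    return None


# Illustrative weights only; the goal "I" has a self-loop and "F" links back to "H".
graph = {
    "H": {"I": 0.87, "F": 0.40},
    "F": {"I": 0.31, "H": 0.40},
    "I": {"I": 0.10},
}
print(cheapest_cost(graph, "H", {"I"}))
```

Without the check, the cycle between H and F would keep re-expanding the same nodes; with it, each node is expanded at most once.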