I have a data file that stores categorical data in columns, like this:
node_id,second_major,gender,major_index,year,dorm,high_school,student_fac
0,0,2,257,2007,111,2849,1
1,0,2,271,2005,0,51195,2
2,0,2,269,2007,0,21462,1
3,269,1,245,2008,111,2597,1
..........................
This data is in columns. I converted it into an edge list and a node list. The edge list looks like:
0 4191
0 949
1 3002
1 4028
1 957
2 2494
2 959
2 3011
3 4243
4 965
5 1478
........
........
What exactly do I need to do to find the shortest path between nodes? The edges have no weights. How can I implement this in Python?
Answer 0 (score: 2)
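This is a classic breadth-first search problem: you have an undirected, unweighted graph, and you want to find the shortest path between two vertices.
Some useful links on breadth-first search:
Some edge cases you need to watch out for:
I assume your edge list is stored as an adjacency list, either a list of lists or a dict of lists, e.g.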
[[4191, 949], [3002, 4028, 957], [2494, 959, 3011], [4243], [965], [1478], ...]
or
{ 0: [4191, 949],
  1: [3002, 4028, 957],
  2: [2494, 959, 3011],
  3: [4243],
  4: [965],
  5: [1478], ...}
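One way to get from the two-column edge list in the question to the dict form is sketched below. It assumes each line holds two whitespace-separated integer node ids and that the graph is undirected, so every edge is recorded in both directions. The function name `build_adjacency` and the sample lines are made up for illustration.

```python
from collections import defaultdict

def build_adjacency(lines):
    """Build a dict-of-lists adjacency list from 'u v' edge lines (hypothetical helper)."""
    adj = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        adj[u].append(v)
        adj[v].append(u)  # undirected: store the reverse direction too
    return dict(adj)

# A few edges copied from the question's edge list
sample = ["0 4191", "0 949", "1 3002", "1 4028", "1 957"]
adj = build_adjacency(sample)
```

An open file object can be passed in place of the list, since iterating over a file yields its lines.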
I have written some code to show how breadth-first search works:
# Python 3 (the standard-library queue module was named Queue in Python 2)
import sys
import queue

def get_shortest_path(par, src, dest):
    '''
    Returns the shortest path as a list of integers
    par - parent information
    src - source vertex
    dest - destination vertex
    '''
    if dest == src:
        return [src]
    else:
        # build the path up to dest's parent, then append dest
        ret = get_shortest_path(par, src, par[dest])
        ret.append(dest)
        return ret

def bfs(edgeList, src, dest):
    '''
    Breadth first search routine. Returns (distance, shortestPath) pair from src to dest.
    Returns (-1, []) if there is no path from src to dest
    edgeList - adjacency list of graph. Either list of lists or dict of lists
    src - source vertex
    dest - destination vertex
    '''
    vis = set()    # stores the vertices that have been visited
    par = {}       # stores parent information. vertex -> parent vertex in BFS tree
    distDict = {}  # number of edges between the source vertex and each visited vertex
    q = queue.Queue()
    q.put((src, 0))  # enqueue (source, distance) pair
    par[src] = -1    # source has no parent
    vis.add(src)     # mark the source as visited so it is never enqueued again
    while not q.empty():
        (v, dist) = q.get()  # grab vertex in queue
        distDict[v] = dist   # update the distance
        if v == dest:
            break  # reached destination, done
        nextDist = dist + 1
        for nextV in edgeList[v]:
            # visit vertices adjacent to the current vertex
            if nextV not in vis:
                # not yet visited
                par[nextV] = v              # update parent of nextV to v
                q.put((nextV, nextDist))    # add into queue
                vis.add(nextV)              # mark as visited
    # obtained shortest path now
    if dest in distDict:
        return (distDict[dest], get_shortest_path(par, src, dest))
    else:
        return (-1, [])  # no shortest path

# example run, feel free to remove this
if __name__ == '__main__':
    edgeList = {
        0: [6],
        1: [2, 7],
        2: [1, 3, 6],
        3: [2, 4, 5],
        4: [3, 8],
        5: [3, 7],
        6: [0, 2],
        7: [1, 5],
        8: [4],
    }
    while True:
        line = sys.stdin.readline()
        if not line.strip():
            break  # stop on end of input or a blank line
        src = int(line)
        dest = int(sys.stdin.readline())
        (dist, shortest_path) = bfs(edgeList, src, dest)
        print('dist =', dist)
        print('shortest_path =', shortest_path)
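For comparison, a more compact variant of the same idea is sketched below. It is only one possible alternative, not a replacement for the code above: it uses `collections.deque` as the queue, lets the parent dict double as the visited set, and rebuilds the path with a loop instead of recursion, which sidesteps Python's recursion limit on very long paths. The name `shortest_path` is made up here, the graph is assumed to be a dict of lists, and the sample graph is the one from the example run above.

```python
from collections import deque

def shortest_path(adj, src, dest):
    """Return the vertices on a shortest path from src to dest, or [] if unreachable."""
    parent = {src: None}  # vertex -> parent in the BFS tree; also the visited set
    q = deque([src])
    while q:
        v = q.popleft()
        if v == dest:
            break  # reached destination, done
        for nxt in adj.get(v, []):  # vertices missing from the dict have no neighbours
            if nxt not in parent:
                parent[nxt] = v
                q.append(nxt)
    if dest not in parent:
        return []  # no path from src to dest
    # walk back from dest to src through the parent links
    path = []
    node = dest
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path

adj = {0: [6], 1: [2, 7], 2: [1, 3, 6], 3: [2, 4, 5], 4: [3, 8],
       5: [3, 7], 6: [0, 2], 7: [1, 5], 8: [4]}
path = shortest_path(adj, 0, 8)
```

The distance in edges is `len(path) - 1` whenever the returned path is non-empty.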