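I have a directed network model made up of a set of nodes connected by links, which grows with each model iteration. To find the "average shortest path" in the final model iteration, I implemented Dijkstra's algorithm, which calculates the shortest path from all nodes to all nodes. More specifically, the algorithm calculates the shortest path from each of the network's 3,000 nodes to all other 3,000 nodes (if a path exists), roughly 9,000,000 path lengths, and then finds the average path length. When I try this, I run out of memory. I can get the average path length up to about 500 nodes, where roughly 250,000 path lengths are calculated in under 12 hours. My question: is there any way to improve the code to make it more efficient? Or is it simply not feasible to calculate that many paths?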
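Code below... the algorithm itself is adapted from Vogella http://www.vogella.com/tutorials/JavaAlgorithmsDijkstra/article.html
The nodes in the network represent trees, and the edges or links represent nets.
Dijkstra's algorithm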
package network;
imports...
public class DijkstraAlgorithm {
    private Context<Object> context;
    private Geography<Object> geography;
    private int id;
    List<Tree> vertices = Tree.getVertices();
    List<Nets> edges = Nets.getEdges();
    private Set<Tree> settledNodes;
    private Set<Tree> unSettledNodes;
    private Map<Tree, Tree> predecessors;
    private Map<Tree, Integer> distance;

    public DijkstraAlgorithm(Graph graph) {
        this.context = context;
        this.geography = geography;
        this.id = id;
        this.vertices = vertices;
        this.edges = edges;
    }

    setters and getters....

    public void execute(Tree source) {
        settledNodes = new HashSet<Tree>();
        unSettledNodes = new HashSet<Tree>();
        distance = new HashMap<Tree, Integer>();
        predecessors = new HashMap<Tree, Tree>();
        distance.put(source, 0);
        unSettledNodes.add(source);
        while (unSettledNodes.size() > 0) {
            Tree node = getMinimum(unSettledNodes);
            settledNodes.add(node);
            unSettledNodes.remove(node);
            findMinimalDistances(node);
        }
    }

    private void findMinimalDistances(Tree node) {
        List<Tree> adjacentNodes = getNeighbors(node);
        for (Tree target : adjacentNodes) {
            if (getShortestDistance(target) > getShortestDistance(node) + getDistance(node, target)) {
                distance.put(target, getShortestDistance(node) + getDistance(node, target));
                predecessors.put(target, node);
                unSettledNodes.add(target);
            }
        }
    }

    private int getDistance(Tree node, Tree target) {
        for (Nets edge : edges) {
            if (edge.getStartTrees().equals(node) && edge.getEndTrees().equals(target)) {
                return edge.getId();
            }
        }
        throw new RuntimeException("Should not happen");
    }

    private List<Tree> getNeighbors(Tree node) {
        List<Tree> neighbors = new ArrayList<Tree>();
        for (Nets edge : edges) {
            if (edge.getStartTrees().equals(node) && !isSettled(edge.getEndTrees())) {
                neighbors.add(edge.getEndTrees());
            }
        }
        return neighbors;
    }

    private Tree getMinimum(Set<Tree> vertexes) {
        Tree minimum = null;
        for (Tree vertex : vertexes) {
            if (minimum == null) {
                minimum = vertex;
            } else {
                if (getShortestDistance(vertex) < getShortestDistance(minimum)) {
                    minimum = vertex;
                }
            }
        }
        return minimum;
    }

    private boolean isSettled(Tree vertex) {
        return settledNodes.contains(vertex);
    }

    private int getShortestDistance(Tree destination) {
        Integer d = distance.get(destination);
        if (d == null) {
            return Integer.MAX_VALUE;
        } else {
            return d;
        }
    }

    public LinkedList<Tree> getPath(Tree target) {
        LinkedList<Tree> path = new LinkedList<Tree>();
        Tree step = target;
        if (predecessors.get(step) == null) {
            return null;
        }
        path.add(step);
        while (predecessors.get(step) != null) {
            step = predecessors.get(step);
            path.add(step);
        }
        Collections.reverse(path);
        return path;
    }
}
Graph
package network;
imports...
public class Graph {
    private Context<Object> context;
    private Geography<Object> geography;
    private int id;
    List<Tree> vertices = new ArrayList<>();
    List<Nets> edges = new ArrayList<>();
    List<Integer> intermediateNodes = new ArrayList<>();

    public Graph(Context context, Geography geography, int id, List vertices, List edges) {
        this.context = context;
        this.geography = geography;
        this.id = id;
        this.vertices = vertices;
        this.edges = edges;
    }

    setters... getters...

    //updates graph
    @ScheduledMethod(start =1, interval =1, priority =1)
    public void countNodesAndVertices() {
        this.setVertices(Tree.getVertices());
        this.setEdges(Nets.getEdges());
    }

    //run Dijkstra at the 400th iteration of network development
    @ScheduledMethod(start =400, priority =1)
    public void Dijkstra(){
        Graph graph2 = new Graph (context, geography, id, vertices, edges);
        graph2.setEdges(this.getEdges());
        graph2.setVertices(this.getVertices());
        for(Tree t: graph2.getVertices()){
        }
        DijkstraAlgorithm dijkstra = new DijkstraAlgorithm(graph2);
        //create list of pathlengths (links, not nodes)
        List<Double> pathlengths = new ArrayList<>();
        //go through all nodes as starting nodes
        for (int i = 0; i<vertices.size();i++){
            //find the shortest path to all nodes as end nodes
            for (int j = 0; j<vertices.size();j++){
                if(i != j){
                    Tree startTree = vertices.get(i);
                    Tree endTree = vertices.get(j);
                    dijkstra.execute(vertices.get(i));
                    //create a list that contains the path of nodes
                    LinkedList<Tree> path = dijkstra.getPath(vertices.get(j));
                    //if the path is not null and greater than 0
                    if (path != null && path.size()>0){
                        //calculate the pathlength (-1, which is the size of the path length of links)
                        double listsize = path.size()-1;
                        //add it to the list
                        pathlengths.add(listsize);
                    }
                }
            }
        }
        calculateAvgShortestPath(pathlengths);
    }

    //calculate the average
    public void calculateAvgShortestPath(List<Double> pathlengths){
        Double sum = 0.0;
        for (Double cc: pathlengths){
            sum+= cc;
        }
        Double avgPathLength = sum/pathlengths.size();
        System.out.println("The average path length is: " + avgPathLength);
    }
}
Answer 0 (score: 1)
One quick improvement is to move the line:
dijkstra.execute(vertices.get(i));
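up 6 lines (so that it is inside the i loop, but not the j loop).
This should improve the running time by a factor of the number of nodes (i.e. roughly 3,000 times faster).
It should still give the same results, because Dijkstra's algorithm computes the shortest paths from the start node to all destination nodes, so there is no need to rerun it for every start/end pair.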
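To illustrate the restructured loop, here is a minimal, self-contained sketch. It does not use the `Tree`, `Nets` or `DijkstraAlgorithm` classes from the question: nodes are plain integer indices, the graph is a hypothetical adjacency list, and a breadth-first search stands in for `dijkstra.execute`, which is only equivalent if every link counts as one step. The point is the shape of the loops: the single-source search runs once per start node, and the inner loop only reads results. It also keeps a running total instead of a list of every path length.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class AveragePathLength {

    // Single-source search: number of links from source to every node.
    // -1 marks nodes that cannot be reached from source.
    static int[] hopsFrom(int source, List<List<Integer>> adjacency) {
        int[] hops = new int[adjacency.size()];
        Arrays.fill(hops, -1);
        hops[source] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int node = queue.poll();
            for (int next : adjacency.get(node)) {
                if (hops[next] == -1) {
                    hops[next] = hops[node] + 1;
                    queue.add(next);
                }
            }
        }
        return hops;
    }

    // Average over all ordered pairs (i, j), i != j, for which a path exists.
    static double averagePathLength(List<List<Integer>> adjacency) {
        long total = 0;
        long pairs = 0;
        for (int i = 0; i < adjacency.size(); i++) {
            // One search per start node, outside the loop over end nodes.
            int[] hops = hopsFrom(i, adjacency);
            for (int j = 0; j < hops.length; j++) {
                // hops[j] > 0 skips both the start node itself (0) and unreachable nodes (-1).
                if (hops[j] > 0) {
                    total += hops[j];
                    pairs++;
                }
            }
        }
        return pairs == 0 ? 0.0 : (double) total / pairs;
    }
}
```

For a directed chain 0 -> 1 -> 2, the reachable pairs have lengths 1, 2 and 1, so `averagePathLength` returns 4/3.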
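Answer 1 (score: 0)
There are several optimizations you can make. For example, using a Fibonacci heap (or even the standard Java priority queue) would definitely speed things up. However, you will probably still run into memory problems with a dataset that large. The only real way to handle a dataset of that size is to use a distributed implementation. I believe there is a shortest-path implementation in the Spark GraphX library that you could use.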
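As a rough illustration of the priority-queue suggestion, the sketch below runs Dijkstra's algorithm with `java.util.PriorityQueue`. It is not the code from the question: the `Edge` class, the integer node indices and the adjacency-list layout are assumptions made for the example, and edge weights are assumed to be non-negative. Since `PriorityQueue` has no decrease-key operation, the sketch re-inserts a node whenever its distance improves and skips stale entries when they are polled. The adjacency list also avoids scanning the whole edge list in `getNeighbors` and `getDistance`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class HeapDijkstra {

    // Hypothetical edge type: index of the target node and a non-negative weight.
    static final class Edge {
        final int to;
        final int weight;

        Edge(int to, int weight) {
            this.to = to;
            this.weight = weight;
        }
    }

    // Shortest distance from source to every node; Long.MAX_VALUE means unreachable.
    static long[] shortestDistances(int source, List<List<Edge>> adjacency) {
        long[] dist = new long[adjacency.size()];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[source] = 0;

        // Queue entries are {distance, node}, ordered by distance.
        PriorityQueue<long[]> queue =
                new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        queue.add(new long[] {0, source});

        while (!queue.isEmpty()) {
            long[] entry = queue.poll();
            int node = (int) entry[1];
            if (entry[0] > dist[node]) {
                continue; // stale entry: a shorter distance was already recorded
            }
            for (Edge edge : adjacency.get(node)) {
                long candidate = dist[node] + edge.weight;
                if (candidate < dist[edge.to]) {
                    dist[edge.to] = candidate;
                    queue.add(new long[] {candidate, edge.to});
                }
            }
        }
        return dist;
    }
}
```

Each call keeps only one distance array for the current start node, so memory use per run grows with the number of nodes and edges rather than with the number of node pairs.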