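In my network diffusion research, I have the following code that models vertices in a lightweight framework. The original prototype came from a framework in Python, which I translated into Java. The problem is that while this code runs faster than the Python version for 10,000 vertices, it grinds to a halt for larger numbers of vertices (100,000+). In fact, the Python version finishes in 1.2 minutes, while the Java version has still not returned after 7 minutes. I'm not sure why the same code breaks down at a large number of vertices, and I need help fixing it.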
import java.util.*;
public class Vertex
{
    private int id;
    private HashMap<Integer, Double> connectedTo;
    private int status;
    public Vertex(int key)
    {
        this.id = key;
        this.connectedTo = new HashMap<Integer, Double>();
        this.status = 0;
    }
    public void addNeighbour(int nbr, double weight)
    {
        this.connectedTo.put(nbr, weight);
    }
    public int getId()
    {
        return this.id;
    }
    public double getWeight(int nbr)
    {
        return this.connectedTo.get(nbr);
    }
    public int getStatus()
    {
        return this.status;
    }
    public Set<Integer> getConnections()
    {
        return this.connectedTo.keySet();
    }
    //testing the class
    public static void main(String[] args)
    {
        int noOfVertices = 100000;
        Vertex[] vertexList = new Vertex[noOfVertices];
        for (int i = 0; i < noOfVertices; i++) {
            vertexList[i] = new Vertex(i);
        }
        for (Vertex v : vertexList) {
            int degree = (int)(500*Math.random()); //random choice of degree
            int neighbourCount = 0; // count number of neighbours built up
            while (neighbourCount <= degree) {
                int nbr = (int) (noOfVertices * Math.random()); // randomly choose a neighbour
                double weight = Math.random(); // randomly assign a weight for the relationship
                v.addNeighbour(nbr, weight);
                neighbourCount++;
            }
        }
    }
}
For reference, the Python version of this code is as follows:
import random
class Vertex:
    def __init__(self, key):
        self.id = key
        self.connectedTo = {}
    def addNeighbor(self, nbr, weight=0):
        self.connectedTo[nbr] = weight
    def __str__(self):
        return str(self.id) + ' connectedTo: ' \
            + str([x.id for x in self.connectedTo])
    def getConnections(self):
        return self.connectedTo.keys()
    def getId(self):
        return self.id
    def getWeight(self, nbr):
        return self.connectedTo[nbr]
if __name__ == '__main__':
    numberOfVertices = 100000
    vertexList = [Vertex(i) for i in range(numberOfVertices)] # list of vertices
    for vertex in vertexList:
        degree = 500*random.random()
        # build up neighbors one by one
        neighbourCount = 0
        while neighbourCount <= degree:
            neighbour = random.choice(range(numberOfVertices))
            weight = random.random() # random choice of weight
            vertex.addNeighbor(neighbour, weight)
            neighbourCount = neighbourCount + 1
Answer 0 (score: 0)
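This is a very interesting problem, and I believe I learned something new from it as well. I tried optimizing the code in several ways, such as using a parallel stream and using ThreadLocalRandom, which can be up to three times faster than Random. In the end, though, I found the main bottleneck: the memory allocated to the JVM.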
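Because so many elements are being added to your Maps (in the worst case up to 500 neighbours for each of the 100,000 vertices, i.e. on the order of 50 million entries), you need a lot of memory (heap space). If the JVM is left to grow the heap dynamically, the program takes a very long time to execute. I resolved this by pre-allocating memory for the JVM (3 GB specifically), passing -Xms3G as a VM argument in the program's run configuration; this can be done in the IDE or from the terminal.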
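To get a feel for why the heap size matters here, it can help to estimate the number of map entries and compare that with what the JVM actually has available. The following is a rough sketch rather than a measurement: the class name HeapCheck, its helper methods, and the figure of 64 bytes per entry are assumptions introduced for illustration (the real per-entry cost depends on the JVM and its settings). Only the Runtime calls are standard API.

```java
class HeapCheck {
    // Assumed ballpark cost of one HashMap<Integer, Double> entry
    // (map node plus boxed key and value). Hypothetical figure, not measured.
    static final long ASSUMED_BYTES_PER_ENTRY = 64;

    // Expected number of entries if every vertex receives between 1 and
    // maxDegree neighbours, chosen uniformly (duplicate neighbours ignored).
    static long expectedEntries(long vertices, long maxDegree) {
        return vertices * (maxDegree + 1) / 2;
    }

    // Rough heap demand for the given number of entries under the assumption above.
    static long estimatedBytes(long entries) {
        return entries * ASSUMED_BYTES_PER_ENTRY;
    }

    public static void main(String[] args) {
        long entries = expectedEntries(100_000, 500);
        Runtime rt = Runtime.getRuntime();
        long mib = 1024L * 1024L;
        System.out.println("expected entries:     " + entries);
        System.out.println("estimated need (MiB): " + estimatedBytes(entries) / mib);
        // totalMemory() is the heap currently reserved by the JVM;
        // when started with -Xms3G it should begin near 3 GiB.
        System.out.println("current heap (MiB):   " + rt.totalMemory() / mib);
        // maxMemory() is the ceiling the heap may grow to (controlled by -Xmx).
        System.out.println("maximum heap (MiB):   " + rt.maxMemory() / mib);
    }
}
```

Running this once with -Xms3G and once without should show the difference in the starting heap. If the default maximum heap on a given machine is below 3 GB, -Xmx would presumably need to be raised as well.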
I have also optimized your code, which I post below (it completes in just a few seconds):
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;
public class Test {
    public static void main(String[] args) {
        int noOfVertices = 100_000;
        Vertex[] vertexList = new Vertex[noOfVertices];
        // Note: forEachOrdered handles elements one at a time in encounter order,
        // which limits how much parallel() can gain here.
        IntStream.range(0, noOfVertices).parallel().forEachOrdered(i -> {
            // Obtain the generator in the executing thread; a ThreadLocalRandom
            // should not be shared across threads through a static field.
            ThreadLocalRandom random = ThreadLocalRandom.current();
            vertexList[i] = new Vertex(i);
            int degree = (int) (500 * random.nextDouble()); // random choice of degree
            for (int j = 0; j <= degree; j++) {
                int nbr = (int) (noOfVertices * random.nextDouble()); // randomly choose a neighbor
                vertexList[i].addNeighbour(nbr, random.nextDouble());
            }
        });
    }
}
class Vertex {
    private int id;
    private Map<Integer, Double> connectedTo;
    private int status;
    public Vertex(int id) {
        this.id = id;
        this.connectedTo = new HashMap<>(500); // pre-size for up to 500 neighbours
    }
    public void addNeighbour(int nbr, double weight) {
        this.connectedTo.put(nbr, weight);
    }
    public int getId() {
        return this.id;
    }
    public double getWeight(int nbr) {
        return this.connectedTo.get(nbr);
    }
    public int getStatus() {
        return this.status;
    }
    public Set<Integer> getConnections() {
        return this.connectedTo.keySet();
    }
}
I'm not sure of the exact consequences of using ThreadLocalRandom in a multithreaded environment, but you can switch it back to Math#random if you prefer.
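On the ThreadLocalRandom question: the class's documentation recommends obtaining the generator through ThreadLocalRandom.current() at the point of use, so that each thread works with its own instance instead of one reference held in a shared field. A minimal sketch of that pattern follows. The class name RandomPerThread and the helper randomNeighbours are hypothetical names introduced for illustration, and plain int arrays stand in for the Vertex maps to keep the example short.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

class RandomPerThread {
    // Builds one row of random neighbour ids in [0, vertices) per vertex.
    static int[][] randomNeighbours(int vertices, int maxDegree) {
        int[][] rows = new int[vertices][];
        IntStream.range(0, vertices).parallel().forEach(i -> {
            // Fetch the generator inside the task so each worker thread
            // uses its own instance.
            ThreadLocalRandom random = ThreadLocalRandom.current();
            int degree = random.nextInt(maxDegree) + 1; // 1..maxDegree
            int[] row = new int[degree];
            for (int j = 0; j < degree; j++) {
                row[j] = random.nextInt(vertices);
            }
            rows[i] = row; // each task writes a distinct slot, so no locking is needed
        });
        return rows;
    }

    public static void main(String[] args) {
        int[][] rows = randomNeighbours(1_000, 500);
        long total = 0;
        for (int[] row : rows) {
            total += row.length;
        }
        System.out.println("total neighbour slots: " + total);
    }
}
```

Used this way, no generator state is shared between threads, which is the situation the class is designed for.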