OrientDB - SQL query over millions of vertices causes a Java OutOfMemory error

Time: 2016-05-24 14:46:45

Tags: java out-of-memory orientdb

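I need to create edges between all vertices of class V1 and all vertices of class V2. Each of my classes has 2-3 million vertices. A nested for loop over SELECT * FROM V1 and SELECT * FROM V2 fails with a Java OutOfMemory (heap space) error (see below). This is an offline process that will run once or twice when needed (not a frequent operation), because users do not update the graph regularly; only I do. How can I process the vertices in batches (using SELECT ... LIMIT or g.getVertices()) to avoid this?

Here is my code: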

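        import java.util.Objects;
        import com.orientechnologies.orient.core.intent.OIntentMassiveInsert;
        import com.orientechnologies.orient.core.sql.OCommandSQL;
        import com.tinkerpop.blueprints.Vertex;
        import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;

        OrientGraphNoTx G = MyOrientDBFactory.getNoTx();
        G.setUseLightweightEdges(false);
        G.declareIntent(new OIntentMassiveInsert());

        for (Vertex p1 : (Iterable<Vertex>) G.command(new OCommandSQL("SELECT * FROM V1")).execute())
        {
            // Note: the V2 query is executed again for every V1 vertex.
            for (Vertex p2 : (Iterable<Vertex>) G.command(new OCommandSQL("SELECT * FROM V2")).execute())
            {
                // Compare property values with equals(); == only compares object references.
                if (Objects.equals(p1.getProperty("prop1"), p2.getProperty("prop1")))
                {
                    //p1.addEdge("MyEdge", p2);
                    G.command(new OCommandSQL("create edge MyEdge from " + p1.getId() + " to " + p2.getId() + " retry 100")).execute();
                }
            }
        }
        G.shutdown();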
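
A side note on the property comparison in the loop above: property values come back as objects, so `==` tests whether two references point to the same object, not whether the values are equal. The following standalone sketch illustrates the difference with plain strings; the class and method names are made up for illustration and nothing here touches OrientDB.

```java
import java.util.Objects;

class PropertyEquality {
    // Value comparison that is also safe when either property is missing (null).
    static boolean sameValue(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        // Two distinct objects with equal content, as two separately loaded
        // vertices would typically hold.
        Object left = new String("abc");
        Object right = new String("abc");

        System.out.println(left == right);          // false: different objects
        System.out.println(sameValue(left, right)); // true: equal values
    }
}
```

With `==`, matching vertices can be skipped silently, so fewer edges are created than expected.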

OrientDB 2.1.5 with Java / Graph API

NetBeans 8.1 with VM options -Xmx4096m and -Dstorage.diskCache.bufferSize=7200

Error message in the console:

  

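2016-05-24 15:48:06:112 INFO {db=MyDB} [TIP] Query 'SELECT * FROM V1' returned a result set with more than 10000 records. Check if you really need all these records, or reduce the resultset by using a LIMIT to improve both performance and used RAM [OProfilerStub]
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid7896.hprof ...

Error message in the NetBeans output: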

  

Exception in thread "main" com.orientechnologies.orient.enterprise.channel.binary.OResponseProcessingException: Exception during response processing.
    at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.throwSerializedException(OChannelBinaryAsynchClient.java:443)
    at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.handleStatus(OChannelBinaryAsynchClient.java:398)
    at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:282)
    at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:171)
    at com.orientechnologies.orient.client.remote.OStorageRemote.beginResponse(OStorageRemote.java:2166)
    at com.orientechnologies.orient.client.remote.OStorageRemote.command(OStorageRemote.java:1189)
    at com.orientechnologies.orient.client.remote.OStorageRemoteThread.command(OStorageRemoteThread.java:444)
    at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
    at com.tinkerpop.blueprints.impls.orient.OrientGraphCommand.execute(OrientGraphCommand.java:49)
    at xx.xxx.xxx.xx.MyEdge.(MyEdge.java:40)
    at xx.xxx.xxx.xx.GMain.main(GMain.java:60)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

1 answer:

Answer 0 (score: 0)

As a workaround, you can use code similar to the following:

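    // Total number of V1 vertices; the result of count(*) is read from the "count" property.
    Iterable<Vertex> cv1 = g.command(new OCommandSQL("SELECT count(*) FROM V1")).execute();
    long counterv1 = cv1.iterator().next().getProperty("count");

    // Cluster ids backing class V1; only the first cluster (ids[0]) is read below.
    int[] ids = g.getRawGraph().getMetadata().getSchema().getClass("V1").getClusterIds();

    long repeat = counterv1 / 10000;          // number of full batches
    long rest = counterv1 - (repeat * 10000); // size of the final partial batch

    // Note: v1 still accumulates every vertex; to bound memory, process each batch inside the loop instead.
    List<Vertex> v1 = new ArrayList<Vertex>();
    // Start position of the next batch; this assumes record positions in the cluster are contiguous (no deleted records).
    int rid = 0;
    for (int i = 0; i < repeat; i++) {
        Iterable<Vertex> v = g.command(new OCommandSQL("SELECT * FROM V1 WHERE @rid >= " + ids[0] + ":" + rid + " limit 10000")).execute();
        // CollectionUtils is assumed to be the Apache Commons Collections helper.
        CollectionUtils.addAll(v1, v.iterator());
        rid = 10000 * (i + 1);
    }
    if (rest > 0) {
        Iterable<Vertex> v = g.command(new OCommandSQL("SELECT * FROM V1 WHERE @rid >= " + ids[0] + ":" + rid + " limit " + rest)).execute();
        CollectionUtils.addAll(v1, v.iterator());
    }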
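
The workaround above computes batch start positions up front. A common alternative is to page by the last key seen: each query asks for records whose key is greater than the last one of the previous batch, and the loop stops when a batch comes back short. In OrientDB the corresponding query would look something like `SELECT FROM V1 WHERE @rid > <lastRid> LIMIT <batchSize>`. The sketch below shows only the loop logic against an in-memory list; `Row`, `forEachBatch` and `inMemorySource` are hypothetical names, and a plain `long` key stands in for `@rid`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

class BatchReader {
    /** Hypothetical minimal record: a sortable key (standing in for @rid) plus one property. */
    static final class Row {
        final long key;
        final String prop1;
        Row(long key, String prop1) { this.key = key; this.prop1 = prop1; }
    }

    /**
     * Repeatedly calls fetchPage(lastKey, batchSize), which must return at most
     * batchSize rows with key > lastKey in ascending key order, and hands each
     * page to the handler. Returns the number of rows processed.
     */
    static long forEachBatch(BiFunction<Long, Integer, List<Row>> fetchPage,
                             int batchSize,
                             Consumer<List<Row>> handler) {
        long lastKey = -1L; // assumed to be smaller than any real key
        long total = 0;
        while (true) {
            List<Row> page = fetchPage.apply(lastKey, batchSize);
            if (page.isEmpty()) break;
            handler.accept(page);                    // only one page is held at a time
            total += page.size();
            lastKey = page.get(page.size() - 1).key; // resume after the last row seen
            if (page.size() < batchSize) break;      // short page: nothing left
        }
        return total;
    }

    /** In-memory stand-in for the database query; table must be sorted by key. */
    static BiFunction<Long, Integer, List<Row>> inMemorySource(List<Row> table) {
        return (last, limit) -> {
            List<Row> page = new ArrayList<>();
            for (Row r : table) {
                if (r.key > last && page.size() < limit) page.add(r);
            }
            return page;
        };
    }

    public static void main(String[] args) {
        // Keys with gaps, to mimic deleted records.
        List<Row> table = new ArrayList<>();
        for (long k = 0; k < 25; k += 2) table.add(new Row(k, "p" + k));

        long n = forEachBatch(inMemorySource(table), 5,
                page -> System.out.println("batch of " + page.size()));
        System.out.println("total " + n);
    }
}
```

Because each batch resumes from the last key actually returned, gaps in the key sequence do not cause records to be skipped or read twice, and no up-front count is needed.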

Hope it helps.
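
Batching bounds memory, but the nested loop in the question still re-runs the V2 query once per V1 vertex. One possible way around that, sketched here under the assumption that the id and `prop1` of every V1 vertex (or of one batch of them) fit in memory, is to group the V1 ids by `prop1` first and then look up each V2 row in that map. The names `PropertyJoin`, `indexByProperty` and `matchPairs` are hypothetical, rows are plain `{id, prop1}` string pairs, and the sample ids are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PropertyJoin {
    /** Groups ids by property value; each row is an {id, prop1} pair. */
    static Map<String, List<String>> indexByProperty(List<String[]> rows) {
        Map<String, List<String>> index = new HashMap<>();
        for (String[] row : rows) {
            index.computeIfAbsent(row[1], k -> new ArrayList<>()).add(row[0]);
        }
        return index;
    }

    /** Returns a "from->to" pair for every V1 id whose prop1 equals that of a V2 row. */
    static List<String> matchPairs(Map<String, List<String>> v1Index, List<String[]> v2Batch) {
        List<String> pairs = new ArrayList<>();
        for (String[] row : v2Batch) {
            List<String> sources = v1Index.get(row[1]);
            if (sources == null) continue; // no V1 vertex shares this value
            for (String from : sources) pairs.add(from + "->" + row[0]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up sample data: {id, prop1}
        List<String[]> v1 = Arrays.asList(new String[]{"#12:0", "a"}, new String[]{"#12:1", "b"});
        List<String[]> v2 = Arrays.asList(new String[]{"#13:0", "b"}, new String[]{"#13:1", "c"});
        for (String pair : matchPairs(indexByProperty(v1), v2)) {
            System.out.println(pair); // in real code, one edge would be created per pair
        }
    }
}
```

Each class is then read once instead of V2 being read once per V1 vertex, and the map lookup uses value equality rather than reference comparison.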