从纱线边缘读取字节时出现OutOfMemory错误

时间:2015-08-20 05:07:02

标签: hadoop yarn giraph

我正在使用纱线进行BFS算法,并为顶点数据(顶点数据)创建自定义值。但是,在我这样做之后,读取边缘的过程出了点问题。

我将错误跟踪到以下代码行:

  • 在ByteArrayEdges中,变量 serializedEdgesBytesUsed 获取值1987015248并在分配新数组时出现OutOfMemory错误(java限制为64K,就我而言知道)

    @Override
    public void readFields(DataInput in) throws IOException {
    serializedEdgesBytesUsed = in.readInt();
    if (serializedEdgesBytesUsed > 0) {
      // Only create a new buffer if the old one isn't big enough
      if (serializedEdges == null ||
          serializedEdgesBytesUsed > serializedEdges.length) {
        serializedEdges = new byte[serializedEdgesBytesUsed];
      }
      in.readFully(serializedEdges, 0, serializedEdgesBytesUsed);
    }
    edgeCount = in.readInt();
    

    }

我不确定为什么会这样,但在使用自定义顶点数据之前,此问题不存在。

完整的日志在这里(我直接从eclipse进行测试,因为在伪分布式集群中要困难得多):

2015-08-20 01:52:21,103 INFO  [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(315)) - waitFor: Future result not ready yet java.util.concurrent.FutureTask@b2dd686
2015-08-20 01:52:21,103 INFO  [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(197)) - waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
2015-08-20 01:53:12,527 ERROR [LocalJobRunner Map Task Executor #0] graph.GraphMapper (GraphMapper.java:run(101)) - Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    ... 4 more
2015-08-20 01:53:12,532 ERROR [LocalJobRunner Map Task Executor #0] worker.BspServiceWorker (BspServiceWorker.java:unregisterHealth(777)) - unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_local1113753160_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/localhost_0 on superstep -1
2015-08-20 01:53:12,558 INFO  [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - map task executor complete.
2015-08-20 01:53:12,562 WARN  [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1113753160_0001
java.lang.Exception: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    ... 8 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    ... 4 more

用于执行此操作的终端的行是:

$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/gaph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar algoritmos.masivos.BusquedaDeCaminosNavegacionalesWikiquotesMasivo lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithValueDoubleInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -vof pruebas.IdTextWithValueTextOutputFormat -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250

也许我应该使用EdgeInputFormat

感谢阅读。

1 个答案:

答案 0 :(得分:1)

我认为实际问题是分配给Maptask容器的内存不足导致Java堆空间错误。

要快速解决此问题,您可能更愿意通过在配置中分配更多内存来扩展纱线图/缩小节点的内存容器。

请为yarn-site.xml中的以下属性集分配更多内存。

mapreduce.map.memory.mb
mapreduce.reduce.memory.mb

mapreduce.map.java.opts
mapreduce.reduce.java.opts

[注意:* .memory.mb属性应该高于* .java.opts属性]