尝试在Hadoop上运行Mahout测试分类器时“超出了GC开销限制”

时间:2014-03-06 14:17:37

标签: hadoop mahout

我在Linux上使用Hadoop版本0.20.2

我正在尝试使用以下命令测试分类器模型:

bin/hadoop jar /usr/local/mahout/examples/target/mahout-examples-0.6-job.jar \
org.apache.mahout.classifier.bayes.TestClassifier -m wikipediamodel -d wikipediainput

但是我收到以下错误:

    14/03/06 08:57:36 INFO common.HadoopUtil: Deleting wikipediainput-output
    14/03/06 08:58:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    14/03/06 08:58:33 INFO mapred.FileInputFormat: Total input paths to process : 2
    14/03/06 08:58:34 INFO mapred.JobClient: Running job: job_201403060857_0001
    14/03/06 08:58:35 INFO mapred.JobClient:  map 0% reduce 0%
    14/03/06 09:06:27 INFO mapred.JobClient: Task Id : attempt_201403060857_0001_m_000000_0, Status : FAILED
    java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 5 more
    Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        ... 10 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 13 more
    Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.nio.ByteBuffer.wrap(ByteBuffer.java:369)
        at org.apache.hadoop.io.Text.decode(Text.java:327)
        at org.apache.hadoop.io.Text.toString(Text.java:254)
        at org.apache.mahout.common.StringTuple.readFields(StringTuple.java:143)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1836)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:525)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadFeatureWeights(SequenceFileModelReader.java:72)
        at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:46)
        at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
        at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
        at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:120)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)

    14/03/06 09:06:43 INFO mapred.JobClient: Task Id : attempt_201403060857_0001_m_000001_0, Status : FAILED
    java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 5 more
    Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        ... 10 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 13 more
    Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:561)
        at java.nio.CharBuffer.toString(CharBuffer.java:1201)
        at org.apache.hadoop.io.Text.decode(Text.java:350)
        at org.apache.hadoop.io.Text.decode(Text.java:327)
        at org.apache.hadoop.io.Text.toString(Text.java:254)
        at org.apache.mahout.common.StringTuple.readFields(StringTuple.java:143)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1836)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:525)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadFeatureWeights(SequenceFileModelReader.java:72)
        at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:46)
        at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
        at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
        at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:120)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

可以做些什么呢?

2 个答案:

答案 0 :(得分:1)

对于原始问题,请切换到使用Mahout 0.9-很多已经从Mahout 0.6更改,不再支持0.8之前的版本。

答案 1 :(得分:0)

Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded表示您必须调整已映射的子项的内存设置,例如:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
</property>

hadoop/conf/mapred-site.xml中,或者更好的是,在你的工作配置中。

您可能需要稍微玩一下才能找出有效的方法。为孩子保留太多记忆也行不通。