Nutch 2.2.1 + HBase

Time: 2013-07-04 14:23:14

Tags: java hbase nutch gora

I am trying to run the new version of Apache Nutch to crawl. When I launch the bin/crawl script, it fails, and hadoop.log says:

java.lang.Exception: java.lang.NoSuchMethodError: org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NoSuchMethodError: org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:177)

Here is the full log:

2013-07-04 16:12:05,069 WARN  mapred.LocalJobRunner - job_local1522971864_0001
java.lang.Exception: java.lang.NoSuchMethodError:     org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NoSuchMethodError:     org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:177)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:191)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:88)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

2013-07-04 16:12:05,720 ERROR crawl.InjectorJob - InjectorJob: java.lang.RuntimeException: job failed: name=[new]inject /opt/ir/nutch2/urls, jobid=job_local1522971864_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)

Should I configure some Gora artifacts in ivy.xml? Please help.

1 answer:

Answer 0 (score: 0)

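Solved. You have to add the correct version of gora-hbase to the libraries: gora-hbase-0.3.jar. The NoSuchMethodError came from HBaseStore in gora-hbase calling a Persistent.getSchema() method that the Gora classes on the classpath did not provide, which points to mismatched Gora jar versions.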
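To confirm which jar a class is actually loaded from, and whether it exposes the method named in a NoSuchMethodError, a small reflection check can help. The following is a minimal sketch and not part of Nutch or Gora: the class name MethodProbe and its helper methods are hypothetical, and it only looks for a public no-argument method by name, so it does not verify the return type that appears in the error's descriptor.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Nutch or Gora.
public class MethodProbe {

    /** True if the class loads and has a public no-argument method with this name. */
    public static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className);
            c.getMethod(methodName); // throws NoSuchMethodException if absent
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Location (usually a jar URL) the class was loaded from, or a placeholder string. */
    public static String locationOf(String className) {
        try {
            CodeSource src = Class.forName(className)
                    .getProtectionDomain()
                    .getCodeSource();
            // Classes from the JDK itself typically report no code source.
            return src == null ? "(bootstrap class path)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on class path)";
        }
    }

    public static void main(String[] args) {
        // Defaults are the class and method named in the stack trace above.
        String cls = args.length > 0 ? args[0] : "org.apache.gora.persistency.Persistent";
        String method = args.length > 1 ? args[1] : "getSchema";
        System.out.println(cls + " loaded from: " + locationOf(cls));
        System.out.println("has " + method + "(): " + hasNoArgMethod(cls, method));
    }
}
```

Run it with the same jars Nutch uses on the class path, for example `java -cp "runtime/local/lib/*:." MethodProbe`, assuming a local runtime built into runtime/local. If the reported jar is not the Gora version that gora-hbase expects, or getSchema() is reported as missing, the Gora jars are mismatched.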