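I have to set up a Hadoop stack with Nutch 2.3.1. The supported HBase version for Hadoop 2.7.4 is 1.2.6, which I have successfully configured and tested. But when I compiled Nutch and tried to crawl a sample page, I got the following error.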
/usr/local/nutch/runtime/local/bin/nutch inject urls/ -crawlId kics
InjectorJob: starting at 2017-09-21 14:20:10
InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoSuchFieldError: HBASE_CLIENT_PREFETCH_LIMIT
at org.apache.hadoop.hbase.client.HConnectionKey.<clinit>(HConnectionKey.java:43)
at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:267)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:194)
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:115)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Error running:
According to my searches, e.g. this and this, Nutch 2.3.1 can be compiled for HBase 1.x, but I don't know how to do it. Can someone guide me through the steps?
Answer 0 (score: 1)
Apache Gora 0.7 supports HBase 1.2.3 (+): https://issues.apache.org/jira/browse/GORA-443
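A NoSuchFieldError like the one in the question usually suggests that HBase jars from different versions are mixed on the classpath (an older client class looking for a constant that a newer jar no longer has). A minimal sketch for checking this, assuming the jars live in a `lib` directory under the `runtime/local` path from the question; the directory and the sample jar names in the comments are assumptions, not taken from an actual install:

```python
import os
import re
from collections import defaultdict

# Matches names such as "hbase-client-0.98.8-hadoop2.jar" or "hbase-common-1.2.6.jar"
# (hypothetical examples) and splits them into artifact name and version.
JAR_PATTERN = re.compile(r"^(hbase(?:-[a-z][a-z0-9]*)*)-(\d[\w.-]*)\.jar$")


def hbase_versions(jar_names):
    """Group HBase artifact names by the version found in their jar file names."""
    versions = defaultdict(list)
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:
            versions[match.group(2)].append(match.group(1))
    return dict(versions)


if __name__ == "__main__":
    # Assumed location; adjust to wherever your Nutch build puts its jars.
    lib_dir = "/usr/local/nutch/runtime/local/lib"
    if os.path.isdir(lib_dir):
        found = hbase_versions(os.listdir(lib_dir))
        for version, artifacts in sorted(found.items()):
            print(version, sorted(artifacts))
        if len(found) > 1:
            print("More than one HBase version found; the jars are probably mixed.")
    else:
        print("Directory not found:", lib_dir)
```

If more than one version shows up, rebuilding Nutch against a Gora release that targets your HBase version is the likely way to get a consistent set of jars.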
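You can take a look at https://stackoverflow.com/a/39837926/582789, where I wrote how to modify Nutch 2.3.1 to use Apache Gora 0.7. Regarding the patch in that answer, https://paste.apache.org/jjqz, use "0.7" wherever it says "0.7-SNAPSHOT".
By the way, Apache Gora 0.8 was released yesterday :) Just use 0.8 in place of 0.7.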
http://gora.apache.org/#20-september-2017-apache-gora-08-release
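The version substitution described above can be sketched as a plain text replacement. This is only an illustration: the `set_gora_version` helper and the sample dependency line are hypothetical and are not copied from the actual patch.

```python
def set_gora_version(text, new_version, old_version="0.7-SNAPSHOT"):
    """Return text with every occurrence of old_version replaced by new_version."""
    return text.replace(old_version, new_version)


# Hypothetical sample line, standing in for a dependency entry in the patch.
sample = 'name="gora-hbase" rev="0.7-SNAPSHOT"'

print(set_gora_version(sample, "0.7"))  # the released 0.7
print(set_gora_version(sample, "0.8"))  # or the newer 0.8 release
```

In practice you would read the patch file, apply the replacement to its whole content, and review the result before applying it to the Nutch source tree.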