Nutch throws java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation

Date: 2016-05-19 05:53:20

Tags: nutch

I am using the following stack for web crawling with Nutch:

  • Hadoop: 2.5.2
  • HBase: 0.98.12-hadoop2
  • Gora: 0.6.1

But when I inject URLs with this command:

hadoop@ubuntu:~$ nutch inject /home/gsingh/urls/seed.txt

I get the following error:

> InjectorJob: starting at 2016-05-19 11:12:57
> InjectorJob: Injecting urlDir: /home/gsingh/urls/seed.txt
> InjectorJob: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
>         at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:214)
>         at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2559)
>         at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2569)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2586)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:372)
>         at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:212)
>         at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
>         at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)

Here is the classpath value:

hadoop@ubuntu:~$ echo $CLASSPATH
/usr/local/nutch/runtime/local/lib/*:.
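A commonly reported cause of this exception is a mix of Hadoop jar versions on the classpath, for example an older HDFS jar whose DistributedFileSystem does not override getScheme() sitting next to a newer hadoop-common. That is an assumption here and is not confirmed by anything in the question. The sketch below is one way to check for it: it lists the distinct version strings of the hadoop-*.jar files in a directory. The function name hadoop_jar_versions is hypothetical, and the jar naming pattern (name-version.jar) is assumed.

```shell
# Hypothetical helper: print the distinct versions of hadoop-*.jar files
# found in the directory given as $1. More than one line of output
# suggests that mixed Hadoop versions are present.
hadoop_jar_versions() {
  dir="$1"
  for jar in "$dir"/hadoop-*.jar; do
    # Skip the unexpanded glob when no jar matches.
    [ -e "$jar" ] || continue
    name=$(basename "$jar" .jar)
    # Assumes names like hadoop-hdfs-2.5.2: drop everything up to the
    # first "-<digit>" so that only the version part remains.
    printf '%s\n' "$name" | sed 's/^[^0-9]*-\([0-9].*\)$/\1/'
  done | sort -u
}
```

With the classpath shown above, it could be run as `hadoop_jar_versions /usr/local/nutch/runtime/local/lib`. If the output shows a version other than 2.5.2, that mismatch would be the first thing to look into.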

Does anyone know how to fix this error?

0 answers