My Hadoop 2.7 cluster is running fine; I tested it by running the WordCount example.
Now I am trying to run the Apache Nutch crawler on that Hadoop cluster.
Steps completed so far:
a) Downloaded Apache Nutch 1.11
b) Changed the conf files as required
c) Built Nutch with Ant
d) Ran apache-nutch-1.11.job with the following command:
hadoop jar apache-nutch-1.11.job org.apache.nutch.crawl.Injector /user/hduser/nutchData /user/hduser/urls
It gave me the following error:
ERROR crawl.Injector: Injector: java.lang.IllegalArgumentException: Wrong FS: hdfs://master:9000/user/hduser/nutchData/333860923, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:646)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:604)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1437)
at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:506)
at org.apache.nutch.crawl.CrawlDb.install(CrawlDb.java:169)
at org.apache.nutch.crawl.Injector.inject(Injector.java:354)
at org.apache.nutch.crawl.Injector.run(Injector.java:379)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.Injector.main(Injector.java:369)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
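
Reading the trace from the bottom up, the rename in CrawlDb.install ends up in ChecksumFileSystem and RawLocalFileSystem, so an hdfs:// path is being handed to a local (file:///) file system object. The sketch below is only a simplified illustration of that kind of scheme check, written with plain java.net.URI. It is not Hadoop's actual implementation; the class name WrongFsSketch and its checkPath method are hypothetical names made up for this example.

```java
import java.net.URI;

public class WrongFsSketch {
    // Hypothetical, simplified stand-in for a "Wrong FS" style check:
    // a fully qualified path must use the same scheme as the file system
    // object it is passed to.
    public static void checkPath(URI fileSystemUri, URI path) {
        String pathScheme = path.getScheme();
        if (pathScheme == null) {
            // No scheme on the path: it is resolved against the file system itself.
            return;
        }
        if (!pathScheme.equalsIgnoreCase(fileSystemUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fileSystemUri);
        }
    }

    public static void main(String[] args) {
        URI local = URI.create("file:///");
        URI hdfs = URI.create("hdfs://master:9000");
        URI path = URI.create("hdfs://master:9000/user/hduser/nutchData/333860923");

        // Same scheme on both sides: the check passes silently.
        checkPath(hdfs, path);

        // hdfs:// path given to a local file system: the check fails.
        try {
            checkPath(local, path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call reproduces the shape of the message in the trace: the path carries the hdfs scheme while the file system object reports file:///. That suggests the file system used for the rename was resolved as the local one, although the trace alone does not show why.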