I have JAVA_HOME set in my Unix environment. The problem looks like it is related to the class path, but I'm not sure...
When I run this command line:
ahmed@ubuntu:~/apache-nutch-1.9/bin$ ./nutch bin/Crawl
I get this exception:
Exception in thread "main" java.lang.NoClassDefFoundError: bin/Crawl
Caused by: java.lang.ClassNotFoundException: bin.Crawl
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: bin/Crawl. Program will exit.
Could someone help me with this?
Answer 0 (score: 1)
There is no command named 'bin/Crawl'.
If you run ./bin/nutch with no arguments, it prints the list of commands:
Usage: nutch COMMAND
where COMMAND is one of:
inject inject new urls into the database
hostinject creates or updates an existing host table from a text file
generate generate new batches to fetch from crawl db
fetch fetch URLs marked during generate
parse parse URLs marked during fetch
updatedb update web table after parsing
updatehostdb update host table after parsing
readdb read/dump records from page database
readhostdb display entries from the hostDB
index run the plugin-based indexer on parsed batches
elasticindex run the elasticsearch indexer - DEPRECATED use the index command instead
solrindex run the solr indexer on parsed batches - DEPRECATED use the index command instead
solrdedup remove duplicates from solr
solrclean remove HTTP 301 and 404 documents from solr - DEPRECATED use the clean command instead
clean remove HTTP 301 and 404 documents and duplicates from indexing backends configured via plugins
parsechecker check the parser for a given url
indexchecker check the indexing filters for a given url
plugin load a plugin and run one of its classes main()
nutchserver run a (local) Nutch server on a user defined port
webapp run a local Nutch web application
junit runs the given JUnit test
or
CLASSNAME run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
Because no 'bin/Crawl' command exists, nutch assumes it is a CLASSNAME and tries to run it as a Java class, which is why you get the error.
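To illustrate the fallback, here is a minimal sketch of a launcher that maps known command names to classes and treats anything else as a class name. It is not the actual bin/nutch script: `resolve_class` and the `com.example.*` class names are hypothetical placeholders chosen for this example.

```shell
#!/bin/sh
# Simplified illustration only; NOT the real bin/nutch script.
# resolve_class and the com.example.* names are hypothetical.
resolve_class() {
  case "$1" in
    inject)   echo "com.example.InjectTool" ;;    # known command -> its class
    generate) echo "com.example.GenerateTool" ;;  # known command -> its class
    *)        echo "$1" ;;                        # unknown: passed through as CLASSNAME
  esac
}

resolve_class inject      # prints com.example.InjectTool
resolve_class bin/Crawl   # prints bin/Crawl (no such command, so used as a class name)
```

With a dispatch like this, `bin/Crawl` reaches the JVM unchanged, and Java then looks for a class named bin.Crawl, which matches the ClassNotFoundException in the question.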
There used to be a ./bin/nutch crawl command (now deprecated), but there is now a dedicated crawl script. Use it instead:
./bin/crawl