I want to use the command bin/nutch inject
to inject my crawl URLs, but I get this error:
'nutch' is not recognized as an internal or external command,
operable program or batch file.
Where should I type this command? When I type it, I am in the Command Prompt at the path C:\Users\Gaurav Kandpal\Desktop\elastic\apache-nutch-2.3-src\apache-nutch-2.3\runtime\local\b
Answer 0: (score: 3)
Follow these steps to install Nutch on Windows:
1) Download and install Cygwin from: https://www.cygwin.com/
2) Download Nutch from: http://nutch.apache.org/downloads.html
3) Extract the downloaded Nutch archive and copy the extracted folder into C:\cygwin64\home\
4) Open the Cygwin terminal and type the following commands:
- $ cd C:
- $ cd cygwin64
- $ cd home
- $ cd apache-nutch
- $ cd src/bin
- $ ./nutch
You will get the output:
Usage: nutch COMMAND
where COMMAND is one of:
  inject       inject new urls into the database
  hostinject   creates or updates an existing host table from a text file
  generate     generate new batches to fetch from crawl db
  fetch        fetch URLs marked during generate
  parse        parse URLs marked during fetch
.
.
.
.
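The steps above can be condensed into a small script for the Cygwin terminal. This is only a sketch under stated assumptions: `NUTCH_HOME` and the helper `find_nutch_bin` are hypothetical names, the default path assumes the extracted folder is called apache-nutch as in step 4, and the three candidate directories are guesses at where the `nutch` script may sit in a source or binary download.

```shell
#!/usr/bin/env bash
# Sketch only: NUTCH_HOME is an assumed location; point it at your extracted folder.
# /cygdrive/c/... is how Cygwin exposes the Windows path C:\...
NUTCH_HOME="${NUTCH_HOME:-/cygdrive/c/cygwin64/home/apache-nutch}"

# Print the first candidate directory under $1 that contains a "nutch" script.
# Returns non-zero if none of the candidates has one.
find_nutch_bin() {
  local dir
  for dir in "$1/runtime/local/bin" "$1/src/bin" "$1/bin"; do
    if [ -f "$dir/nutch" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

if bin_dir="$(find_nutch_bin "$NUTCH_HOME")"; then
  # Run the script from its own directory; with no arguments it prints the usage text.
  cd "$bin_dir" && ./nutch
else
  echo "no nutch script found under $NUTCH_HOME" >&2
fi
```

Because `nutch` is a shell script, it has to be started from a shell such as the Cygwin terminal; the Windows Command Prompt cannot run it directly.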
Answer 1: (score: 0)
First, check whether you have compiled the Nutch source code. Then, if Nutch is deployed on a cluster, you should try running it from /path/to/nutch/runtime/deploy/bin.
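One way to check this is sketched below. The function name `check_nutch_build` is hypothetical, and the sketch assumes a Nutch source tree in which `ant runtime` creates the runtime/local and runtime/deploy directories, each with its own bin/nutch script.

```shell
#!/usr/bin/env bash
# Sketch only: pass the root of your own Nutch source tree as the argument.

# Report which runtime directories under $1 contain a built "nutch" script.
# Returns 0 if at least one was found, 1 otherwise.
check_nutch_build() {
  local root="$1" mode found=1
  for mode in local deploy; do
    if [ -f "$root/runtime/$mode/bin/nutch" ]; then
      echo "$mode: $root/runtime/$mode/bin/nutch"
      found=0
    fi
  done
  if [ "$found" -ne 0 ]; then
    # Assumption: the tree is built with Ant, as in the Nutch source distribution.
    echo "not built: run 'ant runtime' in $root first"
  fi
  return "$found"
}
```

Calling `check_nutch_build /path/to/nutch` lists the scripts it finds. If it reports that nothing is built, compiling the source would be the first thing to try.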