I want to inject my crawl URLs using the command "bin/nutch inject"

Time: 2015-12-08 15:40:43

Tags: nutch

I want to use the command bin/nutch inject to inject my crawl URLs, but I get this error:

'nutch' is not recognized as an internal or external command,
operable program or batch file.

Where should I enter this command? I am currently typing it at the Windows command prompt, in the path C:\Users\Gaurav Kandpal\Desktop\elastic\apache-nutch-2.3-src\apache-nutch-2.3\runtime\local\b.

2 answers:

Answer 0: (score: 3)

  

Follow these steps to install Nutch on Windows:

1) Download and install Cygwin from: https://www.cygwin.com/
2) Download Nutch from: http://nutch.apache.org/downloads.html
3) Copy the downloaded and extracted Nutch folder into C:\cygwin64\home\
4) Open the Cygwin terminal and type the following commands:

 - $ cd C:
 - $ cd cygwin64
 - $ cd home
 - $ cd apache-nutch
 - $ cd src/bin
 - $ ./nutch
  

You will get the following output:

Usage: nutch COMMAND
where COMMAND is one of:
 inject         inject new urls into the database
 hostinject     creates or updates an existing host table from a text file
 generate       generate new batches to fetch from crawl db
 fetch          fetch URLs marked during generate
 parse          parse URLs marked during fetch
.
.
.
.
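
Once `./nutch` prints this usage text, the inject step from the question could be sketched roughly as below. This is only a sketch under assumptions: `NUTCH_HOME`, the `urls/` directory, and the seed URL are placeholders that do not come from the original post. It also assumes the Nutch 2.x form `inject <url_dir>`; the 1.x line, as far as I know, expects a crawldb path before the URL directory.

```shell
#!/bin/sh
# Sketch only: NUTCH_HOME, SEED_DIR and the seed URL are assumptions,
# not values taken from the question. Adjust them to your installation.
NUTCH_HOME="${NUTCH_HOME:-$HOME/apache-nutch/runtime/local}"
SEED_DIR="${SEED_DIR:-urls}"

# The injector reads plain-text seed files with one URL per line.
mkdir -p "$SEED_DIR"
printf '%s\n' 'http://nutch.apache.org/' > "$SEED_DIR/seed.txt"

if [ -x "$NUTCH_HOME/bin/nutch" ]; then
    # bin/nutch is a shell script, so run it from the Cygwin terminal,
    # not from the Windows command prompt (cmd.exe).
    "$NUTCH_HOME/bin/nutch" inject "$SEED_DIR"
else
    echo "nutch script not found under $NUTCH_HOME/bin; adjust NUTCH_HOME" >&2
fi
```

Because `nutch` is a shell script rather than a Windows executable, it needs a Unix-like shell such as the one Cygwin provides, which is why cmd.exe reports that it is not recognized.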

Answer 1: (score: 0)

First, check whether you have compiled the Nutch source code. Then, if Nutch is deployed on a cluster, you should try running it from /path/to/nutch/runtime/deploy/bin.
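
A minimal sketch of that check, assuming a source tree whose build (normally `ant runtime`) produces `runtime/local` and `runtime/deploy`. `NUTCH_SRC` and the helper `nutch_bin_dir` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: NUTCH_SRC is a hypothetical location of the unpacked source tree.
NUTCH_SRC="${NUTCH_SRC:-$HOME/apache-nutch-2.3}"

# nutch_bin_dir ROOT MODE
# Prints ROOT/runtime/MODE/bin if it contains the nutch script.
# MODE is "local" (single machine) or "deploy" (cluster).
nutch_bin_dir() {
    dir="$1/runtime/$2/bin"
    if [ -f "$dir/nutch" ]; then
        echo "$dir"
        return 0
    fi
    return 1
}

if bin_dir=$(nutch_bin_dir "$NUTCH_SRC" deploy); then
    echo "Compiled; for a cluster, run Nutch from: $bin_dir"
elif bin_dir=$(nutch_bin_dir "$NUTCH_SRC" local); then
    echo "Compiled; for a single machine, run Nutch from: $bin_dir"
else
    echo "No runtime/ directory found; build first: cd $NUTCH_SRC && ant runtime"
fi
```

If neither directory exists, the source has most likely not been built yet, which matches the first thing this answer suggests checking.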