Question

When I type the following command in Cygwin:

bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*

the binary works fine. When I put the exact same line into my bash script:

#!/bin/bash/
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*

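I get an error message saying that some files do not exist. This may be specific to Nutch, the program I am running, but I think it has more to do with how I am invoking the command in the script. Any ideas about what is wrong and how to fix it? (Yes, I am using tab completion.)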

Edit:

The script:

#!/bin/bash
/home/Dan/apache-nutch-1.2/bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*

The commands I run:

$ pwd
/home/Dan/apache-nutch-1.2
$ ./nutch.sh

The output I get is:

Indexer: starting at 2010-11-29 15:15:44
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_fetch
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_parse
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_data
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_text
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
    at org.apache.nutch.indexer.Indexer.index(Indexer.java:76)
    at org.apache.nutch.indexer.Indexer.run(Indexer.java:97)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.indexer.Indexer.main(Indexer.java:106)

Regards, ~DS

Answer 1

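Two things:

You have a trailing slash after "bash" in the shebang at the top of your script. Remove it so that the line reads just #!/bin/bash. Also double-check that bash is actually in /bin.
The script will try to execute nutch from the bin directory inside your current folder. So if you are in $HOME, and assuming the path $HOME/bin/nutch exists, you will be fine. But if you change to /tmp, it will fail, because there is no such path as /tmp/bin/nutch. You are better off giving nutch's full absolute pathname.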
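
To make both points concrete, here is a small self-contained sketch. It does not call the real Nutch: it creates a stand-in bin/nutch in a temporary directory, and the names demo_home and other_dir are made up for this demo. It first checks that /bin/bash exists, then shows that a relative bin/nutch only resolves when the current directory is the install directory, while the absolute path works from anywhere.

```shell
#!/bin/bash
# Demo only: "nutch" here is a stand-in script in a temporary directory,
# not the real Nutch launcher. demo_home and other_dir are hypothetical names.

# 1. Confirm that the interpreter named in the shebang exists.
if [ -x /bin/bash ]; then
    echo "/bin/bash found"
else
    echo "/bin/bash missing; bash resolves to: $(command -v bash)"
fi

# 2. Build a pretend install directory containing bin/nutch,
#    plus an unrelated, empty directory.
demo_home=$(mktemp -d)
other_dir=$(mktemp -d)
mkdir -p "$demo_home/bin"
printf '#!/bin/bash\necho "stand-in nutch ran with: $*"\n' > "$demo_home/bin/nutch"
chmod +x "$demo_home/bin/nutch"

# 3. A relative path is resolved against the current directory.
cd "$demo_home" || exit 1
bin/nutch index
rel_from_home=$?

cd "$other_dir" || exit 1
bin/nutch index 2>/dev/null      # no bin/nutch here, so this fails
rel_from_other=$?

# 4. An absolute path works regardless of the current directory.
"$demo_home/bin/nutch" index
abs_from_other=$?

echo "relative, from install dir: exit $rel_from_home"
echo "relative, from other dir:   exit $rel_from_other"
echo "absolute, from other dir:   exit $abs_from_other"

# Clean up the temporary directories.
cd / && rm -rf "$demo_home" "$other_dir"
```

Note that the same reasoning applies to relative arguments such as crawl/crawldb: they are also interpreted relative to whatever directory the script is started from, so either cd to the install directory inside the script or pass absolute paths for those as well.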

Bash script command problem
