Getting "Error: Could not find or load main class work\work" using Hadoop with Cygwin

Time: 2013-11-26 21:02:57

Tags: java hadoop cygwin

I'm trying to set up a single node with Hadoop 1.2.1 using Cygwin on Windows 7, following this tutorial. I was able to create the input directory and copy the .xml files into it. The trouble seems to be that when I execute $ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+' it throws "Error: Could not find or load main class work\work" at the command line. I checked the source (listed below; it looks like Python), and it does show a main method. I also tried variations on the original command-line call, such as $ bin/hadoop jar hadoop-examples-*.jar main input output 'dfs[a-z.]+', and so on.

My questions are: why isn't Hadoop reading this main method? How do I get it to read it? What is Cygwin telling me when it says "work\work"? Is it significant that the source is written in Python but compiled into .jar format?
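As a diagnostic sketch (not from the original post): a jar is just a zip archive, so the entry point it actually declares can be checked by reading its META-INF/MANIFEST.MF with Python's standard zipfile module. The jar filename in the usage comment is hypothetical.

```python
import zipfile

def jar_main_class(jar_path):
    """Return the Main-Class declared in a jar's manifest, or None if absent."""
    with zipfile.ZipFile(jar_path) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in manifest.splitlines():
        if line.startswith("Main-Class:"):
            # Everything after the first colon is the class name.
            return line.split(":", 1)[1].strip()
    return None

# Example usage (path is hypothetical):
# print(jar_main_class("hadoop-examples-1.2.1.jar"))
```

If the manifest names a Main-Class, the jar can be run without naming a class on the command line; otherwise the first argument after the jar is taken as the class to run.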

from org.apache.hadoop.fs import Path
from org.apache.hadoop.io import *
from org.apache.hadoop.mapred import *

import sys
import getopt

class WordCountMap(Mapper, MapReduceBase):
    one = IntWritable(1)
    def map(self, key, value, output, reporter):
        for w in value.toString().split():
            output.collect(Text(w), self.one)

class Summer(Reducer, MapReduceBase):
    def reduce(self, key, values, output, reporter):
        sum = 0
        while values.hasNext():
            sum += values.next().get()
        output.collect(key, IntWritable(sum))

def printUsage(code):
    print "wordcount [-m <maps>] [-r <reduces>] <input> <output>"
    sys.exit(code)

def main(args):
    conf = JobConf(WordCountMap);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text);
    conf.setOutputValueClass(IntWritable);

    conf.setMapperClass(WordCountMap);        
    conf.setCombinerClass(Summer);
    conf.setReducerClass(Summer);
    try:
        flags, other_args = getopt.getopt(args[1:], "m:r:")
    except getopt.GetoptError:
        printUsage(1)
    if len(other_args) != 2:
        printUsage(1)

    for f,v in flags:
        if f == "-m":
            conf.setNumMapTasks(int(v))
        elif f == "-r":
            conf.setNumReduceTasks(int(v))
    conf.setInputPath(Path(other_args[0]))
    conf.setOutputPath(Path(other_args[1]))
    JobClient.runJob(conf);

if __name__ == "__main__":
    main(sys.argv)
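For reference, the listing above is Jython (Python syntax over the Hadoop Java API) implementing the classic word count: the mapper emits a (word, 1) pair per token and the reducer sums the counts per word. A plain-Python sketch of the same logic, with no Hadoop involved (function names here are mine, not from the listing):

```python
from collections import defaultdict

def map_phase(lines):
    """Mapper: emit a (word, 1) pair for each whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    """Reducer: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# reduce_phase(map_phase(["a b a", "b"])) -> {'a': 2, 'b': 2}
```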

0 Answers:

No answers yet