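I'm using this code to run a word count Hadoop job. WordCountDriver runs when I launch it from inside Eclipse using the Hadoop Eclipse plugin. It also runs from the command line when I package the mapper and reducer classes as a jar and put that jar on the classpath.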

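However, if I try to run it from the command line without packaging the mapper and reducer classes as a jar, it fails, even though I add both classes to the classpath. I'd like to know whether Hadoop has some restriction against accepting mapper and reducer classes as plain class files. Is creating a jar always mandatory?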

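import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {

    public static final String HADOOP_ROOT_DIR = "hdfs://universe:54310/app/hadoop/tmp";

    static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private Text word = new Text();
        private final IntWritable one = new IntWritable(1);

        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line.toLowerCase());
            while (itr.hasMoreTokens()) {
                // emit (token, 1) for every token on the line
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {

            int sum = 0;

            for (IntWritable value : values) {
                sum += value.get(); // process value
            }
            context.write(key, new IntWritable(sum));
        }
    }

    @Override
    public int run(String[] args) throws Exception {

        Configuration conf = getConf();

        conf.set("mapred.job.tracker", "universe:54311");

        Job job = new Job(conf, "Word Count");

        // specify output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // specify input and output dirs
        FileInputFormat.addInputPath(job, new Path(HADOOP_ROOT_DIR + "/input"));
        FileOutputFormat.setOutputPath(job, new Path(HADOOP_ROOT_DIR + "/output"));

        // specify a mapper
        job.setMapperClass(WordCountMapper.class);

        // specify a reducer
        job.setReducerClass(WordCountReducer.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new WordCountDriver(), args);
        System.exit(res);
    }
}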

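It's not entirely clear which classpath you're referring to, but in the end, if you're running on a remote Hadoop cluster, you need to provide all of your classes in a JAR file, which is sent to Hadoop during the hadoop jar execution. The classpath of your local program is irrelevant.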
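
To illustrate why loose class files on the local classpath are not enough: as far as I know, Hadoop's `Job.setJarByClass`, the usual way to tell a job which JAR to ship, works by checking where the given class was loaded from, and a class loaded from a plain directory yields no JAR at all. The sketch below imitates that kind of lookup using only the JDK. `JarLocator`, `jarPathOf` and `findContainingJar` are hypothetical names of my own, not Hadoop API, and the real Hadoop code may differ in detail.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

// Hypothetical helper (not part of Hadoop): reports which JAR, if any, a class came from.
public class JarLocator {

    // Returns the JAR path encoded in a resource URL, or null when the
    // resource is a loose file (for example protocol "file").
    public static String jarPathOf(String protocol, String path) {
        if (!"jar".equals(protocol)) {
            return null; // class was loaded from a directory, not from a JAR
        }
        // The path of a jar URL looks like "file:/tmp/wordcount.jar!/WordCountDriver.class".
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        try {
            path = URLDecoder.decode(path, "UTF-8"); // turn %20 etc. back into characters
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        return path.replaceAll("!.*$", ""); // drop the "!/entry" suffix
    }

    // Asks the class loader where the .class file lives and extracts the JAR path.
    public static String findContainingJar(Class<?> cls) throws IOException {
        ClassLoader loader = cls.getClassLoader();
        if (loader == null) {
            return null; // bootstrap classes have no application class loader
        }
        String classFile = cls.getName().replace('.', '/') + ".class";
        Enumeration<URL> urls = loader.getResources(classFile);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            String jar = jarPathOf(url.getProtocol(), url.getPath());
            if (jar != null) {
                return jar;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String jar = findContainingJar(JarLocator.class);
        if (jar == null) {
            System.out.println("Loaded from loose class files: no JAR to ship to a cluster");
        } else {
            System.out.println("Loaded from JAR: " + jar);
        }
    }
}
```

If this reports that no JAR was found, a job submitted from the same classpath would have nothing to send to the cluster's task nodes, which matches the failure described in the question.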


