IOException "The system cannot find the file specified" in a WordCount MapReduce program

Posted: 2015-01-13 09:13:17

Tags: java mapreduce ioexception

I am fairly new to MapReduce and Hadoop. I am trying to write a word count MapReduce program and am running it locally using Eclipse. I have specified the path to the input file as well as the output directory. When I run the program, it throws an IOException: "The system cannot find the file specified."

My code looks like this:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WordCount {

    // Mapper: splits each comma-separated line into words and emits (word, 1).
    public static class Wordcountmapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            System.out.println("Line " + line);
            StringTokenizer token = new StringTokenizer(line, ",");
            while (token.hasMoreTokens()) {
                word.set(token.nextToken());
                output.collect(word, one);
                System.out.println("hii  " + word + " " + one);
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class wordcountreducer extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterator<IntWritable> value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            System.out.println("Inside Reducer" + value.hasNext());
            System.out.println("Key = " + key);
            while (value.hasNext()) {
                sum += value.next().get();
            }
            output.collect(key, new IntWritable(sum));
            System.out.println(key + " " + sum);
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(WordCount.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Wordcountmapper.class);
        conf.setReducerClass(wordcountreducer.class);
        FileInputFormat.addInputPath(conf, new Path("C:\\WordCount\\wordcount.txt"));
        // Note: Hadoop treats the output path as a directory that the job
        // creates (it must not already exist), not as a single output file.
        String outputfile = "C:\\WordCount\\Output\\a.txt";
        FileOutputFormat.setOutputPath(conf, new Path(outputfile));
        JobClient.runJob(conf);
    }
}

Am I missing something here? I am using Eclipse Juno with the Hadoop plugin to run the MapReduce program.

The error thrown is as follows:

15/01/22 17:54:43 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
15/01/22 17:54:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/01/22 17:54:43 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/01/22 17:54:43 INFO Configuration.deprecation: mapred.job.queue.name is deprecated. Instead, use mapreduce.job.queuename
15/01/22 17:54:43 ERROR security.UserGroupInformation: PriviledgedActionException as:Soorya S (auth:SIMPLE) cause:java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified.
Exception in thread "main" java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified.
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:471)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:384)
    at org.apache.hadoop.util.Shell.run(Shell.java:359)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:569)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:658)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:435)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:277)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:122)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:969)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:963)
    at java.security.AccessController.doPrivileged(AccessController.java:284)
    at javax.security.auth.Subject.doAs(Subject.java:573)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:963)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:937)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1375)
    at com.ibm.hadoop.WordCount.main(WordCount.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified.
    at java.lang.ProcessImpl.create(Native Method)
    at java.lang.ProcessImpl.<init>(ProcessImpl.java:92)
    at java.lang.ProcessImpl.start(ProcessImpl.java:41)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:464)
    ... 23 more
getting access token
  [getToken] got user access token

getting primary group
  [getPrimaryGroup] Got TokenPrimaryGroup info
  [getPrimaryGroup] primaryGroup: S-1-5-21-184784153-2138975554-913327727-513
getting supplementary groups
  [getGroups] Got TokenGroups info
  [getGroups] group 0: S-1-5-21-184784153-2138975554-913327727-513
  [getGroups] group 1: S-1-1-0
  [getGroups] group 2: S-1-5-114
  [getGroups] group 3: S-1-5-32-544
  [getGroups] group 4: S-1-5-32-545
  [getGroups] group 5: S-1-5-4
  [getGroups] group 6: S-1-2-1
  [getGroups] group 7: S-1-5-11
  [getGroups] group 8: S-1-5-15
  [getGroups] group 9: S-1-5-113
  [getGroups] group 10: S-1-5-5-0-576278
  [getGroups] group 11: S-1-2-0
  [getGroups] group 12: S-1-5-64-10
  [getGroups] group 13: S-1-16-8192

4 Answers:

Answer 0 (score: 0):

I had the same problem (I assume you are running on Windows). It happens because the system PATH variable needs to point to the 'chmod' executable. You can follow the steps described here: running hadoop on window-7 64 bit
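
A quick way to tell whether this is your problem is a minimal sketch like the one below (a diagnostic of my own, not from the linked guide). It performs the same external-process lookup that Hadoop's Shell class does, and it fails with the same "CreateProcess error=2" when no chmod executable is visible on the PATH the JVM inherited:

import java.io.IOException;

public class ChmodCheck {
    public static void main(String[] args) throws InterruptedException {
        try {
            // Hadoop shells out to chmod when it touches the local filesystem;
            // this reproduces that lookup via java.lang.ProcessBuilder.
            Process p = new ProcessBuilder("chmod", "--help").start();
            p.waitFor();
            System.out.println("chmod found on PATH");
        } catch (IOException e) {
            // On Windows without Cygwin/chmod this prints something like:
            // Cannot run program "chmod": CreateProcess error=2, ...
            System.out.println("chmod NOT found: " + e.getMessage());
        }
    }
}

If this prints the failure branch when run from Eclipse, the fix is to get chmod onto the PATH as described in the link.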

Answer 1 (score: 0):

Because Hadoop makes a system call to the Unix tool chmod, I ran into the following exception:

Exception in thread "main" java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:201)
        at org.apache.hadoop.util.Shell.run(Shell.java:183)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:376)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:462)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:445)
        at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:543)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:535)
        at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:336)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:400)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:610)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:591)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:498)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:490)
        at org.apache.hadoop.hbase.io.hfile.HFile$Writer.<init>(HFile.java:306)
        at org.expasy.jpl.io.util.JPLHMapSerializer.init(JPLHMapSerializer.java:125)

You can try the fix below.

Fix the dependency error: the solution is to install Cygwin on your Windows system, or only a subset of Cygwin, since just chmod and its DLLs are needed. The steps below cover the second option.

Step 1: Get the 'chmod' resources. Archives exist for the different Windows architectures:

- Windows 32-bit - contains chmod.exe, cygwin1.dll, cygiconv-2.dll, cygintl-8.dll and cyggcc_s-1.dll
- Windows 64-bit - not available yet

Step 2: Set the PATH in Windows. Don't forget to set the PATH variable for chmod in Windows, or chmod will not be found!

1. Right-click the 'My Computer' icon on your desktop and click 'Properties' (or just press the Windows key + Pause/Break).
2. In the window that opens, click the 'Advanced' tab.
3. Click 'Environment Variables'.
4. Under System variables, edit or create the PATH variable and enter the pathname of the cygwin-chmod directory.

You can confirm that the change took effect with the sketch after these steps.
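
Since Eclipse only sees the environment it was started with, it is worth printing the PATH the JVM actually inherited. A small sketch of my own, assuming a standard Windows setup:

import java.io.File;

public class PathCheck {
    public static void main(String[] args) {
        // Print each PATH entry the JVM inherited; the cygwin-chmod
        // directory must appear here, or Hadoop's chmod lookup still fails.
        String path = System.getenv("PATH");
        for (String entry : path.split(File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}

If the cygwin-chmod directory is missing from the output, restart Eclipse so it picks up the new value; environment changes do not propagate to processes that are already running.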

Answer 2 (score: 0):

Install Cygwin and add it to your PATH environment variable.

Answer 3 (score: 0):

To fix the error Exception in thread "main" java.io.IOException: Cannot run program "chmod" in a Java Hadoop program, try running Eclipse as administrator. That worked for me.