Question

I am trying to open a file and pass some parameters read from it to a MapReduce job. This code works in local mode, but it fails when I run it against HDFS.

Here is my code:

    Path tmpPath = new Path(tmpFile);
    try {
        InputStream ips = new FileInputStream(tmpFile);
        InputStreamReader ipsr = new InputStreamReader(ips);
        BufferedReader br = new BufferedReader(ipsr);

        String[] minMax = br.readLine().split("-");
        min = minMax[0];
        max = minMax[1];
        br.close();
    } catch (Exception e) {
        System.out.println(e.toString());
        System.exit(-1);
    }

This is the error I get:

"java.io.FileNotFoundException: hdfs:/quickstart.cloudera:8020/user/cloudera/dataOut/tmp/part-r-00000 (No such file or directory)"

This is where I write the file in the previous job:

    Path tmp = new Path("dataOut/tmp");
    FileOutputFormat.setOutputPath(job, tmp);

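Since this is a MapReduce job, it writes the file part-r-00000.

Probably all of you will say, "try using the distributed cache". I have already tried that, but I am new to Java, Hadoop, and MapReduce, and I could not get it to work...

Thanks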

Answer 1

Looking at your error: "java.io.FileNotFoundException: hdfs:/quickstart.cloudera:8020/user/cloudera/dataOut/tmp/part-r-00000 (No such file or directory)"

It seems your output file is not available at the given path. Try running the following command to check whether you can access the path.

    hadoop fs -text hdfs://quickstart.cloudera:8020/user/cloudera/dataOut/tmp/part-r-00000

Answer 2

I finally figured it out. I used this code:

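    // DEFAULT_FS holds the NameNode URI (presumably something like "hdfs://quickstart.cloudera:8020")
    Configuration conf = new Configuration();
    Path file = new Path(DEFAULT_FS + "/user/cloudera/dataOut/tmp/part-r-00000");
    // Resolve the file system from the path's URI instead of going through java.io
    FileSystem hdfs = FileSystem.get(file.toUri(), conf);
    // Size the buffer to the whole file, then read it in one call
    byte[] content = new byte[(int) hdfs.getFileStatus(file).getLen()];
    // try-with-resources closes the stream even if the read fails
    try (FSDataInputStream in = hdfs.open(file)) {
        in.readFully(content);
    }
    String maxMin = new String(content);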
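
A likely explanation for why the original snippet failed: java.io.FileInputStream only knows about the local file system, so it appears to treat the hdfs:// URI as an ordinary local file name, which does not exist. The stream returned by hdfs.open(file) is a regular InputStream, so the BufferedReader parsing from the question can be reused on it. The sketch below uses only the standard library so that it runs without a cluster. The class name MinMaxReader, the helper names, and the sample content "3-42" are made up for illustration, and a ByteArrayInputStream stands in for the HDFS stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class MinMaxReader {

    // Reads the first line of the stream and splits it on "-" into {min, max}.
    // In the Hadoop case the argument would be the stream from hdfs.open(file).
    static String[] parseMinMax(InputStream in) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line = br.readLine();
            if (line == null) {
                throw new IllegalArgumentException("input is empty");
            }
            String[] minMax = line.trim().split("-");
            if (minMax.length != 2) {
                throw new IllegalArgumentException("expected min-max, got: " + line);
            }
            return minMax;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if java.io can open the given string as a local file.
    static boolean opensAsLocalFile(String name) {
        try (InputStream ignored = new FileInputStream(name)) {
            return true;
        } catch (IOException e) {
            // FileNotFoundException is a subclass of IOException
            return false;
        }
    }

    public static void main(String[] args) {
        // java.io treats the URI as a plain local file name, so opening it fails
        String uri = "hdfs://quickstart.cloudera:8020/user/cloudera/dataOut/tmp/part-r-00000";
        System.out.println("opens as local file: " + opensAsLocalFile(uri));

        // Stand-in for the HDFS stream, with hypothetical file content "3-42"
        InputStream in = new ByteArrayInputStream("3-42\n".getBytes(StandardCharsets.UTF_8));
        String[] minMax = parseMinMax(in);
        System.out.println("min=" + minMax[0] + " max=" + minMax[1]);
    }
}
```

With Hadoop on the classpath, the call would presumably become parseMinMax(hdfs.open(file)). Note that splitting on "-" assumes neither value is negative.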

Hadoop MapReduce: unable to open a file to pass parameters
