Error reading a file with Hadoop's DistributedCache

Posted: 2019-10-17 11:50:11

Tags: java hadoop

I want to read a file in my Hadoop mapper using DistributedCache, and an error occurs at the line marked // ERROR!!! below:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheMapper extends Mapper<Object, Text, Text, IntWritable> {

    @Override
    public void setup(Context context) throws IOException, InterruptedException {
        try {
            org.apache.hadoop.fs.Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
            String filename = cacheFiles[0].toString(); // dictionary.txt
            java.nio.file.Path path = Paths.get(filename);
            List<String> list = Files.readAllLines(path); // ERROR!!!
            // ...
        } catch // ...
            // ...
    }

    @Override
    public void map( // ...
    // ...
}
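For reference, this is a minimal sketch of an alternative setup() that reads the cached file through Hadoop's own FileSystem API instead of java.nio; the class name CacheMapperSketch, the dict field, and the use of the newer context.getCacheFiles() accessor are assumptions for illustration, not part of my original code:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheMapperSketch extends Mapper<Object, Text, Text, IntWritable> {

    // Dictionary lines loaded once per mapper in setup()
    private final List<String> dict = new ArrayList<>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Non-deprecated accessor for the files registered with the cache
        URI[] cacheFiles = context.getCacheFiles();
        if (cacheFiles == null || cacheFiles.length == 0) {
            throw new IOException("No cache files found");
        }
        // Open the cached file (e.g. dictionary.txt) through the Hadoop FileSystem
        Path dictPath = new Path(cacheFiles[0]);
        FileSystem fs = FileSystem.get(context.getConfiguration());
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(fs.open(dictPath)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                dict.add(line);
            }
        }
    }
}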

In the main class:

String filename = "dictionary.txt";
DistributedCache.addCacheFile(new org.apache.hadoop.fs.Path(filename).toUri(), job.getConfiguration());
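For comparison, here is a minimal driver-side sketch using the non-deprecated Job.addCacheFile call; the class name CacheDriver and the job setup around it are illustrative assumptions:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CacheDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "cache example");
        job.setJarByClass(CacheDriver.class);
        job.setMapperClass(CacheMapper.class);
        // Register dictionary.txt before the job is submitted; the path must
        // match where "hadoop fs -put" placed the file.
        job.addCacheFile(new URI("dictionary.txt"));
        // ... input/output paths, reducer, output types ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}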

The file dictionary.txt is in the workspace and has also been put into HDFS:

hadoop fs -put dictionary.txt dictionary.txt
hadoop fs -ls
  

-rwxrwxrwx   1 root root        163 2019-10-17 07:35 dictionary.txt

0 answers:

No answers yet.