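Is the following Mapper code correct for reading a text file from HDFS? If it is, does the InputStreamReader need to be closed? And if so, how can that be done without closing the file system? My code is: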
Path pt=new Path("hdfs://pathTofile");
FileSystem fs = FileSystem.get(context.getConfiguration());
BufferedReader br=new BufferedReader(new InputStreamReader(fs.open(pt)));
String line;
line=br.readLine();
while (line != null){
System.out.println(line);
Answer 0 (score: 19)
This will work with a few modifications. I assume the code you pasted was just truncated:
Path pt = new Path("hdfs://pathTofile");
FileSystem fs = FileSystem.get(context.getConfiguration());
BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)));
try {
    String line;
    line = br.readLine();
    while (line != null) {
        System.out.println(line);
        // be sure to read the next line otherwise you'll get an infinite loop
        line = br.readLine();
    }
} finally {
    // you should close out the BufferedReader
    br.close();
}
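The same read loop can also be written with try-with-resources, which closes the reader automatically. The sketch below is only an illustration: the class name LineReaderSketch and the helper readLines are hypothetical, UTF-8 is an assumed encoding, and a ByteArrayInputStream stands in for the stream that fs.open(pt) would return, so that it runs without a cluster. Closing the reader closes the underlying stream only; the FileSystem object is never touched.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name does not come from the original post.
class LineReaderSketch {

    // Reads every line from the stream and returns them in order.
    // try-with-resources closes the reader (and the wrapped stream) on exit,
    // even if readLine() throws. In a Mapper, `in` would be fs.open(pt).
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) { // UTF-8 is an assumption
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an HDFS stream so the sketch runs locally.
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        for (String line : readLines(in)) {
            System.out.println(line);
        }
    }
}
```

In the Mapper itself, the call would look like readLines(fs.open(pt)), with fs and pt obtained as in the code above.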
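You can have multiple mappers read the same file, but using the distributed cache makes more sense here: it reduces the load on the datanodes that host the file's blocks, and it is also more efficient when you have more tasks than task nodes.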
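With the distributed cache, each task reads a copy of the file from the local disk of its node instead of opening it on HDFS. The sketch below covers only that local read, as it might be done once in a Mapper's setup method. It assumes the file has already been shipped through the distributed cache; how the file is registered, and the local name it ends up under, depend on the Hadoop version and job setup and are not shown. The class name CachedFileSketch, the helper loadLocalCopy, and the temporary file are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path; // java.nio Path, not org.apache.hadoop.fs.Path
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name does not come from the original post.
class CachedFileSketch {

    // Loads a node-local copy of the side file into memory, one entry per line.
    // Intended to be called once per task, not once per input record.
    static List<String> loadLocalCopy(Path localFile) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br =
                Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) { // UTF-8 is an assumption
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the localized cache file.
        Path tmp = Files.createTempFile("cached-side-file", ".txt");
        try {
            Files.write(tmp, Arrays.asList("alpha", "beta"), StandardCharsets.UTF_8);
            for (String line : loadLocalCopy(tmp)) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because the data is read from local disk once per task, the datanodes serve the file once per node rather than once per mapper.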
Answer 1 (score: 0)
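import java.io.{BufferedReader, InputStreamReader}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import scala.collection.mutable.ListBuffer

// Reads the file at `path` line by line and returns its lines in order.
def load_file(path: String): List[String] = {
  val pt = new Path(path)
  val fs = FileSystem.get(new Configuration())
  val br = new BufferedReader(new InputStreamReader(fs.open(pt)))
  // ListBuffer appends in constant time; `res :+ line` on a List copies the list on every append
  val res = ListBuffer[String]()
  try {
    var line = br.readLine()
    while (line != null) {
      println(line)
      res += line
      line = br.readLine()
    }
  } finally {
    // you should close out the BufferedReader
    br.close()
  }
  res.toList
}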