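How can I read, in the reduce task, the same input file that I read in the map task? (MapReduce)

Time: 2018-04-14 16:15:32

Tags: java hadoop mapreduce

I have been trying to read a file in the reducer that is also the input to the mapper. Is there any way to access that file in the reducer?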

2 answers:

Answer 0: (score: 0)

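Both Mapper and Reducer also provide protected void setup(Context context) throws IOException, InterruptedException {}. Extend the class and override this method; it runs once per task before any records are processed, so it is a good place to read your file first.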
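A minimal sketch of that idea follows, assuming the path names a single file that is small enough to hold in memory. The class name SideFileReducer, the property name side.file.path, and the helper readLines are illustrative choices, not part of the answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SideFileReducer extends Reducer<Text, Text, Text, Text> {
    // Hypothetical property name; the driver is assumed to set it to the file's path.
    static final String PATH_PROPERTY = "side.file.path";

    // Contents of the file, loaded once per reduce task.
    private List<String> lines = new ArrayList<>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        lines = readLines(conf, new Path(conf.get(PATH_PROPERTY)));
    }

    // Reads every line of a single file through the file system that owns the path.
    static List<String> readLines(Configuration conf, Path path) throws IOException {
        FileSystem fs = path.getFileSystem(conf);
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // The field "lines" is already populated here and can be consulted as needed.
        for (Text value : values) {
            context.write(key, value);
        }
    }
}
```

The same override works in a Mapper; loading in setup() avoids reopening the file on every map() or reduce() call.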

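By the way, you can also set a global value that is read in the mapper and used in the reducer. This has to be a property in the job Configuration rather than a static Java field, because map and reduce tasks run in separate JVMs. A better option is to add the file to the distributed cache with the job.addCacheArchive() method (or job.addCacheFile() for a plain file rather than an archive).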
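The distributed-cache route might look like the sketch below. It assumes the common behaviour that each cached file is linked into the task's working directory under its file name, or under the URI fragment when one is given; the class name CachedFileReducer and the helper localName are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class CachedFileReducer extends Reducer<Text, Text, Text, Text> {
    // Lines of all cached files, loaded once per reduce task.
    private final List<String> cachedLines = new ArrayList<>();

    // Assumed local link name of a cached file: the URI fragment if present
    // (e.g. "...#lookup"), otherwise the last component of the path.
    static String localName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        return fragment != null ? fragment : new Path(cacheUri.getPath()).getName();
    }

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        URI[] cacheFiles = context.getCacheFiles(); // null when nothing was registered
        if (cacheFiles == null) {
            return;
        }
        for (URI uri : cacheFiles) {
            // Read the localized copy from the task's working directory.
            cachedLines.addAll(
                    Files.readAllLines(Paths.get(localName(uri)), StandardCharsets.UTF_8));
        }
        // reduce() (inherited here as the identity) could now consult cachedLines.
    }
}
```

In the driver, the file would be registered before the job is submitted, for example with job.addCacheFile(new URI("/path/to/input.txt#lookup")), where both the path and the #lookup fragment are placeholders.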

Answer 1: (score: 0)

Driver class

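    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MapReduceDriver extends Configured implements Tool
    {
        public static void main(String[] args) throws Exception
        {
            int exitCode = ToolRunner.run(new Configuration(), new MapReduceDriver(), args);
            System.exit(exitCode);
        }

        @Override
        public int run(String[] args) throws Exception
        {
            Configuration conf = getConf();
            // Publish the file's path so every map and reduce task can look it up.
            conf.set("myPath", "/home/hdfs_path");
            .......
        }
    }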

Mapper class

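    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MapReduceMapper extends Mapper<LongWritable, Text, Text, Text>
    {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException
        {
            // Look up the path that the driver stored in the job configuration.
            Configuration conf = context.getConfiguration();
            String myPathStr = conf.get("myPath");
            Path myPath = new Path(myPathStr);
            //code to read from the Path
        }
    }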

Reducer class

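    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MapReduceReducer extends Reducer<Text, Text, Text, Text>
    {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException
        {
            // The same property is visible here, so the reducer sees the same path.
            Configuration conf = context.getConfiguration();
            String myPathStr = conf.get("myPath");
            Path myPath = new Path(myPathStr);
            //code to read from the path
        }
    }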
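The driver in this answer elides the job setup. One way it might be completed so that myPath points at the very file the mappers consume is sketched below. The class name SameInputDriver, the helper buildJob, the job name, and the use of args[0] and args[1] as input and output paths are assumptions; the identity Mapper and Reducer stand in for MapReduceMapper and MapReduceReducer so that the sketch compiles on its own.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class SameInputDriver extends Configured implements Tool {

    // Builds a job whose input path is also published under "myPath".
    static Job buildJob(Configuration conf, String input, String output) throws Exception {
        // Set the property BEFORE creating the Job: Job.getInstance copies the
        // configuration, so later changes to conf would not reach the tasks.
        conf.set("myPath", input);

        Job job = Job.getInstance(conf, "same-input-example");
        job.setJarByClass(SameInputDriver.class);
        // Identity classes as placeholders; swap in the real mapper and reducer.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(input));
        FileOutputFormat.setOutputPath(job, new Path(output));
        return job;
    }

    @Override
    public int run(String[] args) throws Exception {
        Job job = buildJob(getConf(), args[0], args[1]);
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new SameInputDriver(), args));
    }
}
```

With this arrangement, conf.get("myPath") in the reducer returns the same path that was registered as the mapper's input.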