DistributedCache in Hadoop MapReduce: NullPointerException

Time: 2015-10-26 12:14:33

Tags: hadoop mapreduce distributed-caching

I am using the distributed cache to store a file, but I am getting a NullPointerException. Can anyone help me solve this? The file name is passed in as an argument.

public class Placeholder extends Configured implements Tool  
{
    private static Path[] cal_filepath;
    private static String filename;
    private static String[] calendartablecolumn;
    private static BufferedReader br;
    private static String placeholder;
    private static String output;
    private static HashMap<String,String> startdate = new HashMap<String,String>();
    private static HashMap<String,String> enddate = new HashMap<String,String>();

    private static FileSplit fileSplit;


    public int run(String[] args) throws Exception {

        String placeholdername = args[3];
        String stratposition = args[4];
        String length = args[5];
        String category = args[6];
        String calendarid = args[7];
        System.out.println("Taking args");

        Job job = new Job(getConf(), "Placeholder");
        DistributedCache.addCacheFile(new Path(args[2]).toUri(), getConf());
        System.out.println("cache");

        job.setJarByClass(Placeholder.class);
        job.setMapperClass(PlaceholderMapper.class);

        Configuration conf=job.getConfiguration();
        conf.set("placeholdername", placeholdername);
        conf.set("stratposition", stratposition);
        conf.set("length", length);
        conf.set("category", category);
        conf.set("calendarid", calendarid);

        job.setNumReduceTasks(0);
        System.out.println("mapper setting");


        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;

    }

    public static void main(String[] args) throws Exception {
        System.out.println("main");
        int res = ToolRunner.run(new Configuration(), new Placeholder(), args);     
        System.exit(res);
    }

    public static  class PlaceholderMapper extends Mapper<LongWritable,Text,Text,Text>
    {


        protected void setup(Context context) throws IOException,InterruptedException 
        {
            Configuration config = context.getConfiguration();
            System.out.println("set up");

            try {

                cal_filepath = DistributedCache.getLocalCacheFiles(config);
                br = new BufferedReader(new FileReader(cal_filepath[0].toString()));
                String line = null;
                while ((line = br.readLine()) != null) 
                {
                    calendartablecolumn = line.split("\\,");
                    System.out.println(calendartablecolumn);
                    startdate.put(calendartablecolumn[0].trim(),calendartablecolumn[3].trim());
                    enddate.put(calendartablecolumn[0].trim(),calendartablecolumn[4].trim());
                }
            } catch (Exception e) {
                e.printStackTrace();

            }
        }
    }
}

The error log I get is:

java.lang.NullPointerException
    at com.nielsen.GRFE.processor.mapreduce.Placeholder$PlaceholderMapper.setup(Placeholder.java:100)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).   

log4j:WARN Please initialize the log4j system properly.   
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

1 answer:

Answer 0 (score: 0)

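Try replacing getConf() with job.getConfiguration() in the call DistributedCache.addCacheFile(new Path(args[2]).toUri(), getConf());. The Job copies the Configuration it is constructed with, so changes made afterwards through getConf() do not reach the job's configuration, and the mapper never sees the cache file.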
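
To illustrate why the order matters, here is a minimal sketch that uses only the Java standard library. MiniJob, CACHE_KEY and the two register methods are hypothetical names invented for this illustration. They are not part of Hadoop. They only model the assumption that a job takes a snapshot of the configuration passed to its constructor.

```java
import java.util.Properties;

// Simplified, hypothetical stand-in for the situation in the question.
// Nothing here is the real Hadoop API.
class ConfigCopyDemo {

    /** Models a job that snapshots the configuration it is constructed with. */
    static final class MiniJob {
        private final Properties conf = new Properties();

        MiniJob(Properties original) {
            // Copy every entry, so later edits to 'original' are invisible here.
            conf.putAll(original);
        }

        Properties getConfiguration() {
            return conf;
        }
    }

    // Hypothetical key name, chosen only for this sketch.
    static final String CACHE_KEY = "demo.cache.files";

    /** Registers the file on the caller's configuration after the job exists (the bug). */
    static String registerOnOriginal(String file) {
        Properties toolConf = new Properties();
        MiniJob job = new MiniJob(toolConf);
        toolConf.setProperty(CACHE_KEY, file); // too late: the job already copied toolConf
        return job.getConfiguration().getProperty(CACHE_KEY);
    }

    /** Registers the file on the job's own configuration (the suggested fix). */
    static String registerOnJob(String file) {
        Properties toolConf = new Properties();
        MiniJob job = new MiniJob(toolConf);
        job.getConfiguration().setProperty(CACHE_KEY, file);
        return job.getConfiguration().getProperty(CACHE_KEY);
    }

    public static void main(String[] args) {
        System.out.println("on original: " + registerOnOriginal("calendar.csv"));
        System.out.println("on job copy: " + registerOnJob("calendar.csv"));
    }
}
```

In this model the first call yields null and the second yields calendar.csv. If the real job behaves the same way, the mapper's setup would get no cache file list back, and indexing cal_filepath[0] would then plausibly be what throws the NullPointerException in the log. A null check on the returned array before reading it would make that failure easier to diagnose.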