Question

My program runs multiple MapReduce jobs, one for each set of parameters in the parameter file I pass to it.

The main function is as follows:

public static void main(String[] args) throws Exception {
        // Create configuration
        Configuration conf = new Configuration();

        if (args.length != 3) 
        {
            System.err.println("Usage: KnnPattern <in> <out> <parameter file>");
            System.exit(2);
        }

        //Reading argument using Hadoop API now
        conf.set ("params", (args[2]));
        String param = conf.get("params");
        StringTokenizer inputLine = new StringTokenizer(param, "|");

        int n = 1;
        while(inputLine.hasMoreTokens())
        {

            conf.set("passedVal", inputLine.nextToken());

            //Job Configuration here

            ++n;
        }
}

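The main function reads the third argument, which is the parameters file stored in HDFS, and passes one parameter string to each MapReduce job it runs. Or at least that is what I want it to do; I am not sure whether this is entirely correct.

The setup method of my Mapper looks like this: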

        protected void setup(Context context) throws IOException, InterruptedException
    {
            // Read parameter file using alias established in main()
            Configuration conf = context.getConfiguration();
            String knnParams = conf.get("passedVal");

            StringTokenizer st = new StringTokenizer(knnParams, ",");

            // Using the variables declared earlier, values are assigned to K and to the test dataset, S.
            // These values will remain unchanged throughout the mapper
            K = Integer.parseInt(st.nextToken());
            normalisedSAge = normalisedDouble(st.nextToken(), minAge, maxAge);
            normalisedSIncome = normalisedDouble(st.nextToken(), minIncome, maxIncome);
            sStatus = st.nextToken();
            sGender = st.nextToken();
            normalisedSChildren = normalisedDouble(st.nextToken(), minChildren, maxChildren);

    }

My parameter file contains the following:

67, 16668, Single, Male, 3 | 40, 25000, Single, Male, 2 | 67, 16668, Single, Male, 3

These are 3 sets of input, separated by '|'.
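
A side note on this format: if the file really has spaces around the separators, StringTokenizer keeps them in the tokens. The first token of the second set would then be " 40", and Integer.parseInt does not trim whitespace, so it would raise a NumberFormatException of its own. Below is a minimal sketch of whitespace-tolerant parsing in plain Java, with no Hadoop dependency. The class and method names are hypothetical and are not part of the original KnnPattern code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from KnnPattern.java.
class ParamLineSketch {

    // Splits "a, b | c, d" into one String[] per '|'-separated set,
    // trimming whitespace around every comma-separated field.
    static List<String[]> parse(String line) {
        List<String[]> sets = new ArrayList<>();
        for (String set : line.split("\\|")) {
            String trimmed = set.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty sets, e.g. from a trailing '|'
            }
            sets.add(trimmed.split("\\s*,\\s*"));
        }
        return sets;
    }

    public static void main(String[] args) {
        String line = "67, 16668, Single, Male, 3 | 40, 25000, Single, Male, 2 | 67, 16668, Single, Male, 3";
        for (String[] fields : parse(line)) {
            // Safe to parse: the field no longer carries leading or trailing spaces.
            int first = Integer.parseInt(fields[0]);
            System.out.println("first=" + first + ", fields=" + fields.length);
        }
    }
}
```

In the mapper, the same effect could be had by calling trim() on each token before handing it to Integer.parseInt.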

The runtime error I get is this:

Error: java.lang.NumberFormatException: For input string: "/KNN/PARAMS/paramFinal.txt"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.parseInt(Integer.java:615)
    at KnnPattern$KnnMapper.setup(KnnPattern.java:168)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster. Container killed on request. Exit code is 143. Container exited with a non-zero exit code 143

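As far as I can tell, this looks like a type conversion error (?), but I am not sure how or why it is happening.

I got most of this code from here: https://github.com/matt-hicks/MapReduce-KNN/blob/master/KnnPattern.java

It runs fine for a single set of parameters, but I need to run it for multiple sets of parameters (test cases) at once for my further application.

Is there a way to fix this, or at least an explanation of why I am getting this error? Any help is much appreciated. Thanks.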

Answer 1

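I figured out why I was getting the NumberFormatException.

The problem was that I was reading the third argument (args[2]) as a plain string rather than as the location of a file in HDFS. The path itself was stored in the configuration and tokenized, so the first token handed to Integer.parseInt was the path, which is why the error log shows:

For input string: "/KNN/PARAMS/paramFinal.txt"

What I am doing now, for testing purposes, is passing the input text directly as the third argument instead of a file location. That got rid of this particular error.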
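
To make the difference concrete, here is a rough sketch that contrasts tokenizing the path string with tokenizing the file's contents. It is only an illustration: it uses a local temporary file and java.nio.file as a stand-in for HDFS (on a cluster the file would instead be opened through Hadoop's FileSystem API), and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; a local file stands in for the HDFS parameter file.
class PathVersusContentsSketch {

    // Returns the text stored in the file. This is what should be tokenized,
    // not the path string that names the file.
    static String readParams(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mimics the mapper's first step: take the first token and try Integer.parseInt on it.
    static boolean firstTokenIsInt(String params) {
        String first = params.split("[,|]")[0].trim();
        try {
            Integer.parseInt(first);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("paramFinal", ".txt");
        Files.write(file, "67, 16668, Single, Male, 3 | 40, 25000, Single, Male, 2"
                .getBytes(StandardCharsets.UTF_8));

        // Tokenizing the path: the first token is the path itself, not a number.
        System.out.println("path parses as int:     " + firstTokenIsInt(file.toString()));
        // Tokenizing the contents: the first token is "67".
        System.out.println("contents parse as int:  " + firstTokenIsInt(readParams(file)));

        Files.delete(file);
    }
}
```

Reading the contents in main() and storing that text in the configuration, rather than the path, would keep the rest of the mapper unchanged.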

Hope this helps anyone who runs into this problem in the future. Cheers.
