MapReduce program to find the highest temperature

Time: 2018-09-19 16:46:41

Tags: hadoop mapreduce

I have written a MapReduce program, but the reducer is not working. Below is the code I wrote. Please let me know what is wrong with the program, because I am not getting any error message that explains it. Please help me.

Here is the data:

temp1.txt

1993 23
1991 25
1992 56
1991 78 

temp2.txt

1991 11
1993 24
1992 35

Mapper:

package p1;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.IntWritable;
import java.io.*;

public class mymaaper extends Mapper<LongWritable, Text, Text, IntWritable>
{
    public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException
    {
        String arr1[] = value.toString().split("\\s");
        String year = arr1[0];
        int temp = Integer.parseInt(arr1[1]);

        con.write(new Text(year), new IntWritable(temp));
        //con.write(new Text(year), new Text(year));

        System.out.println(year + "" + temp);
    }
}

Reducer:

package p1;

import java.io.*;

import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.io.Text;

import org.apache.hadoop.io.IntWritable;

public class myreducer extends Reducer<Text, IntWritable, Text, IntWritable>
{
    public myreducer()
    {
        System.out.println("myreducer().hashcode=" + hashCode());
    }

    public void reduce(Text key, Iterable<IntWritable> value, Context con) throws IOException, InterruptedException
    {
        System.out.println("reduce(-,-,-)");
        System.out.println("context=" + con);
        System.out.println("key=" + key);
        System.out.print("All values=");

        int maxvalue = Integer.MIN_VALUE;
        for (IntWritable sw : value)
        {
            maxvalue = Math.max(maxvalue, sw.get());
        }
        con.write(key, new IntWritable(maxvalue));
    }
}

Driver:

package p1;

import java.io.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class mydriver
{
    public static void main(String args[]) throws ClassNotFoundException, IOException, InterruptedException
    {
        Path input = new Path("hdfs://localhost:9000/input_temp/");
        Path output = new Path("hdfs://localhost:9000/output_temp/");

        Configuration conf = new Configuration();
        Job j1 = Job.getInstance(conf, "maxtemp");

        j1.setJarByClass(mydriver.class);
        j1.setMapperClass(mymaaper.class);
        j1.setReducerClass(myreducer.class);

        j1.setOutputKeyClass(Text.class);
        j1.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(j1, input);
        FileOutputFormat.setOutputPath(j1, output);

        output.getFileSystem(conf).delete(output, true);
        System.exit(j1.waitForCompletion(true) ? 0 : 1);
    }
}

Output:

2018-09-19 09:42:13,222 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(60)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-19 09:42:22,319 INFO  beanutils.FluentPropertyBeanIntrospector (FluentPropertyBeanIntrospector.java:introspect(147)) - Error when creating PropertyDescriptor for public final void org.apache.hadoop.shaded.org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring this property.
2018-09-19 09:42:22,864 INFO  impl.MetricsConfig (MetricsConfig.java:loadFirst(121)) - loaded properties from hadoop-metrics2.properties
2018-09-19 09:42:23,829 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(374)) - Scheduled Metric snapshot period at 0 second(s).
2018-09-19 09:42:23,834 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:start(191)) - JobTracker metrics system started
2018-09-19 09:42:26,003 WARN  mapreduce.JobResourceUploader (JobResourceUploader.java:uploadResourcesInternal(147)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2018-09-19 09:42:26,053 WARN  mapreduce.JobResourceUploader (JobResourceUploader.java:uploadJobJar(480)) - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2018-09-19 09:42:27,001 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(292)) - Total input files to process : 2
2018-09-19 09:42:27,512 INFO  mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(205)) - number of splits:2
2018-09-19 09:42:29,048 INFO  mapreduce.JobSubmitter (JobSubmitter.java:printTokens(301)) - Submitting tokens for job: job_local342787376_0001
2018-09-19 09:42:29,068 INFO  mapreduce.JobSubmitter (JobSubmitter.java:printTokens(302)) - Executing with tokens: []
2018-09-19 09:42:30,382 INFO  mapreduce.Job (Job.java:submit(1574)) - The url to track the job: http://localhost:8080/
2018-09-19 09:42:30,387 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1619)) - Running job: job_local342787376_0001
2018-09-19 09:42:30,408 INFO  mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(501)) - OutputCommitter set in config null
2018-09-19 09:42:30,469 INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(140)) - File Output Committer Algorithm version is 2
2018-09-19 09:42:30,478 INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(155)) - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2018-09-19 09:42:30,539 INFO  mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(519)) - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2018-09-19 09:42:31,402 INFO  mapred.LocalJobRunner (LocalJobRunner.java:runTasks(478)) - Waiting for map tasks
2018-09-19 09:42:31,416 INFO  mapred.LocalJobRunner (LocalJobRunner.java:run(252)) - Starting task: attempt_local342787376_0001_m_000000_0
2018-09-19 09:42:31,444 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1640)) - Job job_local342787376_0001 running in uber mode : false
2018-09-19 09:42:31,447 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1647)) -  map 0% reduce 0%
2018-09-19 09:42:31,768 INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(140)) - File Output Committer Algorithm version is 2
2018-09-19 09:42:31,778 INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(155)) - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2018-09-19 09:42:32,028 INFO  mapred.Task (Task.java:initialize(625)) -  Using ResourceCalculatorProcessTree : [ ]
2018-09-19 09:42:32,085 INFO  mapred.MapTask (MapTask.java:runNewMapper(768)) - Processing split: hdfs://localhost:9000/input_temp/temp1:0+41
2018-09-19 09:42:33,881 INFO  mapred.MapTask (MapTask.java:setEquator(1219)) - (EQUATOR) 0 kvi 26214396(104857584)
2018-09-19 09:42:33,888 INFO  mapred.MapTask (MapTask.java:init(1012)) - mapreduce.task.io.sort.mb: 100
2018-09-19 09:42:33,888 INFO  mapred.MapTask (MapTask.java:init(1013)) - soft limit at 83886080
2018-09-19 09:42:33,889 INFO  mapred.MapTask (MapTask.java:init(1014)) - bufstart = 0; bufvoid = 104857600
2018-09-19 09:42:33,890 INFO  mapred.MapTask (MapTask.java:init(1015)) - kvstart = 26214396; length = 6553600
2018-09-19 09:42:33,964 INFO  mapred.MapTask (MapTask.java:createSortingCollector(409)) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
199121
1992-5
199310
199152
1993-67
2018-09-19 09:42:35,960 INFO  mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(628)) - 
2018-09-19 09:42:35,992 INFO  mapred.MapTask (MapTask.java:flush(1476)) - Starting flush of map output
2018-09-19 09:42:36,001 INFO  mapred.MapTask (MapTask.java:flush(1498)) - Spilling map output
2018-09-19 09:42:36,001 INFO  mapred.MapTask (MapTask.java:flush(1499)) - bufstart = 0; bufend = 45; bufvoid = 104857600
2018-09-19 09:42:36,007 INFO  mapred.MapTask (MapTask.java:flush(1501)) - kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
2018-09-19 09:42:36,175 INFO  mapred.MapTask (MapTask.java:sortAndSpill(1696)) - Finished spill 0
2018-09-19 09:42:36,337 INFO  mapred.Task (Task.java:done(1232)) - Task:attempt_local342787376_0001_m_000000_0 is done. And is in the process of committing
2018-09-19 09:42:36,419 INFO  mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(628)) - map
2018-09-19 09:42:36,426 INFO  mapred.Task (Task.java:sendDone(1368)) - Task 'attempt_local342787376_0001_m_000000_0' done.
2018-09-19 09:42:36,571 INFO  mapred.Task (Task.java:done(1264)) - Final Counters for attempt_local342787376_0001_m_000000_0: Counters: 22
    File System Counters
        FILE: Number of bytes read=267
        FILE: Number of bytes written=495006
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=41
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=5
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Map-Reduce Framework
        Map input records=5
        Map output records=5
        Map output bytes=45
        Map output materialized bytes=61
        Input split bytes=103
        Combine input records=0
        Spilled Records=5
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=339
        Total committed heap usage (bytes)=167841792
    File Input Format Counters 
        Bytes Read=41
2018-09-19 09:42:36,578 INFO  mapred.LocalJobRunner (LocalJobRunner.java:run(277)) - Finishing task: attempt_local342787376_0001_m_000000_0
2018-09-19 09:42:36,581 INFO  mapred.LocalJobRunner (LocalJobRunner.java:run(252)) - Starting task: attempt_local342787376_0001_m_000001_0
2018-09-19 09:42:36,606 INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(140)) - File Output Committer Algorithm version is 2
2018-09-19 09:42:36,607 INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(155)) - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2018-09-19 09:42:36,609 INFO  mapred.Task (Task.java:initialize(625)) -  Using ResourceCalculatorProcessTree : [ ]
2018-09-19 09:42:36,644 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1647)) -  map 100% reduce 0%
2018-09-19 09:42:36,668 INFO  mapred.MapTask (MapTask.java:runNewMapper(768)) - Processing split: hdfs://localhost:9000/input_temp/temp2:0+33
2018-09-19 09:42:37,175 INFO  mapred.MapTask (MapTask.java:setEquator(1219)) - (EQUATOR) 0 kvi 26214396(104857584)
2018-09-19 09:42:37,180 INFO  mapred.MapTask (MapTask.java:init(1012)) - mapreduce.task.io.sort.mb: 100
2018-09-19 09:42:37,183 INFO  mapred.MapTask (MapTask.java:init(1013)) - soft limit at 83886080
2018-09-19 09:42:37,187 INFO  mapred.MapTask (MapTask.java:init(1014)) - bufstart = 0; bufvoid = 104857600
2018-09-19 09:42:37,187 INFO  mapred.MapTask (MapTask.java:init(1015)) - kvstart = 26214396; length = 6553600
2018-09-19 09:42:37,199 INFO  mapred.MapTask (MapTask.java:createSortingCollector(409)) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
199246
1993-9
199188
1992-2
2018-09-19 09:42:37,354 INFO  mapred.MapTask (MapTask.java:flush(1476)) - Starting flush of map output
2018-09-19 09:42:37,355 INFO  mapred.MapTask (MapTask.java:flush(1498)) - Spilling map output
2018-09-19 09:42:37,355 INFO  mapred.MapTask (MapTask.java:flush(1499)) - bufstart = 0; bufend = 36; bufvoid = 104857600
2018-09-19 09:42:37,355 INFO  mapred.MapTask (MapTask.java:flush(1501)) - kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
2018-09-19 09:42:37,419 INFO  mapred.MapTask (MapTask.java:sortAndSpill(1696)) - Finished spill 0
2018-09-19 09:42:37,480 INFO  mapred.LocalJobRunner (LocalJobRunner.java:runTasks(486)) - map task executor complete.
2018-09-19 09:42:37,498 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(590)) - job_local342787376_0001
java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at p1.mymaaper.map(mymaaper.java:16)
    at p1.mymaaper.map(mymaaper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2018-09-19 09:42:37,648 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1660)) - Job job_local342787376_0001 failed with state FAILED due to: NA
2018-09-19 09:42:37,786 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1665)) - Counters: 22
    File System Counters
        FILE: Number of bytes read=267
        FILE: Number of bytes written=495006
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=41
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=5
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Map-Reduce Framework
        Map input records=5
        Map output records=5
        Map output bytes=45
        Map output materialized bytes=61
        Input split bytes=103
        Combine input records=0
        Spilled Records=5
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=339
        Total committed heap usage (bytes)=167841792
    File Input Format Counters 
        Bytes Read=41

1 answer:

Answer 0 (score: 0):

You are getting an ArrayIndexOutOfBoundsException. I think there is a bad record in your input file (for example, a blank line or a line with no temperature value). Check the size of arr1 in the mapper before using it.
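
A minimal sketch of that check, assuming the rest of the mapper stays as you wrote it: trimming the line, splitting on "\\s+" (one or more whitespace characters), and skipping any record with fewer than two tokens lets the mapper pass over blank or malformed lines instead of throwing.

package p1;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class mymaaper extends Mapper<LongWritable, Text, Text, IntWritable>
{
    @Override
    public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException
    {
        // Split on one or more whitespace characters so repeated or trailing
        // spaces do not produce empty tokens.
        String[] arr1 = value.toString().trim().split("\\s+");

        // Guard against blank lines and records that are missing the
        // temperature column instead of throwing ArrayIndexOutOfBoundsException.
        if (arr1.length < 2) {
            return;
        }

        String year = arr1[0];
        int temp = Integer.parseInt(arr1[1]);

        con.write(new Text(year), new IntWritable(temp));
    }
}

With this guard in place, malformed lines are silently skipped; if you want to see how many bad records were dropped, you could also increment a job counter inside the length check.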