EMR MapReduce Error: Java heap space

Time: 2017-08-08 10:29:21

Tags: java hadoop amazon-s3 emr

I have basically read all the posts on SO about this, but I still have a memory problem. I use Airflow to instantiate my jobs on EMR, and everything worked fine until two days ago, when every job started running out of memory. The jobs read from an S3 bucket, perform some aggregations, and then save the result back to S3. I have two scheduled types of jobs, hourly and daily, to which I pass different settings, as shown below:

public static Configuration getJobConfiguration(String interval) {

    Configuration jobConfiguration = new Configuration();

    jobConfiguration.set("job.name", "JobImport Job");

    jobConfiguration.set("mr.mapreduce.framework.name", "yarn);

    jobConfiguration.set("fs.defaultFS", "hdfs://xxxx:8020");
    jobConfiguration.set("dfs.client.use.datanode.hostname", "true");
    jobConfiguration.set("dfs.client.block.write.replace-datanode-on-failure.policy", "ALWAYS");
    jobConfiguration.set("dfs.client.block.write.replace-datanode-on-failure.best-effort", "true");

    if (interval.equals("hourly")) {
        jobConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "1000000");
        jobConfiguration.set("mapreduce.reduce.memory.mb", "6144"); // max memory for reducer
        jobConfiguration.set("mapreduce.map.memory.mb", "4098");    // max memory for mapper
        jobConfiguration.set("mapreduce.reduce.java.opts", "-Xmx4098m"); // reducer java heap
        jobConfiguration.set("mapreduce.map.java.opts", "-Xmx3075m");    // mapper java heap
    }
    else {
        jobConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "20000000");
        jobConfiguration.set("mapreduce.reduce.memory.mb", "8192"); // max memory for reducer
        jobConfiguration.set("mapreduce.map.memory.mb", "6144");    // max memory for mapper
        jobConfiguration.set("mapreduce.reduce.java.opts", "-Xmx6144m"); // reducer java heap
        jobConfiguration.set("mapreduce.map.java.opts", "-Xmx4098m");    // mapper java heap
    }

    return jobConfiguration;
}
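
As an aside, the -Xmx flags above are meant to stay below their container sizes. A minimal sketch of how the two pairs could be kept in sync, where the helper name and the 0.8 ratio are my own convention rather than anything from Hadoop:

// Sketch: keep each task's JVM heap at roughly 80% of its YARN container,
// leaving headroom for stacks, metaspace and I/O buffers.
private static void setTaskMemory(Configuration conf, int mapContainerMb, int reduceContainerMb) {
    conf.set("mapreduce.map.memory.mb", String.valueOf(mapContainerMb));
    conf.set("mapreduce.reduce.memory.mb", String.valueOf(reduceContainerMb));
    conf.set("mapreduce.map.java.opts", "-Xmx" + (int) (mapContainerMb * 0.8) + "m");
    conf.set("mapreduce.reduce.java.opts", "-Xmx" + (int) (reduceContainerMb * 0.8) + "m");
}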

My mapper and reducer look like this:

private JsonParser parser = new JsonParser();
private Text apiKeyText = new Text();   // reused output key (the api_key)
private Text eventsText = new Text();   // reused output value (the events array as JSON)

@Override
public void map(LongWritable key, Text value, Context context) {

    String line = value.toString();
    String[] hourlyEvents = line.split("\n");

    JsonElement elem;
    JsonObject ev;

    try {
        for (String events : hourlyEvents) {

            elem = this.parser.parse(events);
            ev = elem.getAsJsonObject();

            // skip records that are missing one of the required fields
            if (!ev.has("api_key") || !ev.has("events")) {
                continue;
            }

            this.apiKeyText.set(ev.get("api_key").getAsString());
            this.eventsText.set(ev.getAsJsonArray("events").toString());

            context.write(this.apiKeyText, this.eventsText);
        }
    } catch (IOException | InterruptedException e) {
        logger.error(e.getMessage(), e);
    }
}

// ------------------
// Separate class
// ------------------

private JsonParser parser = new JsonParser();
private Text events = new Text();   // reused output value

@Override
public void reduce(Text key, Iterable<Text> values, Context context) {
    try {

        JsonObject obj = new JsonObject();
        JsonArray dailyEvents = new JsonArray();

        // merge all hourly events arrays of this api_key into one array
        for (Text eventsTmp : values) {
            JsonArray tmp = this.parser.parse(eventsTmp.toString()).getAsJsonArray();
            for (JsonElement ev : tmp) {
                dailyEvents.add(ev);
            }
        }

        obj.addProperty("api_key", key.toString());
        obj.add("events", dailyEvents);

        this.events.set(obj.toString());

        context.write(NullWritable.get(), this.events);
    } catch (IOException | InterruptedException e) {
        logger.error(e.getMessage(), e);
    }
}
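
Since the counters below show roughly 20 MB per map output record (497227681 bytes across 24 records), this reduce step materializes every event of an api_key as a Gson tree before writing. A minimal sketch of a lower-memory variant that splices the incoming arrays at the string level instead of parsing them, assuming each value is a well-formed JSON array and the api_key needs no JSON escaping:

private final Text out = new Text();   // reused output value

@Override
public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {

    // "[a,b]" + "[c]" -> "[a,b,c]" without building the Gson object graph;
    // the merged string is still buffered, but it is several times smaller
    // than the equivalent tree of JsonElement instances.
    StringBuilder dailyEvents = new StringBuilder("[");

    for (Text eventsTmp : values) {
        String arr = eventsTmp.toString().trim();
        if (arr.length() <= 2) {                       // "[]": nothing to merge
            continue;
        }
        if (dailyEvents.length() > 1) {
            dailyEvents.append(',');
        }
        dailyEvents.append(arr, 1, arr.length() - 1);  // drop the outer brackets
    }
    dailyEvents.append(']');

    out.set("{\"api_key\":\"" + key.toString() + "\",\"events\":" + dailyEvents + "}");
    context.write(NullWritable.get(), out);
}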

This is the counter dump after the MapReduce job:

INFO mapreduce.Job: Counters: 56
    File System Counters
        FILE: Number of bytes read=40
        FILE: Number of bytes written=69703431
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=3250
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=58
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4
        S3N: Number of bytes read=501370932
        S3N: Number of bytes written=0
        S3N: Number of read operations=0
        S3N: Number of large read operations=0
        S3N: Number of write operations=0
    Job Counters
        Failed reduce tasks=4
        Killed reduce tasks=1
        Launched map tasks=26
        Launched reduce tasks=6
        Data-local map tasks=26
        Total time spent by all maps in occupied slots (ms)=35841984
        Total time spent by all reduces in occupied slots (ms)=93264640
        Total time spent by all map tasks (ms)=186677
        Total time spent by all reduce tasks (ms)=364315
        Total vcore-milliseconds taken by all map tasks=186677
        Total vcore-milliseconds taken by all reduce tasks=364315
        Total megabyte-milliseconds taken by all map tasks=1146943488
        Total megabyte-milliseconds taken by all reduce tasks=2984468480
    Map-Reduce Framework
        Map input records=24
        Map output records=24
        Map output bytes=497227681
        Map output materialized bytes=66055825
        Input split bytes=3250
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=832
        Reduce input records=0
        Reduce output records=0
        Spilled Records=24
        Shuffled Maps =52
        Failed Shuffles=0
        Merged Map outputs=52
        GC time elapsed (ms)=23254
        CPU time spent (ms)=274180
        Physical memory (bytes) snapshot=25311526912
        Virtual memory (bytes) snapshot=183834742784
        Total committed heap usage (bytes)=27816099840
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=501370932
    File Output Format Counters
        Bytes Written=0

The cluster I am using is emr-5.2.0 with 2 nodes, where each node is an m3.xlarge instance. I run the jar from an EMR step using yarn jar ... (each step is instantiated from Airflow).
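
For completeness, the driver wiring is the standard Hadoop pattern; a minimal sketch, where the class names and S3 paths are placeholders rather than the actual job:

// Minimal driver sketch (placeholder names); imports assumed:
// org.apache.hadoop.mapreduce.Job, org.apache.hadoop.fs.Path,
// org.apache.hadoop.mapreduce.lib.input.FileInputFormat,
// org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
public static void main(String[] args) throws Exception {
    Configuration conf = getJobConfiguration("hourly");
    Job job = Job.getInstance(conf, conf.get("job.name"));
    job.setJarByClass(EventsImportJob.class);      // placeholder class
    job.setMapperClass(EventsMapper.class);        // placeholder class
    job.setReducerClass(EventsReducer.class);      // placeholder class
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path("s3://my-bucket/events/in/"));    // placeholder path
    FileOutputFormat.setOutputPath(job, new Path("s3://my-bucket/events/out/")); // placeholder path
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}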

Apart from those parameters in the configuration I do not change anything else, so the defaults are used. How can I fix the error: Java heap space?

0 Answers:

No answers