Hadoop 2 MapReduce job hangs after submission

Time: 2014-09-16 04:37:51

Tags: hadoop amazon-ec2 mapreduce yarn

I am trying to run hadoop distcp on an EC2 cluster, but the job hangs after it is submitted. Does anyone know what causes this? Thanks.

"2014-09-16 03:04:09,386 INFO  service.AbstractService (AbstractService.java:init(81)) - Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
2014-09-16 03:04:09,502 INFO  service.AbstractService (AbstractService.java:start(94)) - Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
2014-09-16 03:04:10,557 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06' - Unexpected response code 404, expected 200
2014-09-16 03:04:10,575 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06_%24folder%24' - Unexpected response code 404, expected 200
2014-09-16 03:04:10,797 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06' - Unexpected response code 404, expected 200
2014-09-16 03:04:10,955 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06_%24folder%24' - Unexpected response code 404, expected 200
2014-09-16 03:04:11,319 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06' - Unexpected response code 404, expected 200
2014-09-16 03:04:11,337 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06_%24folder%24' - Unexpected response code 404, expected 200
2014-09-16 03:04:11,395 WARN  httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks%2F1_12%2F883%2F2014%2F06' - Unexpected response code 404, expected 200
2014-09-16 03:04:13,265 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
2014-09-16 03:04:13,265 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
2014-09-16 03:04:14,285 INFO  service.AbstractService (AbstractService.java:init(81)) - Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
2014-09-16 03:04:14,285 INFO  service.AbstractService (AbstractService.java:start(94)) - Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
2014-09-16 03:04:15,412 INFO  mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(368)) - number of splits:21
2014-09-16 03:04:16,114 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2014-09-16 03:04:16,116 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2014-09-16 03:04:16,116 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2014-09-16 03:04:16,117 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
2014-09-16 03:04:16,117 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
2014-09-16 03:04:16,117 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.job.name is deprecated. Instead, use mapreduce.job.name
2014-09-16 03:04:16,117 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
2014-09-16 03:04:16,118 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2014-09-16 03:04:16,118 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
2014-09-16 03:04:16,118 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2014-09-16 03:04:16,119 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
2014-09-16 03:04:16,119 WARN  conf.Configuration (Configuration.java:warnOnceIfDeprecated(824)) - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
2014-09-16 03:04:16,587 INFO  mapreduce.JobSubmitter (JobSubmitter.java:printTokens(438)) - Submitting tokens for job: job_1410832828185_0009
2014-09-16 03:04:17,592 INFO  client.YarnClientImpl (YarnClientImpl.java:submitApplication(124)) - Submitted application application_1410832828185_0009 to ResourceManager at /10.120.109.238:8032
2014-09-16 03:04:17,632 INFO  mapreduce.Job (Job.java:submit(1222)) - The url to track the job: http://ip-10-120-109-238.ec2.internal:8088/proxy/application_1410832828185_0009/
2014-09-16 03:04:17,632 INFO  tools.DistCp (DistCp.java:execute(164)) - DistCp job-id: job_1410832828185_0009
2014-09-16 03:04:17,633 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1267)) - Running job: job_1410832828185_0009"

My yarn-site.xml: {

<?xml version="1.0"?>
<configuration>
  <property><name>yarn.nodemanager.container-executor.class</name><value>org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</value></property>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>64000</value></property>
   <property><name>yarn.scheduler.minimum-allocation-mb</name><value>2048</value></property>  
  <property><name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address</name><value>10.120.109.238:8031</value></property>
  <property><name>yarn.resourcemanager.scheduler.address</name><value>10.120.109.238:8030</value></property>
  <property><name>yarn.resourcemanager.address</name><value>10.120.109.238:8032</value></property>
   <property><name>yarn.resourcemanager.hostname</name><value>ec2-54-234-24-96.compute-1.amazonaws.com</value></property>
 </configuration>}

mapred-site.xml:
{<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>

  <property>
      <name>fs.default.name</name>
      <value>hdfs://ec2-54-234-24-96.compute-1.amazonaws.com:9000</value>
  </property>

  <property>
    <name>mapred.job.tracker</name>
    <value>ec2-54-234-24-96.compute-1.amazonaws.com:9001</value>
  </property>
  <property>
      <name>mapred.map.tasks</name>
      <value>4</value>
  </property>

  <property>
      <name>mapred.reduce.tasks</name>
      <value>4</value>
  </property>

  <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>4</value>
  </property>

  <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>4</value>
  </property>
  <property><name>mapred.output.committer.class</name><value>org.apache.hadoop.mapred.DirectFileOutputCommitter</value></property>
  <property><name>mapreduce.reduce.java.opts</name><value>-Xmx6144m</value></property>
  <property><name>mapreduce.map.java.opts</name><value>-Xmx3072m</value></property>
  <property><name>mapreduce.reduce.shuffle.parallelcopies</name><value>32</value></property>
  <property><name>mapreduce.map.memory.mb</name><value>4096</value></property>
  <property><name>mapreduce.map.memory.mb</name><value>8192</value></property>
</configuration>}

core-site.xml:

{<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/ephemeral-hdfs</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://ec2-54-234-24-96.compute-1.amazonaws.com:9000</value>
  </property>

  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>

  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>false</value>
  </property>

  <property>
    <name>dfs.client.read.shortcircuit.skip.checksum</name>
    <value>false</value>
  </property>

  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hadoop-hdfs/dn._PORT</value>
  </property>

  <property>
    <name>dfs.client.file-block-storage-locations.timeout</name>
    <value>3000</value>
  </property>

  <property>
    <name>fs.tachyon.impl</name>
    <value>tachyon.hadoop.TFS</value>
  </property>

</configuration>}

1 answer:

Answer 0 (score: 0)

It is a bit hard to tell from the information provided, but I do see this in your output:

    Response '/olap_log/log/prod/s3_tracking_log_csv/stat_clicks/1_12/883/2014/06' - Unexpected response code 404, expected 200

The job got a 404 (Not Found) error when it tried to fetch that file, which I assume is your input file. Maybe that is the problem?
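The paths in the log are percent-encoded, so it can help to decode them before checking the bucket. A minimal Python sketch, assuming the warning lines keep the format shown in the question; `decode_s3_path` is a hypothetical helper name, not part of Hadoop:

```python
import re
from urllib.parse import unquote

# Pattern for the RestS3Service warning lines; the format is assumed
# from the log excerpt in the question.
RESPONSE_RE = re.compile(r"Response '([^']+)' - Unexpected response code (\d+)")

def decode_s3_path(log_line):
    """Return (decoded_path, status_code), or None if the line does not match."""
    match = RESPONSE_RE.search(log_line)
    if match is None:
        return None
    return unquote(match.group(1)), int(match.group(2))

# One of the warning lines from the question, split only for readability.
line = (
    "2014-09-16 03:04:10,575 WARN  httpclient.RestS3Service "
    "(RestS3Service.java:performRequest(393)) - Response "
    "'/olap_log%2Flog%2Fprod%2Fs3_tracking_log_csv%2Fstat_clicks"
    "%2F1_12%2F883%2F2014%2F06_%24folder%24' "
    "- Unexpected response code 404, expected 200"
)

print(decode_s3_path(line))
```

Decoding shows that the log alternates between two keys: the plain path ending in `2014/06` and the same path with a `_$folder$` suffix.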

If you look at the job tracking URL (the one printed in your log), is there any other information there?
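Besides the web page, the ResourceManager also reports application state through its REST API at `/ws/v1/cluster/apps/{appid}`. A rough Python sketch, assuming the web port 8088 from the tracking URL is reachable from where the script runs; `app_status_url`, `summarize_app` and `fetch_app_summary` are hypothetical helper names:

```python
import json
from urllib.request import urlopen

def app_status_url(rm_host, app_id, port=8088):
    """Build the ResourceManager REST URL for a single application."""
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps/{app_id}"

def summarize_app(payload):
    """Pick the fields most useful for a stuck job out of the parsed JSON."""
    app = payload.get("app", {})
    return {key: app.get(key) for key in ("state", "finalStatus", "diagnostics")}

def fetch_app_summary(rm_host, app_id, port=8088, timeout=10):
    """Query the ResourceManager; needs network access to the cluster."""
    with urlopen(app_status_url(rm_host, app_id, port), timeout=timeout) as resp:
        return summarize_app(json.load(resp))

if __name__ == "__main__":
    # Host and application id are taken from the log in the question.
    print(app_status_url("ip-10-120-109-238.ec2.internal",
                         "application_1410832828185_0009"))
```

If the reported `state` stays at `ACCEPTED`, the application has been queued but its ApplicationMaster has not been given a container yet, which would point at scheduling or resources rather than at the input path.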