Hadoop job stuck at map 100% reduce 0%

Time: 2017-10-12 05:01:49

Tags: java hadoop nutch

I am using Hadoop 2.7.2 and Nutch 1.12. When I run a Nutch job on Hadoop, I get the following error during the Nutch parse phase.

17/10/03 14:01:52 INFO mapreduce.Job: Running job: job_1506573729189_0223
17/10/03 14:02:05 INFO mapreduce.Job: Job job_1506573729189_0223 running in uber mode : false
17/10/03 14:02:05 INFO mapreduce.Job:  map 0% reduce 0%
17/10/03 14:02:15 INFO mapreduce.Job:  map 1% reduce 0%
17/10/03 14:02:18 INFO mapreduce.Job:  map 2% reduce 0%
17/10/03 14:02:21 INFO mapreduce.Job:  map 3% reduce 0%
17/10/03 14:02:24 INFO mapreduce.Job:  map 4% reduce 0%
17/10/03 14:02:27 INFO mapreduce.Job:  map 8% reduce 0%
17/10/03 14:02:30 INFO mapreduce.Job:  map 12% reduce 0%
17/10/03 14:03:35 INFO mapreduce.Job: Task Id : attempt_1506573729189_0223_m_000000_0, Status : FAILED
Error: Java heap space
17/10/03 14:03:36 INFO mapreduce.Job:  map 11% reduce 0%
17/10/03 14:03:46 INFO mapreduce.Job:  map 14% reduce 0%
17/10/03 14:04:48 INFO mapreduce.Job: Task Id : attempt_1506573729189_0223_m_000000_1, Status : FAILED
Error: Java heap space

To get rid of the above error, I added the following property to Hadoop's mapred-site.xml.

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx4096m</value>
</property>

After adding this property, I get a new error, shown below.

17/10/04 11:06:15 INFO mapreduce.Job: Running job: job_1507094901386_0004
17/10/04 11:06:26 INFO mapreduce.Job: Job job_1507094901386_0004 running in uber mode : false
17/10/04 11:06:26 INFO mapreduce.Job:  map 0% reduce 0%
17/10/04 11:06:28 INFO mapreduce.Job: Task Id : attempt_1507094901386_0004_m_000000_0, Status : FAILED
Container [pid=8299,containerID=container_1507094901386_0004_01_000002] is running beyond virtual memory limits. Current usage: 97.6 MB of 1 GB physical memory used; 5.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1507094901386_0004_01_000002 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 8299 8297 8299 8299 (bash) 0 0 17092608 707 /bin/bash -c /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx4096m -Djava.io.tmpdir=/tmp/hadoop-hduser/nm-local-dir/usercache/hduser/appcache/application_1507094901386_0004/container_1507094901386_0004_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/usr/local/hadoop/logs/userlogs/application_1507094901386_0004/container_1507094901386_0004_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.63 33402 attempt_1507094901386_0004_m_000000_0 2 1>/usr/local/hadoop/logs/userlogs/application_1507094901386_0004/container_1507094901386_0004_01_000002/stdout 2>/usr/local/hadoop/logs/userlogs/application_1507094901386_0004/container_1507094901386_0004_01_000002/stderr  
    |- 8303 8299 8299 8299 (java) 73 4 6100692992 24291 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx4096m -Djava.io.tmpdir=/tmp/hadoop-hduser/nm-local-dir/usercache/hduser/appcache/application_1507094901386_0004/container_1507094901386_0004_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/usr/local/hadoop/logs/userlogs/application_1507094901386_0004/container_1507094901386_0004_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.63 33402 attempt_1507094901386_0004_m_000000_0 2 
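
For context, the limits in this message can be read as follows. The map container was sized at 1 GB of physical memory (the default for mapreduce.map.memory.mb is 1024), and YARN derives the virtual memory limit by multiplying that size by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1, hence "2.1 GB virtual memory". A JVM started with -Xmx4096m reserved about 5.7 GB of virtual address space, so the NodeManager killed the container even though only 97.6 MB of physical memory was in use. Besides enlarging the container, the check itself can be tuned in yarn-site.xml. The fragment below is only a sketch of those two knobs; the values are illustrative assumptions, not settings taken from this cluster, and normally only one of the two would be changed.

```xml
<!-- yarn-site.xml (sketch; values are assumptions for illustration) -->

<!-- Option A: allow more virtual memory per MB of container memory.
     With a 1024 MB container, a ratio of 6 gives a 6144 MB virtual limit. -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>6</value>
</property>

<!-- Option B: turn off the virtual memory check entirely.
     The physical memory check stays active. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```

Either change requires restarting the NodeManagers to take effect.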

These errors went away after I set the following properties in mapred-site.xml. I also removed the 'mapred.child.java.opts' property shown above.

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>8192</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3072m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx6144m</value>
</property>

But even with these settings, the job gets stuck at the last line below.

17/10/11 17:00:32 INFO mapreduce.Job: Running job: job_1507721357521_0001
17/10/11 17:00:56 INFO mapreduce.Job: Job job_1507721357521_0001 running in uber mode : false
17/10/11 17:00:56 INFO mapreduce.Job:  map 0% reduce 0%
17/10/11 17:01:08 INFO mapreduce.Job:  map 100% reduce 0%
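
One thing that may be worth checking here, although the log above does not confirm it: with mapreduce.reduce.memory.mb set to 8192, each reducer asks YARN for an 8 GB container. In Hadoop 2.7, yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb both default to 8192 MB, and the MapReduce ApplicationMaster already holds part of a node's memory. On a single-node or small cluster the reducer request could therefore stay pending, which would look like a job sitting at reduce 0%. The fragment below is a sketch of the yarn-site.xml settings involved; the value 12288 is a hypothetical figure that assumes the node actually has that much RAM to give to containers.

```xml
<!-- yarn-site.xml (sketch; 12288 is an assumed value, adjust to real RAM) -->

<!-- Total memory a NodeManager may hand out to containers on this node. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>12288</value>
</property>

<!-- Largest single container the scheduler will grant;
     it must be at least as large as mapreduce.reduce.memory.mb. -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>12288</value>
</property>
```

If the node cannot spare that much memory, lowering mapreduce.reduce.memory.mb (and mapreduce.reduce.java.opts along with it) is the alternative. The ResourceManager web UI shows whether the application has container requests pending.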

0 answers:

No answers