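I would appreciate any pointers.
I am having trouble running a word-count MapReduce job on Amazon EMR as a Spark step. However, I can ssh into the master node and run the same word-count logic in spark-shell without any problem.
The step complains that __spark_conf_xx.zip does not exist on the master's HDFS, even though the copy itself reported no error: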
16/04/05 07:20:21 INFO yarn.Client: Uploading resource file:/mnt/tmp/spark-1d701ab0-7990-4ca2-bee2-099aed8e8e6b/__spark_conf__9006968814682693730.zip -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
The full log is as follows:
16/04/05 07:20:16 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-26-247.ap-northeast-1.compute.internal/172.31.26.247:8032
16/04/05 07:20:16 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/04/05 07:20:16 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
16/04/05 07:20:16 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/05 07:20:16 INFO yarn.Client: Setting up container launch context for our AM
16/04/05 07:20:16 INFO yarn.Client: Setting up the launch environment for our AM container
16/04/05 07:20:16 INFO yarn.Client: Preparing resources for our AM container
16/04/05 07:20:17 INFO yarn.Client: Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.6.1-hadoop2.7.2-amzn-0.jar -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/spark-assembly-1.6.1-hadoop2.7.2-amzn-0.jar
16/04/05 07:20:18 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1459839695291
16/04/05 07:20:18 INFO metrics.MetricsSaver: Created MetricsSaver j-3AZL0AH5ALBBL:i-96753119:SparkSubmit:11699 period:60 /mnt/var/em/raw/i-96753119_20160405_SparkSubmit_11699_raw.bin
16/04/05 07:20:19 INFO metrics.MetricsSaver: 1 aggregated HDFSWriteDelay 2327 raw values into 1 aggregated values, total 1
16/04/05 07:20:20 INFO fs.EmrFileSystem: Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation
16/04/05 07:20:20 INFO yarn.Client: Uploading resource s3://gda-test/logic/wordCount.jar -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/wordCount.jar
16/04/05 07:20:20 INFO s3n.S3NativeFileSystem: Opening 's3://gda-test/logic/wordCount.jar' for reading
16/04/05 07:20:20 INFO metrics.MetricsSaver: Thread 1 created MetricsLockFreeSaver 1
16/04/05 07:20:21 INFO metrics.MetricsSaver: 1 MetricsLockFreeSaver 1 comitted 33 matured S3ReadDelay values
16/04/05 07:20:21 INFO yarn.Client: Uploading resource file:/mnt/tmp/spark-1d701ab0-7990-4ca2-bee2-099aed8e8e6b/__spark_conf__9006968814682693730.zip -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
16/04/05 07:20:21 INFO spark.SecurityManager: Changing view acls to: hadoop
16/04/05 07:20:21 INFO spark.SecurityManager: Changing modify acls to: hadoop
16/04/05 07:20:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
16/04/05 07:20:21 INFO yarn.Client: Submitting application 1 to ResourceManager
16/04/05 07:20:21 INFO impl.YarnClientImpl: Submitted application application_1459839685827_0001
16/04/05 07:20:22 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:22 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1459840821323
final status: UNDEFINED
tracking URL: http://ip-172-31-26-247.ap-northeast-1.compute.internal:20888/proxy/application_1459839685827_0001/
user: hadoop
16/04/05 07:20:23 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:24 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:25 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:26 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:27 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:28 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:29 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:30 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:31 INFO yarn.Client: Application report for application_1459839685827_0001 (state: FAILED)
16/04/05 07:20:31 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1459839685827_0001 failed 2 times due to AM Container for appattempt_1459839685827_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://ip-172-31-26-247.ap-northeast-1.compute.internal:8088/cluster/app/application_1459839685827_0001Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
java.io.FileNotFoundException: File does not exist: hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1459840821323
final status: FAILED
tracking URL: http://ip-172-31-26-247.ap-northeast-1.compute.internal:8088/cluster/app/application_1459839685827_0001
user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1459839685827_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/05 07:20:31 INFO util.ShutdownHookManager: Shutdown hook called
16/04/05 07:20:31 INFO util.ShutdownHookManager: Deleting directory /mnt/tmp/spark-1d701ab0-7990-4ca2-bee2-099aed8e8e6b
Command exiting with ret '1'
Answer 0 (score: 5)
I found the solution.
It was caused by a Java version mismatch: my logic and jar were built with Java 8, while the EMR cluster uses Java 7 by default.
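One way to check for this kind of mismatch before submitting the step is to look at the class-file version inside the jar. The sketch below is only an illustration and was not part of my original debugging: the helper name `class_major_versions` is hypothetical. It relies on the class-file header layout (magic number `0xCAFEBABE`, then minor and major version), where major version 51 corresponds to Java 7 and 52 to Java 8.

```python
import struct
import zipfile


def class_major_versions(jar_path):
    """Return the set of class-file major versions found in a jar.

    Major version 51 means the class targets Java 7, 52 means Java 8.
    """
    versions = set()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue
            header = jar.read(name)[:8]
            if len(header) < 8:
                continue  # too short to be a valid class file
            # Header layout: 4-byte magic, 2-byte minor, 2-byte major (big-endian)
            magic, _minor, major = struct.unpack(">IHH", header)
            if magic == 0xCAFEBABE:
                versions.add(major)
    return versions
```

Calling `class_major_versions("wordCount.jar")` on a local copy of the jar and getting 52 back, while `java -version` on the cluster nodes reports 1.7, would point to the same mismatch.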
To make Spark & Hadoop run on Java 8, I needed to customize the environment through the Advanced Options when creating the cluster. http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-configure-apps.html#configuring-java8
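As a rough sketch of what that customization looks like, the script below prints a configuration in the EMR "Classification" format that exports `JAVA_HOME` for both `hadoop-env` and `spark-env`. The path `/usr/lib/jvm/java-1.8.0` and the helper names are assumptions for illustration; check the linked guide and your own nodes for the actual Java 8 location.

```python
import json

# Assumption: where Java 8 lives on the EMR nodes; verify this on your cluster.
JAVA8_HOME = "/usr/lib/jvm/java-1.8.0"


def java8_env(classification, java_home=JAVA8_HOME):
    """Build one EMR configuration object that exports JAVA_HOME."""
    return {
        "Classification": classification,
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {"JAVA_HOME": java_home},
            }
        ],
    }


def build_configurations():
    """Configuration list covering both the Hadoop and Spark environments."""
    return [java8_env("hadoop-env"), java8_env("spark-env")]


if __name__ == "__main__":
    # Print JSON that can be pasted into the software settings box
    # under Advanced Options, or saved to a file.
    print(json.dumps(build_configurations(), indent=2))
```

The printed JSON can be pasted into the configuration box in the Advanced Options, or saved to a file and passed to `aws emr create-cluster` with `--configurations file://...`.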
I hope this information is useful to anyone facing the same problem.