I followed the tutorial in the AWS Lake Formation documentation (https://docs.aws.amazon.com/lake-formation/latest/dg/getting-started-tutorial.html), but I cannot get it to run successfully: it always fails at the same step.
In my attempts the workflow always fails on the "ETL job for workflow lakeformationjdbctest". The preceding nodes (pre_crawl, pre_crawl_trigger, discoverer, post_crawl_trigger, post_crawl and etl_trigger) all complete successfully.
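For reference, here is a minimal sketch (not part of the tutorial) of how the per-node status of the latest workflow run can be inspected with boto3 instead of clicking through the console. It assumes the workflow is named lakeformationjdbctest as in the tutorial and that AWS credentials/region are already configured:

import boto3

glue = boto3.client("glue")
workflow_name = "lakeformationjdbctest"  # assumed: the workflow name used in the tutorial

# Pick the most recent run of the workflow.
runs = glue.get_workflow_runs(Name=workflow_name)["Runs"]
latest = max(runs, key=lambda r: r["StartedOn"])

# Fetch the same run again, this time including the node graph.
run = glue.get_workflow_run(
    Name=workflow_name, RunId=latest["WorkflowRunId"], IncludeGraph=True
)["Run"]
print("workflow run status:", run["Status"])

# Print the state of every job and crawler node in that run.
for node in run["Graph"]["Nodes"]:
    if node["Type"] == "JOB":
        for jr in node.get("JobDetails", {}).get("JobRuns", []):
            print(jr["JobName"], jr["JobRunState"], jr.get("ErrorMessage", ""))
    elif node["Type"] == "CRAWLER":
        for crawl in node.get("CrawlerDetails", {}).get("Crawls", []):
            print(node["Name"], crawl.get("State", ""), crawl.get("ErrorMessage", ""))

Running something like this prints the JobRunState and ErrorMessage per node, which is how the single failing ETL node can be singled out from the otherwise successful run.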
After the workflow run finishes, the logs do seem to give me plenty of output, but I can't work out from them what the actual problem is. Can somebody point me in the right direction, or is there something wrong with the tutorial? (Has anyone completed this tutorial successfully?)
When investigating the CloudWatch logs for the failed job, I see the following:
19/08/15 10:18:55 INFO SecurityManager: Changing view acls to: root
19/08/15 10:18:55 INFO SecurityManager: Changing modify acls to: root
19/08/15 10:18:55 INFO SecurityManager: Changing view acls groups to:
19/08/15 10:18:55 INFO SecurityManager: Changing modify acls groups to:
19/08/15 10:18:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/08/15 10:18:55 INFO Client: Submitting application application_1565864071459_0001 to ResourceManager
19/08/15 10:18:55 INFO YarnClientImpl: Submitted application application_1565864071459_0001
19/08/15 10:18:56 INFO Client: Application report for application_1565864071459_0001 (state: ACCEPTED)
applicationid is application_1565864071459_0001, yarnRMDNS is 13.0.1.37
Application info reporting is enabled.
----------Recording application Id and Yarn RM DNS for cancellation-----------------
user: root
19/08/15 10:18:59 INFO Client: Application report for application_1565864071459_0001 (state: ACCEPTED)
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1565864335326
final status: UNDEFINED
tracking URL: http://ip-13-0-1-37.eu-west-1.compute.internal:20888/proxy/application_1565864071459_0001/
...
BTW, the section above repeats dozens of times, so I have not added the similar log entries here.
...
user: root
client token: N/A
diagnostics: Application application_1565864071459_0001 failed 1 times due to AM Container for appattempt_1565864071459_0001_000001 exited with exitCode: 10
For more detailed output, check application tracking page:http://169.254.76.1:8088/cluster/app/application_1565864071459_0001Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1565864071459_0001_01_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
at org.apache.hadoop.util.Shell.run(Shell.java:479)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1565864335326
final status: FAILED
tracking URL: http://169.254.76.1:8088/cluster/app/application_1565864071459_0001
user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1565864071459_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1168)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/08/15 10:20:43 INFO ShutdownHookManager: Shutdown hook called
19/08/15 10:20:43 INFO ShutdownHookManager: Deleting directory /tmp/spark-18be0c1b-b40d-4dda-89b6-8f5bbfcdc1a2
Container: container_1565864071459_0001_01_000001 on ip-13-0-1-222.eu-west-1.compute.internal_8041
====================================================================================================
LogType:stderr
Log Upload Time:Thu Aug 15 10:20:44 +0000 2019
LogLength:4965
Log Contents:
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt/yarn/usercache/root/filecache/11/glue-assembly.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/mnt/yarn/usercache/root/filecache/18/__spark_libs__2975548331336565064.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Continuous Logging: Creating cloudwatch appender.
log4j:WARN No appenders could be found for logger (com.amazonaws.http.AmazonHttpClient).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Continuous Logging: Creating cloudwatch appender.
19/08/15 10:19:01 INFO AmazonHttpClient: Configuring Proxy. Proxy Host: 169.254.76.0 Proxy Port: 8888
19/08/15 10:19:01 INFO SignalUtils: Registered signal handler for TERM
19/08/15 10:19:01 INFO SignalUtils: Registered signal handler for HUP
19/08/15 10:19:01 INFO SignalUtils: Registered signal handler for INT
19/08/15 10:19:01 INFO ApplicationMaster: Preparing Local resources
19/08/15 10:19:02 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1565864071459_0001_000001
19/08/15 10:19:02 INFO SecurityManager: Changing view acls to: yarn,root
19/08/15 10:19:02 INFO SecurityManager: Changing modify acls to: yarn,root
19/08/15 10:19:02 INFO SecurityManager: Changing view acls groups to:
19/08/15 10:19:02 INFO SecurityManager: Changing modify acls groups to:
19/08/15 10:19:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); groups with view permissions: Set(); users with modify permissions: Set(yarn, root); groups with modify permissions: Set()
19/08/15 10:19:02 INFO ApplicationMaster: Starting the user application in a separate Thread
19/08/15 10:19:02 INFO ApplicationMaster: Waiting for spark context initialization...
19/08/15 10:20:42 ERROR ApplicationMaster: Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:401)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:764)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:762)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
19/08/15 10:20:42 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
Continuous Logging: Shutting down cloudwatch appender.
Continuous Logging: Creating log group /aws-glue/jobs/logs-v2
Continuous Logging: Shutting down cloudwatch appender.
19/08/15 10:20:42 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
19/08/15 10:20:42 INFO ApplicationMaster: Deleting staging directory hdfs://13.0.1.37:8020/user/root/.sparkStaging/application_1565864071459_0001
19/08/15 10:20:42 INFO ShutdownHookManager: Shutdown hook called
Continuous Logging: Log group already exists when creating AWS CloudWatch log group /aws-glue/jobs/logs-v2.
Exception in thread "Thread-1" java.lang.NullPointerException
at com.amazonaws.services.glue.cloudwatch.CloudWatchLogsAppenderCommon.logStreamPerInstance(CloudWatchLogsAppenderCommon.java:84)
at com.amazonaws.services.glue.cloudwatch.CloudWatchLogsAppenderCommon.flushLogEvents(CloudWatchLogsAppenderCommon.java:120)
at com.amazonaws.services.glue.cloudwatch.CloudWatchLogsAppenderCommon.flushBufferQueue(CloudWatchLogsAppenderCommon.java:127)
at com.amazonaws.services.glue.cloudwatch.CloudWatchLogsAppenderCommon.destroy(CloudWatchLogsAppenderCommon.java:309)
at java.lang.Thread.run(Thread.java:748)
End of LogType:stderr
LogType:stdout
Log Upload Time:Thu Aug 15 10:20:44 +0000 2019
LogLength:0
Log Contents:
End of LogType:stdout
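For reference, a minimal sketch of how logs like the excerpt above can be pulled out of CloudWatch Logs with boto3 rather than through the console. The log group names are the default ones Glue writes to (/aws-glue/jobs/output and /aws-glue/jobs/error for driver stdout/stderr, /aws-glue/jobs/logs-v2 for continuous logging, which also appears in the stderr above); the job-run id below is a hypothetical placeholder, not the real one:

import boto3

logs = boto3.client("logs")

# Default Glue log groups; /aws-glue/jobs/logs-v2 is the continuous-logging
# group that also shows up in the stderr excerpt above.
log_group = "/aws-glue/jobs/error"
job_run_id = "jr_0123456789abcdef"  # hypothetical placeholder for the failed job-run id

# Glue uses the job-run id as (the prefix of) the log stream name, so filtering
# on it returns the output for that one run.
paginator = logs.get_paginator("filter_log_events")
for page in paginator.paginate(logGroupName=log_group, logStreamNamePrefix=job_run_id):
    for event in page["events"]:
        print(event["message"])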