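I am building a Glue job for an AWS ETL pipeline. The Glue job runs PySpark code that extracts from multiple MySQL databases hosted on EC2, performs the ETL, and merges the results. The code runs fine for some databases and fails for others. Some fields in the data are sparsely populated. Below is the tail of the error log for one of the failed jobs. Could the problem be that some parts of the PySpark code take too long to return results? Can anyone tell from the messages below what the problem might be? Or is there a way in AWS Glue to trace a failed job and see where it died? Any tips would be greatly appreciated.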
Error log:
60] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(54)) - Started 0 remote fetches in 0 ms
2019-11-01 09:04:50,961 INFO [Executor task launch worker for task 7981] executor.Executor (Logging.scala:logInfo(54)) - Running task 199.0 in stage 133.0 (TID 7981)
2019-11-01 09:04:50,961 INFO [Executor task launch worker for task 7962] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(54)) - Started 0 remote fetches in 0 ms
2019-11-01 09:04:50,962 INFO [Executor task launch worker for task 7962] executor.Executor (Logging.scala:logInfo(54)) - Finished task 180.0 in stage 133.0 (TID 7962). 3667 bytes result sent to driver
2019-11-01 09:04:50,963 INFO [Executor task launch worker for task 7960] executor.Executor (Logging.scala:logInfo(54)) - Finished task 177.0 in stage 133.0 (TID 7960). 3667 bytes result sent to driver
2019-11-01 09:04:50,963 INFO [Executor task launch worker for task 7981] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(54)) - Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks
2019-11-01 09:04:50,963 INFO [Executor task launch worker for task 7981] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(54)) - Started 0 remote fetches in 0 ms
2019-11-01 09:04:50,964 INFO [Executor task launch worker for task 7981] executor.Executor (Logging.scala:logInfo(54)) - Finished task 199.0 in stage 133.0 (TID 7981). 3667 bytes result sent to driver
2019-11-01 09:04:54,140 INFO [dispatcher-event-loop-0] executor.CoarseGrainedExecutorBackend (Logging.scala:logInfo(54)) - Driver commanded a shutdown
2019-11-01 09:04:54,145 INFO [CoarseGrainedExecutorBackend-stop-executor] memory.MemoryStore (Logging.scala:logInfo(54)) - MemoryStore cleared
2019-11-01 09:04:54,145 INFO [CoarseGrainedExecutorBackend-stop-executor] storage.BlockManager (Logging.scala:logInfo(54)) - BlockManager stopped
2019-11-01 09:04:54,154 INFO [pool-7-thread-1] util.ShutdownHookManager (Logging.scala:logInfo(54)) - Shutdown hook called
End of LogType:stdout
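
To help narrow down where a run stopped, here is a minimal sketch (not anything provided by Glue itself) that scans a log in the format shown above and reports the log levels seen, the last task that finished, and when the driver commanded a shutdown. The function name `summarize` and the returned keys are made up for illustration, and the regular expressions assume every line follows the layout of the excerpt. Lines that do not match, such as a truncated first line or the `End of LogType:stdout` trailer, are only counted.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# <timestamp> <LEVEL> [<thread>] <logger> (<source location>) - <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) \[(?P<thread>[^\]]*)\] "
    r"(?P<logger>\S+) \(.*?\) - (?P<msg>.*)$"
)
TASK_RE = re.compile(
    r"Finished task (?P<task>[\d.]+) in stage (?P<stage>[\d.]+) \(TID (?P<tid>\d+)\)"
)


def summarize(lines):
    """Summarize executor log lines (hypothetical helper for illustration)."""
    summary = {
        "levels": Counter(),      # how many lines per log level
        "last_finished": None,    # (stage, task, tid) of the last finished task
        "driver_shutdown": None,  # timestamp of "Driver commanded a shutdown"
        "problems": [],           # raw WARN/ERROR/FATAL lines
        "unparsed": 0,            # lines that do not match the assumed layout
    }
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match is None:
            summary["unparsed"] += 1
            continue
        summary["levels"][match["level"]] += 1
        if match["level"] in ("WARN", "ERROR", "FATAL"):
            summary["problems"].append(line)
        task = TASK_RE.search(match["msg"])
        if task is not None:
            summary["last_finished"] = (task["stage"], task["task"], int(task["tid"]))
        if "Driver commanded a shutdown" in match["msg"]:
            summary["driver_shutdown"] = match["ts"]
    return summary
```

For example, `summarize(open("executor.log"))` (the file name is a placeholder) returns the dictionary described above. If `problems` comes back empty and only `driver_shutdown` is set, as seems to be the case for the excerpt here, that would suggest this executor simply obeyed the driver, so the actual failure reason is more likely to be in the driver's log than in this one.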