I am using Spark version 1.6.
A few days ago, my Spark Streaming application (context) suddenly shut down. Looking at the logs, one of the executors appears to have gone down. (The machine it ran on was actually shut down.)
What should I do when this happens? (Note that the dynamic allocation option is not available to me.)
When an executor goes down, I would like its tasks to be reassigned to another executor. My application runs in yarn-client mode.
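For context, Spark on YARN will normally ask YARN for a replacement container when an executor is lost, and tasks that were running on it are retried elsewhere, up to configurable limits. A sketch of the settings involved (the values below are illustrative, not recommendations, and should be tuned for your cluster):

```properties
# spark-defaults.conf (illustrative values)

# How many times a single task may fail before the job is aborted
# (default 4 in Spark 1.6)
spark.task.maxFailures            8

# How many executor failures YARN tolerates before failing the whole
# application (Spark 1.6 default: max(2 * numExecutors, 3))
spark.yarn.max.executor.failures  20

# The RPC timeout that appears in the log below; raising it can help
# the driver wait longer before declaring an executor lost
spark.rpc.askTimeout              240s
```

With dynamic allocation unavailable, these static limits are what decide whether the application survives a lost executor or is torn down.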
## Log example at the time of shutdown
WARN TransportChannelHandler: Exception in connection from xxxx-hostname/12.34.56.789:12345
ERROR TransportResponseHandler: Still have 2 requests outstanding when connection from xxxx-hostname/12.34.56.789:12345 is closed
ERROR ContextCleaner: Error cleaning broadcast 1123293
WARN BlockManagerMaster: Failed to remove RDD 262104
...
ERROR TransportClient: Failed to send RPC 5940957964172608257 to xxxx-hostname/12.34.56.789:12345: java.nio.channels.ClosedChannelException
...
WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to get executor loss reason for executor id 5 at RPC address xxxx-hostname:12345, but got no response. Marking as slave lost. org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds. This timeout is controlled by spark.rpc.askTimeout
Answer 0 (score: 0)
Your HDFS filesystem space (DataNode space) is running out.
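If low HDFS space is suspected, it can be confirmed from the command line before restarting anything. These are standard Hadoop commands, but they obviously require access to the cluster:

```
# Overall capacity plus per-DataNode usage and health
hdfs dfsadmin -report

# Human-readable used/free space for the filesystem root
hdfs dfs -df -h /

# Per-directory usage, useful for spotting old checkpoints or logs
hdfs dfs -du -h /
```

If a DataNode is genuinely full (or the machine hosting it was shut down, as noted in the question), executors colocated with it can fail in exactly the way the log above shows.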