I frequently see the following in my stack traces:
WARN TransportChannelHandler: Exception in connection from /172.31.3.245:46014
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:898)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
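Eventually I run into a "No space left on device" error, but after some research I found that I can call .set("spark.local.dir", "/home/ubuntu/sparktempdata");
That reduced how often "No space left on device" shows up in my traces, but one error like the one below still remains, and I don't know how to resolve it.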
16/09/06 08:34:18 ERROR FileAppender: Error writing stream to file /usr/local/spark/work/app-20160906083355-0000/1/stderr
java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at org.apache.spark.util.logging.FileAppender.appendToFile(FileAppender.scala:92)
at org.apache.spark.util.logging.FileAppender$$anonfun$appendStreamToFile$1.apply$mcV$sp(FileAppender.scala:75)
at org.apache.spark.util.logging.FileAppender$$anonfun$appendStreamToFile$1.apply(FileAppender.scala:62)
at org.apache.spark.util.logging.FileAppender$$anonfun$appendStreamToFile$1.apply(FileAppender.scala:62)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1287)
at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:78)
at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
When I open the file /usr/local/spark/work/app-20160906083355-0000/1/stderr, I see the following:
INFO Utils: Fetching spark://172.31.11.187:58519/jars/analytics-1.0-SNAPSHOT.jar to /tmp/spark-69b1866b-f302-4ab8-a25f-f2a8cc1f4b4f/executor-99c9eeb0-d45c-4619-8054-7f6d3f15803c/spark-c28a16b5-5ac5-440b-9e4d-7ed1b1b8bcbe/fetchFileTemp6564441043886275791.tmp
16/09/06 08:34:18 WARN TransportChannelHandler: Exception in connection from /172.31.11.187:58519
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:51)
at sun.nio.ch.SinkChannelImpl.write(SinkChannelImpl.java:167)
at org.apache.spark.rpc.netty.NettyRpcEnv$FileDownloadCallback.onData(NettyRpcEnv.scala:395)
at org.apache.spark.network.client.StreamInterceptor.handle(StreamInterceptor.java:69)
at org.apache.spark.network.util.TransportFrameDecoder.feedInterceptor(TransportFrameDecoder.java:202)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:70)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.pro
Here is df -h on my worker node. All of my machines have the same amount of resources.
Filesystem      Size  Used Avail Use% Mounted on
udev            7.4G   12K  7.4G   1% /dev
tmpfs           1.5G  344K  1.5G   1% /run
/dev/xvda1      7.8G  7.3G   92M  99% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            7.4G     0  7.4G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/xvdb        37G   49M   35G   1% /mnt
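The df -h listing shows no separate mount for /home or /usr, so /home/ubuntu/sparktempdata and /usr/local/spark/work presumably both sit on /, which is 99% full, while /mnt has 35G free. As a rough way to confirm that from Java, here is a minimal sketch that reports usable space for a few candidate directories and picks the roomiest one. The class name ScratchDirPicker and the method pickRoomiest are made up for illustration, and the paths are just the ones from this post.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class ScratchDirPicker {

    // Returns the candidate whose filesystem reports the most usable bytes,
    // or null if the list is empty. A path that does not exist reports 0 bytes.
    public static File pickRoomiest(List<File> candidates) {
        File best = null;
        long bestBytes = -1L;
        for (File dir : candidates) {
            long usable = dir.getUsableSpace();
            if (usable > bestBytes) {
                best = dir;
                bestBytes = usable;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Paths taken from the question; adjust them for your own nodes.
        List<File> candidates = Arrays.asList(
                new File("/home/ubuntu/sparktempdata"),
                new File("/usr/local/spark/work"),
                new File("/mnt"));
        for (File dir : candidates) {
            long usableMb = dir.getUsableSpace() / (1024 * 1024);
            System.out.printf("%-30s %d MB usable%n", dir, usableMb);
        }
        System.out.println("Roomiest: " + pickRoomiest(candidates));
    }
}
```

If the result points at a directory under /mnt, that path could be passed to the same .set("spark.local.dir", ...) call shown earlier. Note that this only covers scratch space; the stderr file in the trace above lives under the worker's work directory, which is configured separately.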
Answer 0 (score: -1)
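I think this error is caused by a broken pipe. Basically, the client (say, your laptop) has not heard anything from the server for a long time, so it assumes the connection is gone. Use the SIGPIPE command and set it to 2 minutes. The following link will help you.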