Spark shuffle error: incorrect header or version mismatch

Time: 2018-07-18 16:01:10

Tags: apache-spark pyspark hdfs jupyter-notebook yarn

I'm running Spark on YARN and get a strange error when loading JSON files from HDFS.

I'm using PySpark from a Jupyter notebook, but the job never gets as far as actually collecting any data.

Here is my code:

df = spark.read.option("timestampFormat", "yyyy/MM/dd HH:mm:ss ZZ").json("hdfs://<server-ip>:8020/<path-prefix>/*")

Then, after about 350 partitions have been loaded, I get the following errors in one of the yarn-yarn-nodemanager logs.

2018-07-18 17:22:16,789 WARN  ipc.Server (Server.java:readAndProcess(1771)) - Incorrect header or version mismatch from <some ip>:55548 got version -13 expected version 9
2018-07-18 17:22:16,987 ERROR mapred.ShuffleHandler (ShuffleHandler.java:exceptionCaught(1261)) - Shuffle error: 
java.lang.IllegalArgumentException: invalid version format: ᄃ5!TᅱMヒ1¦0￴ᄁミ*TリナC4￳トFᄋ5��ヤ￀￀
    at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:102)
    at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
    at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75)
    at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:189)
    at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:101)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-07-18 17:22:16,991 ERROR mapred.ShuffleHandler (ShuffleHandler.java:exceptionCaught(1263)) - Shuffle error [id: 0x44bbf13f, /<same ip from before>:46646 => <node manager ip>:13562] EXCEPTION: java.lang.IllegalArgumentException: invalid version format: ᄃ5!TᅱMヒ1¦0￴ᄁミ*TリナC4￳トFᄋ5��ヤ￀￀
2018-07-18 17:22:17,024 ERROR mapred.ShuffleHandler (ShuffleHandler.java:exceptionCaught(1261)) - Shuffle error: 
java.lang.IllegalArgumentException: invalid version format: ᄃ5!TᅱMヒ1¦0￴ᄁミ*TリナC4￳トFᄋ5��ヤ￀￀
    at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:102)
    at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
    at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75)
    at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:189)
    at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:101)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.cleanup(ReplayingDecoder.java:554)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
    at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:336)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:81)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:109)
    at org.jboss.netty.channel.Channels.close(Channels.java:812)
    at org.jboss.netty.channel.AbstractChannel.close(AbstractChannel.java:197)
    at org.jboss.netty.channel.ChannelFutureListener$1.operationComplete(ChannelFutureListener.java:41)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    at org.jboss.netty.channel.DefaultChannelFuture.addListener(DefaultChannelFuture.java:145)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1238)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1222)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1264)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
    at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
    at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
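
To check whether the IPC warning and the shuffle errors come from the same client, something like the following could tally peer addresses from the NodeManager log. This is only a rough sketch: the regular expressions are written against the two line shapes shown above (they are not an official Hadoop log grammar), and `summarize_peers` is a name made up for illustration.

```python
import re
from collections import Counter

# Assumed patterns, derived only from the log lines quoted in this post.
IPC_RE = re.compile(
    r"Incorrect header or version mismatch from (?P<peer>\S+):(?P<port>\d+) "
    r"got version (?P<got>-?\d+) expected version (?P<expected>\d+)"
)
SHUFFLE_RE = re.compile(
    r"Shuffle error \[id: \S+ /(?P<peer>[^:\]]+):(?P<peer_port>\d+) "
    r"=> (?P<local>[^:\]]+):(?P<local_port>\d+)\]"
)


def summarize_peers(lines):
    """Count how often each remote host triggers an IPC or shuffle error.

    Keys are (kind, peer) tuples, where kind is "ipc" or "shuffle:<local port>".
    Stack-trace lines and other log lines are ignored.
    """
    counts = Counter()
    for line in lines:
        m = IPC_RE.search(line)
        if m:
            counts[("ipc", m.group("peer"))] += 1
            continue
        m = SHUFFLE_RE.search(line)
        if m:
            counts[("shuffle:" + m.group("local_port"), m.group("peer"))] += 1
    return counts
```

Feeding it the lines of the log file (for example `summarize_peers(open(path))`, where `path` is wherever the NodeManager log lives on that node) gives one counter entry per remote host and error kind, which makes it easier to see whether a single machine is sending non-HTTP traffic to the shuffle port.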

Before the error occurs, the log shows many memory usage messages, all of which look normal:

2018-07-18 17:22:13,972 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 30619 for container-id container_e70_1531921804510_0003_01_000001: 380.1 MB of 4 GB physical memory used; 2.2 GB of 8.4 GB virtual memory used

Any ideas about what could be causing this, and how to fix it?

0 answers