For a while now I have been trying to run Hive on Spark, and every time it stops on one or more of the Spark executors with the following error:
java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
    at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
    at scala.Option.map(Option.scala:146)
    at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:304)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$apply$2.apply(TorrentBroadcast.scala:235)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:211)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1326)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:207)
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:84)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
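I am using Spark 2.4.1 (the same problem occurs with 2.4.0) and Hive 2.3.4. YARN 2.7.5 is used as the job manager. Hive tasks do run on Spark, but I get the error every time I launch a new SQL query from Hive.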
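The descriptor (Ljava/io/InputStream;Z)V in the error denotes a constructor of net.jpountz.lz4.LZ4BlockInputStream that takes an InputStream and a boolean. My guess, and it is only a guess, is that an lz4 jar lacking that constructor ends up first on the executor classpath. A minimal sketch of a check that could be run with the same classpath as the executors is below. The class name Lz4ConstructorCheck and its helper methods are hypothetical names made up for this illustration and are not part of Spark or Hive.

```java
import java.io.InputStream;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Spark, Hive, or lz4.
public class Lz4ConstructorCheck {

    // True if the class can be loaded and has a public constructor
    // with exactly these parameter types.
    public static boolean hasConstructor(String className, Class<?>... paramTypes) {
        try {
            Class.forName(className).getConstructor(paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    // Reports the jar or directory the class was loaded from, if known.
    public static String locationOf(String className) {
        try {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(class not found)";
        }
    }

    public static void main(String[] args) {
        String cls = "net.jpountz.lz4.LZ4BlockInputStream";
        System.out.println("loaded from: " + locationOf(cls));
        System.out.println("has (InputStream, boolean) constructor: "
                + hasConstructor(cls, InputStream.class, boolean.class));
    }
}
```

If the second line prints false, the location printed on the first line would point at the jar that shadows the one Spark expects.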
Is there a solution to this problem?
Thanks.