I'm performing a large number of joins on some DataFrames using Spark in Scala. When I try to get the count of the final DataFrame, I get the exception below. I'm running the code in spark-shell.

I've tried passing configuration parameters like the following when starting spark-shell, but none of them worked. Is there anything I'm missing here?
--conf "spark.driver.extraLibraryPath=/usr/hdp/2.6.3.0-235/hadoop/lib/native/"
--jars /usr/hdp/current/hadoop-client/lib/snappy-java-1.0.4.1.jar
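Before adjusting Spark flags, it may help to confirm whether the libhadoop build on the machine actually includes snappy support. Hadoop ships a diagnostic command for this (assuming the `hadoop` binary is on the PATH; on HDP it typically lives under /usr/hdp/current/hadoop-client/bin):

```shell
# Print which native libraries this Hadoop build can load.
# A libhadoop built without snappy support will report "snappy: false",
# which matches the exception message below.
hadoop checknative -a
```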
Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
    at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
Answer 0 (score: 0)
Try updating your Hadoop jar files from 2.6.3 to 2.8.0 or 3.0.0. Earlier versions of Hadoop contain this bug: the native snappy library is not available. After updating the Hadoop core jars, you should be able to perform Snappy compression/decompression.
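If updating the Hadoop jars is not an option, a possible workaround is to steer Spark away from the snappy codec for its own internal compression. This is only a sketch under the assumption that the failure comes from Spark's shuffle/output path rather than from reading input files that are themselves snappy-compressed (in that case the native library is still required):

```shell
# Workaround sketch: use lz4 for Spark's internal (shuffle/broadcast)
# compression and gzip when writing Parquet, so the native snappy
# codec is never invoked. These are standard Spark configuration keys.
spark-shell \
  --conf spark.io.compression.codec=lz4 \
  --conf spark.sql.parquet.compression.codec=gzip
```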