I'm performing a large number of joins on some DataFrames using Spark in Scala. When I try to get the count of the final DataFrame, I get the exception below. I'm running the code in spark-shell.

I've tried passing configuration parameters like the following when starting spark-shell, but none of them worked. Is there anything I'm missing here?
--conf "spark.driver.extraLibraryPath=/usr/hdp/2.6.3.0-235/hadoop/lib/native/"
--jars /usr/hdp/current/hadoop-client/lib/snappy-java-1.0.4.1.jar
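Before adjusting Spark flags, it may help to confirm whether the libhadoop build on the machine actually includes snappy support. Hadoop ships a diagnostic command for this (assuming the `hadoop` binary is on the PATH; on HDP it typically lives under /usr/hdp/current/hadoop-client/bin):

```shell
# Print which native libraries this Hadoop build can load.
# A libhadoop built without snappy support will report "snappy: false",
# which matches the exception message below.
hadoop checknative -a
```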
Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
    at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
Answer 0 (score: 0)
Try updating your Hadoop jar files from 2.6.3 to 2.8.0 or 3.0.0. Earlier versions of Hadoop contain this bug: the native snappy library is not available. After updating the Hadoop core jars, you should be able to perform Snappy compression/decompression.
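If updating the Hadoop jars is not an option, a possible workaround is to steer Spark away from the snappy codec for its own internal compression. This is only a sketch under the assumption that the failure comes from Spark's shuffle/output path rather than from reading input files that are themselves snappy-compressed (in that case the native library is still required):

```shell
# Workaround sketch: use lz4 for Spark's internal (shuffle/broadcast)
# compression and gzip when writing Parquet, so the native snappy
# codec is never invoked. These are standard Spark configuration keys.
spark-shell \
  --conf spark.io.compression.codec=lz4 \
  --conf spark.sql.parquet.compression.codec=gzip
```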