Spark Streaming: saving a DataFrame fails

Asked: 2017-09-28 05:48:26

Tags: spark-streaming amazon-redshift apache-spark-2.0 snappy

I am writing to Redshift from Spark 2.2 on an AWS cluster, and the job fails with the error below. I am on CDH 5.10 with Scala 2.11.8. Any ideas how to fix this? Am I missing a snappy dependency? (A sketch of the write path appears after the stack trace.)

 WARN  TaskSetManager:66 - Lost task 0.0 in stage 3.0  org.apache.spark.SparkException: Task failed while writing rows
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
        at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
        at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:316)
        at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43)
        at org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:361)
        at org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:394)
        at org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:413)
        at org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:422)
        at org.apache.avro.file.DataFileWriter.close(DataFileWriter.java:445)
        at org.apache.avro.mapreduce.AvroKeyRecordWriter.close(AvroKeyRecordWriter.java:83)
        at com.databricks.spark.avro.AvroOutputWriter.close(AvroOutputWriter.scala:84)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.releaseResources(FileFormatWriter.scala:337)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:330)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
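The write in question goes through the spark-redshift connector, which stages the DataFrame as Avro files (snappy-compressed by default) in an S3 tempdir and then issues a Redshift COPY; that is why com.databricks.spark.avro.AvroOutputWriter and org.apache.avro.file.SnappyCodec appear in the trace. A minimal sketch of that kind of write, with placeholder URL, table, and tempdir values (in a streaming job the same call would typically sit inside foreachRDD):

    import org.apache.spark.sql.{SaveMode, SparkSession}

    val spark = SparkSession.builder().appName("redshift-write").getOrCreate()
    val df = spark.read.json("s3a://my-bucket/input/")  // placeholder source

    // The connector writes df as Avro to `tempdir`, then COPYs it into Redshift.
    df.write
      .format("com.databricks.spark.redshift")
      .option("url", "jdbc:redshift://example-host:5439/dev?user=USER&password=PASS")
      .option("dbtable", "my_table")
      .option("tempdir", "s3a://my-bucket/redshift-tmp/")
      .mode(SaveMode.Append)
      .save()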
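As for the snappy question: an UnsatisfiedLinkError on SnappyNative.maxCompressedLength usually indicates a version conflict rather than a missing jar, i.e., an older snappy-java (such as the one CDH 5.10 bundles) is loaded ahead of the one the Avro writer expects. One way to confirm which jar is actually loaded, assuming a SparkSession named `spark` (a diagnostic sketch, not a definitive fix):

    // Report which snappy-java jar the driver classloader resolved.
    val driverJar = classOf[org.xerial.snappy.Snappy]
      .getProtectionDomain.getCodeSource.getLocation
    println(s"driver snappy-java: $driverJar")

    // Run the same check on an executor; a mismatch points to a classpath conflict.
    val executorJar = spark.sparkContext
      .parallelize(Seq(1))
      .map(_ => classOf[org.xerial.snappy.Snappy]
        .getProtectionDomain.getCodeSource.getLocation.toString)
      .first()
    println(s"executor snappy-java: $executorJar")

If the executor reports a CDH-bundled snappy-java, a common workaround is to ship a newer snappy-java via --jars and set spark.executor.userClassPathFirst=true, or to put the jar on spark.executor.extraClassPath.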

0 Answers