Does Spark SQL always check the metadata of all partitions, even when a specific partition is specified?

Date: 2018-10-18 14:41:39

Tags: apache-spark apache-spark-sql

I run a SQL query against a specific partition:

spark-sql --conf spark.sql.hive.convertMetastoreOrc=true \
 -e "select * from default.s_shouq_user where dt='2018-10-17' limit 10"

I get the following exception (dt=2015-12-22 is the first partition of the table, not the one I queried):

java.io.IOException: Malformed ORC file hdfs://jilian/hai/bo/dw/default.db/s_shouq_user/dt=2015-12-22/000005_0. Invalid postscript.
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.ensureOrcFooter(ReaderImpl.java:250)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:

1 answer:

Answer 0 (score: 0)

This issue was fixed in Spark v2.3.1.
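For versions before 2.3.1, a possible workaround (my own suggestion, not part of the original answer) is to disable the native ORC conversion so that Spark falls back to the Hive SerDe reader instead of inspecting ORC file footers itself. `spark.sql.hive.convertMetastoreOrc` is a real Spark SQL configuration key; whether this avoids the error on your cluster depends on your Spark/Hive versions, so treat this as a sketch to try rather than a guaranteed fix:

```shell
# Fall back to the Hive SerDe ORC reader instead of Spark's native one,
# so Spark does not try to read footers of files in unrelated partitions.
spark-sql --conf spark.sql.hive.convertMetastoreOrc=false \
 -e "select * from default.s_shouq_user where dt='2018-10-17' limit 10"
```

Alternatively, upgrading to Spark 2.3.1 or later removes the need for the workaround, per the answer above.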