I'm trying to read a table stored in Parquet format from Hive (well, it's actually Impala). I'm using Spark 1.3.0 with a HiveContext.
The table's schema is:
(a,DoubleType)
(b,DoubleType)
(c,IntegerType)
(d,StringType)
(e,DecimalType(18,0))
My code is:
import org.apache.spark.SparkContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(conf)
val hc = new HiveContext(sc)
import hc.implicits._
import hc.sql
val df: DataFrame = hc.table("mytable")
The error in the log is:
16/03/31 11:33:34 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, cloudera-smm-2.desa.taiif.aeat): java.lang.ClassCastException: scala.runtime.BoxedUnit cannot be cast to org.apache.spark.sql.types.Decimal
at org.apache.spark.sql.types.Decimal$DecimalIsFractional$.toDouble(Decimal.scala:330)
at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToDouble$5.apply(Cast.scala:361)
at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToDouble$5.apply(Cast.scala:361)
at org.apache.spark.sql.catalyst.expressions.Cast.eval(Cast.scala:426)
at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:105)
at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
...
It seems the Decimal type is not being converted correctly. Any ideas?
Answer 0 (score: 0)
The problem is that, for Parquet tables registered in the metastore, Spark SQL uses its own built-in Parquet reader instead of the Hive SerDe, and in Spark 1.3 that reader mishandles this Decimal column.
You should set this property to false:
hc.setConf("spark.sql.hive.convertMetastoreParquet", "false")
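Putting it together, a minimal sketch of the workaround (the table name `mytable` and app name are illustrative; this assumes a cluster with the Hive metastore configured):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.hive.HiveContext

val conf = new SparkConf().setAppName("read-impala-parquet")
val sc = new SparkContext(conf)
val hc = new HiveContext(sc)

// Fall back to the Hive SerDe instead of Spark SQL's native Parquet reader,
// which trips over the DECIMAL(18,0) column written by Impala.
hc.setConf("spark.sql.hive.convertMetastoreParquet", "false")

val df: DataFrame = hc.table("mytable")
df.printSchema()
```

The property can also be passed at submit time with `--conf spark.sql.hive.convertMetastoreParquet=false`, which avoids changing the code.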