Spark error - Decimal precision 39 exceeds max precision 38

Date: 2017-05-23 09:08:43

Tags: r oracle apache-spark rstudio

When I try to collect data from a Spark DataFrame, I get an error saying:


" java.lang.IllegalArgumentException:要求失败:十进制   精度39超过最大精度38"。

All of the data in the Spark DataFrame comes from an Oracle database, where I believe the decimal precision is < 38. Is there any way to make this work without modifying the data?

# Load required table into memory from Oracle database
df <- loadDF(sqlContext, source = "jdbc",
             url = "jdbc:oracle:thin:usr/pass@url.com:1521",
             dbtable = "TBL_NM")

RawData <- df %>%
    filter(DT_Column > DATE('2015-01-01'))

RawData <- as.data.frame(RawData)

The last line gives the error.

Below is the stack trace:


WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 10..***, executor 0): java.lang.IllegalArgumentException: requirement failed: Decimal precision 39 exceeds max precision 38
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$nullSafeConvert(JdbcUtils.scala:438)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3.apply(JdbcUtils.scala:335)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:286)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:268)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
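The trace shows the failure inside Spark's JDBC reader: JdbcUtils maps each Oracle column to a Spark SQL DecimalType when the DataFrame is defined, and Decimal.set rejects any fetched value whose actual precision exceeds that type (38 is Spark's hard maximum for DecimalType). A minimal first check, assuming a SparkR session like the one above, is to look at the schema Spark inferred:

# Minimal check (sketch): printSchema() prints the DecimalType inferred from
# the Oracle metadata; the failing column is the one whose stored values
# carry more significant digits than the inferred precision allows.
printSchema(df)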

Any suggestions for a solution would be appreciated. Thanks.
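One possible direction (a minimal, untested sketch, not from the original post) is to push a CAST down to Oracle through the dbtable option, so that the JDBC driver reports a precision and scale Spark can represent. WIDE_COL below is a hypothetical name for the offending NUMBER column; the other identifiers come from the question.

# Hedged sketch: wrap the table in a subquery so Oracle itself casts the
# over-wide NUMBER column ("WIDE_COL" is hypothetical) to a type within
# Spark's 38-digit limit before the rows cross the JDBC boundary.
qry <- "(SELECT CAST(WIDE_COL AS NUMBER(38,10)) AS WIDE_COL, DT_Column FROM TBL_NM) t"
df <- loadDF(sqlContext, source = "jdbc",
             url = "jdbc:oracle:thin:usr/pass@url.com:1521",
             dbtable = qry)

Note that the CAST rounds or rejects values that genuinely need more than 38 digits, so whether this counts as "not modifying the data" depends on the values involved.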

1 Answer:

Answer 0: (score: 0)