Spark-java: Exception in thread "main" org.apache.spark.sql.AnalysisException

Date: 2018-08-06 07:16:15

Tags: apache-spark apache-spark-sql

The following query runs fine for me in SQL Developer:

SELECT C.CIS_DIVISION,
       C.EFFDT AS START_DT,
       LEAD(EFFDT, 1) OVER (PARTITION BY CIS_DIVISION, CHAR_TYPE_CD
                            ORDER BY CIS_DIVISION, CHAR_TYPE_CD, EFFDT) - 1 AS END_DT,
       C.CHAR_VAL,
       C.CHAR_TYPE_CD
FROM CI_CIS_DIV_CHAR C
WHERE C.CHAR_TYPE_CD IN ('C1-TFMPD','C1-TFMCR')
ORDER BY CIS_DIVISION, CHAR_TYPE_CD, EFFDT

Output in SQL Developer:

CIS_DIVISION   START_DT    END_DT   CHAR_VAL   CHAR_TYPE_CD
747            01-Jan-10   (null)   BATCH_DT   C1-TFMPD
CAL            01-Jan-16   (null)   BATCH_DT   C1-TFMPD
NYC            01-Jan-90   (null)   BATCH_DT   C1-TFMPD
PERF1          01-Jan-01   (null)   BATCH_DT   C1-TFMPD
PERF2          01-Jan-01   (null)   BATCH_DT   C1-TFMPD
PERF3          01-Jan-01   (null)   BATCH_DT   C1-TFMPD

But when the same query is run in Spark from Eclipse, it throws an error.

Code:

private static final String DIVISIONCHARS_QUERY =
          "SELECT C.CIS_DIVISION, "
        + "C.EFFDT AS START_DT, "
        + "LEAD(EFFDT, 1) OVER(PARTITION BY CIS_DIVISION, "
        + "CHAR_TYPE_CD ORDER BY CIS_DIVISION, "
        + "CHAR_TYPE_CD, EFFDT) - 1 AS END_DT, "
        + "C.CHAR_VAL, "
        + "C.CHAR_TYPE_CD "
        + "FROM CI_CIS_DIV_CHAR C "
        + "WHERE C.CHAR_TYPE_CD IN ('C1-TFMPD','C1-TFMCR') "
        + "ORDER BY CIS_DIVISION, CHAR_TYPE_CD, EFFDT";

Dataset<Row> divChartable = sparkSession.read()
        .format("jdbc")
        .option("url", connection)
        .option("dbtable", "CI_CIS_DIV_CHAR")
        .load();

divChartable.registerTempTable("CI_CIS_DIV_CHAR");

Dataset<Row> divCharsDS = sparkSession.sql(DIVISIONCHARS_QUERY);

Since the query uses the CI_CIS_DIV_CHAR table, I first register a temp table for it; otherwise Spark throws a table-not-found error.
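For reference, here is a minimal self-contained sketch of that JDBC read plus temp-view registration. The URL, driver class, and credentials below are placeholders rather than values from the original setup, and createOrReplaceTempView is the non-deprecated Spark 2.x replacement for registerTempTable:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DivCharLoad {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .appName("DivCharLoad")
                .master("local[*]")  // placeholder: local mode for testing
                .getOrCreate();

        // Read the Oracle table over JDBC; every connection value here is a placeholder.
        Dataset<Row> divChartable = sparkSession.read()
                .format("jdbc")
                .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
                .option("driver", "oracle.jdbc.OracleDriver")
                .option("dbtable", "CI_CIS_DIV_CHAR")
                .option("user", "db_user")
                .option("password", "db_password")
                .load();

        // Register the DataFrame so sparkSession.sql(...) can resolve the table name.
        divChartable.createOrReplaceTempView("CI_CIS_DIV_CHAR");
    }
}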

Running the code above produces the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '(lead(C.`EFFDT`, 1, NULL) OVER (PARTITION BY C.`CIS_DIVISION`, C.`CHAR_TYPE_CD` ORDER BY C.`CIS_DIVISION` ASC NULLS FIRST, C.`CHAR_TYPE_CD` ASC NULLS FIRST, C.`EFFDT` ASC NULLS FIRST ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) - 1)' due to data type mismatch: differing types in '(lead(C.`EFFDT`, 1, NULL) OVER (PARTITION BY C.`CIS_DIVISION`, C.`CHAR_TYPE_CD` ORDER BY C.`CIS_DIVISION` ASC NULLS FIRST, C.`CHAR_TYPE_CD` ASC NULLS FIRST, C.`EFFDT` ASC NULLS FIRST ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) - 1)' (timestamp and int).; line 1 pos 44;
'Sort ['CIS_DIVISION ASC NULLS FIRST, 'CHAR_TYPE_CD ASC NULLS FIRST, 'EFFDT ASC NULLS FIRST], true
+- 'Project [CIS_DIVISION#1424, EFFDT#1426 AS START_DT#1448, (lead(EFFDT#1426, 1, null) windowspecdefinition(CIS_DIVISION#1424, CHAR_TYPE_CD#1425, CIS_DIVISION#1424 ASC NULLS FIRST, CHAR_TYPE_CD#1425 ASC NULLS FIRST, EFFDT#1426 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) - 1) AS END_DT#1449, CHAR_VAL#1427, CHAR_TYPE_CD#1425]
   +- Filter CHAR_TYPE_CD#1425 IN (C1-TFMPD,C1-TFMCR)
      +- SubqueryAlias C
         +- SubqueryAlias ci_cis_div_char
            +- Relation[CIS_DIVISION#1424,CHAR_TYPE_CD#1425,EFFDT#1426,CHAR_VAL#1427,VERSION#1428,ADHOC_CHAR_VAL#1429,CHAR_VAL_FK1#1430,CHAR_VAL_FK2#1431,CHAR_VAL_FK3#1432,CHAR_VAL_FK4#1433,CHAR_VAL_FK5#1434,ENABLED_FLG#1435] JDBCRelation(CI_CIS_DIV_CHAR) [numPartitions=1]

    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:93)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:85)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$transformExpressionsUp$1.apply(QueryPlan.scala:95)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$transformExpressionsUp$1.apply(QueryPlan.scala:95)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:106)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1(QueryPlan.scala:116)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1$1.apply(QueryPlan.scala:120)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1(QueryPlan.scala:120)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$1.apply(QueryPlan.scala:125)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:125)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:95)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:85)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:80)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:80)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:91)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:104)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
    at com.sample.Transformation.initializeProductDerivationCache(Transformation.java:189)
    at com.sample.Transformation.main(Transformation.java:114)

Do Oracle keywords not work with Spark SQL, or is there some other problem?

Spark version used: 2.3.0
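For anyone diagnosing this: the key fragment of the message is "differing types ... (timestamp and int)". Unlike Oracle, Spark SQL does not allow subtracting a plain integer from a timestamp, which is why LEAD(EFFDT, 1) OVER (...) - 1 fails analysis. Below is a hedged sketch of one possible rewrite using Spark's built-in date_sub function; the constant name and the fix itself are my assumptions, not a confirmed solution (note that date_sub yields a DATE, so a CAST would be needed if END_DT must remain a timestamp):

// Hypothetical rewrite: express "minus one day" with a date function
// instead of Oracle-style integer arithmetic on a timestamp.
private static final String DIVISIONCHARS_QUERY_FIXED =
          "SELECT C.CIS_DIVISION, "
        + "C.EFFDT AS START_DT, "
        // DATE_SUB(ts, 1) implicitly casts the timestamp to DATE and
        // subtracts one day, avoiding the (timestamp and int) mismatch.
        + "DATE_SUB(LEAD(EFFDT, 1) OVER(PARTITION BY CIS_DIVISION, CHAR_TYPE_CD "
        + "ORDER BY CIS_DIVISION, CHAR_TYPE_CD, EFFDT), 1) AS END_DT, "
        + "C.CHAR_VAL, "
        + "C.CHAR_TYPE_CD "
        + "FROM CI_CIS_DIV_CHAR C "
        + "WHERE C.CHAR_TYPE_CD IN ('C1-TFMPD','C1-TFMCR') "
        + "ORDER BY CIS_DIVISION, CHAR_TYPE_CD, EFFDT";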
