Databricks error in SQL statement: AnalysisException: cannot resolve '' given input columns

Time: 2018-12-23 16:27:15

Tags: sql pyspark databricks

I'm not sure this is the right group for this question. I wrote the following SQL code in Databricks, but I get this error message:

  

Error in SQL statement: AnalysisException: cannot resolve 'a.COUNTRY_ID' given input columns: [a."PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE", b."PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE"]; line 7 pos 7;

I know the query itself works, because I have run it successfully on SQL Server. The code is as follows:

tabled = spark.read.csv("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tabled.csv",inferSchema=True,header=True)
tablee = spark.read.csv("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tablee.csv",inferSchema=True,header=True)
tabled.createOrReplaceTempView('tabled') 
tablee.createOrReplaceTempView('tablee')
%sql
; with cmn as 
  ( SELECT a.CDC_TYPE,
           a.PK_LOYALTYACCOUNT, --Add these also in CTE result set 
           a.COUNTRY_ID --Add these also in CTE result set 
    FROM  tabled  a 
    INNER JOIN tablee b 
    ON a.COUNTRY_ID = b.COUNTRY_ID 
    AND a.PK_LOYALTYACCOUNT = b.PK_LOYALTYACCOUNT 
    AND a.CDC_TYPE = 'U'
    )
 SELECT 1 AS is_deleted, 
        a.* 
 FROM  tabled  a 
 INNER JOIN cmn 
 ON a.CDC_TYPE = cmn.CDC_TYPE 
 and  a.COUNTRY_ID = cmn.COUNTRY_ID 
 AND a.PK_LOYALTYACCOUNT = cmn.PK_LOYALTYACCOUNT
 UNION ALL 
 SELECT 0 AS is_deleted, 
        b.* 
 FROM tablee  b 
 INNER JOIN cmn 
 ON b.CDC_TYPE = cmn.CDC_TYPE 
 and b.COUNTRY_ID = cmn.COUNTRY_ID 
 AND b.PK_LOYALTYACCOUNT = cmn.PK_LOYALTYACCOUNT
UNION ALL 
SELECT NULL, 
       a.* 
FROM   tabled a 
WHERE  a.CDC_TYPE = 'N' 
UNION ALL 
SELECT NULL, 
       b.* 
FROM   tablee b 
WHERE  b.CDC_TYPE = 'N'

When I run simple queries such as

example1 = spark.sql("""select * from tablee""")

or

example2 = spark.sql("""select * from tabled""")

I get the following output, so I know the tables are there:

output

Any suggestions would be welcome.

2 answers:

Answer 0: (score: 0)

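The columns were not recognized correctly because the CSV files use a semicolon (;) as the delimiter while the reader was expecting a comma. Each header row was therefore read as a single column named "PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE", which is exactly what the input column list in the error message shows, so a.COUNTRY_ID could not be resolved. Problem solved.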
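The delimiter mismatch described above can be reproduced without a cluster. The following is a minimal sketch that uses Python's standard `csv` module as a stand-in for Spark's CSV reader; the sample rows and the `read_header` helper are hypothetical, and Spark may name the resulting single column slightly differently.

```python
import csv
import io

# Hypothetical sample data mirroring the column names in the error message.
SAMPLE = (
    '"PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE"\n'
    '"1001";"GB";"U"\n'
)


def read_header(text, delimiter):
    """Return the header fields of CSV text parsed with the given delimiter."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter))


# Parsed with a comma, the whole header line collapses into one field.
wrong_header = read_header(SAMPLE, ",")

# Parsed with a semicolon, the three expected columns appear.
right_header = read_header(SAMPLE, ";")

print(len(wrong_header), wrong_header)
print(len(right_header), right_header)
```

With the comma, the header comes back as a single field, so there is no column called COUNTRY_ID to resolve. With the semicolon, the three column names are separated as intended.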

Answer 1: (score: 0)

Use a semicolon delimiter when reading from the CSV files (the same change applies to tablee):

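# Option 1: set the delimiter through .option() before calling .csv()
tabled = spark.read.option("delimiter", ";").csv("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tabled.csv",inferSchema=True,header=True)

# Option 2 (equivalent): pass sep=";" to the generic load() with format="csv"
tabled = spark.read.load("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tabled.csv",
                 format="csv", sep=";", inferSchema="true", header="true")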

Reference: https://spark.apache.org/docs/2.3.0/sql-programming-guide.html#manually-specifying-options
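If it is unclear which separator a file uses, a quick look at its header line usually settles it. The sketch below is a naive heuristic, not part of Spark: `guess_delimiter` is a hypothetical helper that picks the candidate character that occurs most often in the header, and it does not account for delimiters inside quoted values.

```python
def guess_delimiter(header_line, candidates=(",", ";", "\t", "|")):
    """Return the candidate character that occurs most often in the header line.

    Naive heuristic (an assumption for illustration): it ignores quoting,
    and on a tie it returns the earliest candidate in the tuple.
    """
    return max(candidates, key=header_line.count)


# Header line as it appears in the error message from the question.
header = '"PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE"'
sep = guess_delimiter(header)
print(repr(sep))
```

The returned value could then be passed as the `sep` (or `delimiter`) option shown in the answer above.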