Delta Lake error: MERGE destination only supports Delta sources

Time: 2020-09-10 15:46:12

Tags: apache-spark pyspark azure-databricks delta-lake

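I am trying to implement SCD Type 2 in Delta Lake, but I get the following error: "MERGE destination only supports Delta sources".

Below is the code snippet I am executing.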

MERGE INTO stageviews.employeetarget t
USING (
  -- The first SELECT returns both new and updated records
  SELECT id AS mergeKey, src.*
  FROM stageviews.employeeupdate src
  UNION ALL
  -- Identify the updated records; setting mergeKey to NULL forces these rows to NOT MATCH and be INSERTED into the target
  SELECT NULL AS mergeKey, src.*
  FROM stageviews.employeeupdate src
  JOIN stageviews.employeetarget tgt
    ON src.id = tgt.id
  WHERE tgt.ind_flag = "1"
    AND sha2(src.EmployeeName, 256) <> sha2(tgt.EmployeeName, 256)
) AS s
ON t.id = s.mergeKey
WHEN MATCHED AND
  (t.ind_flag = "1" AND sha2(t.EmployeeName, 256) <> sha2(s.EmployeeName, 256)) THEN
  -- date_sub is used instead of current_date()-1 so the expression also works on older Spark versions
  UPDATE SET t.ind_flag = "0", t.eff_end_date = date_sub(current_date(), 1)
WHEN NOT MATCHED THEN
  INSERT (t.Id, t.EmployeeName, t.JobTitle, t.BasePay, t.OvertimePay, t.OtherPay, t.Benefits, t.TotalPay, t.TotalPayBenefits, t.Year, t.Notes, t.Agency, t.Status, t.ind_flag, t.create_date, t.update_date, t.eff_start_date, t.eff_end_date)
  VALUES (s.Id, s.EmployeeName, s.JobTitle, s.BasePay, s.OvertimePay, s.OtherPay, s.Benefits, s.TotalPay, s.TotalPayBenefits, s.Year, s.Notes, s.Agency, s.Status, s.ind_flag,
          current_date(), current_date(), current_date(), to_date('9999-12-31'))

1 answer:

Answer 0 (score: 1)

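Unfortunately, Databricks supports MERGE (and other updates) only on Delta (Delta Lake) tables.

The error message "Error in SQL statement: AnalysisException: MERGE destination only supports Delta sources" indicates that you are trying to run the update against a non-Delta table. In this case the target, stageviews.employeetarget, is not stored in Delta format.

As the documentation for MERGE INTO puts it: it merges a set of updates, insertions, and deletions based on a source table into a target Delta table.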
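
One way to check and fix this could look like the sketch below. It assumes the target is currently a non-Delta table (for example Parquet or CSV). The table name stageviews.employeetarget_delta is hypothetical and only used for illustration; the question does not say how the original table was created, so adapt this to your setup.

```sql
-- Inspect the table; the "Provider" row shows the storage format
-- (it should read "delta" for a table that MERGE INTO can target).
DESCRIBE TABLE EXTENDED stageviews.employeetarget;

-- Option 1 (works for any source format): create a Delta copy of the target.
-- employeetarget_delta is a hypothetical name.
CREATE TABLE stageviews.employeetarget_delta
USING DELTA
AS SELECT * FROM stageviews.employeetarget;

-- Option 2 (only if the existing table is Parquet-backed): convert it in place.
-- CONVERT TO DELTA stageviews.employeetarget;
```

After that, point the MERGE INTO statement at the Delta table (the copy from option 1, or the converted original from option 2). The source side of the merge, stageviews.employeeupdate, does not need to be a Delta table; only the target does.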

Reference: Azure Databricks - Merge; SCD Type 2 using Merge