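I am trying to take a query I wrote in MSSQL and convert it to PySpark. I am new to PySpark.
The query contains inner joins, an embedded CASE expression, and a number of WHERE conditions for filtering.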
SELECT Table1.Part, Table1.Serial, Table1.AIRCRAFT_NUMBER, Table1.date_removed,
Table2.dbo.E15.TIME, Table2.dbo.E15.TSO, data.dbo.EE18.Allowable_Time,
CASE WHEN (data.dbo.EE18.Allowable_Time > 0)
THEN data.dbo.EE18.Allowable_Time - Table2.dbo.E15.TSO END AS CAL
FROM Table1 INNER JOIN
Table2.dbo.E15 ON Table1.SEQ_ID = Table2.dbo.E15.SEQ_ID AND
Table1.Part = Table2.dbo.E15.Part AND
Table1.Serial = Table2.dbo.E15.Serial AND
Table1.DATE_REMOVED_DESCENDING = Table2.dbo.E15.DATE_REMOVED_DESCENDING INNER JOIN
data.dbo.EE18 ON Table2.dbo.E15.Part = data.dbo.EE18.PART_NUMBER AND
Table2.dbo.E15.TIME = data.dbo.EE18.TIME
WHERE (Table1.Part LIKE '18%') AND (Table2.dbo.E15.TIME = 'I') AND
(data.dbo.EE18.Allowable_Time > 0) AND (Table2.dbo.E15.TSO <= 2) OR
(Table1.Part LIKE '18%') AND (Table2.dbo.E15.TIME = 'T') AND
(data.dbo.EE18.Allowable_Time > 0) AND (Table2.dbo.E15.TSO <= 20) OR
(Table1.Part LIKE '18%') AND (Table2.dbo.E15.TIME = 'L') AND
(data.dbo.EE18.Allowable_Time > 0) AND (Table2.dbo.E15.TSO <= 8)
ORDER BY Table1.date_removed DESC
What would the PySpark equivalent of the query above look like? Any help is much appreciated :)
Answer 0 (score: 0)
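This is not really an answer to your question, but it shows that your query reads more clearly with some formatting. I also rearranged the predicates to avoid redundancy and used parentheses to make the AND/OR grouping explicit.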
SELECT Table1.Part
, Table1.Serial
, Table1.AIRCRAFT_NUMBER
, Table1.date_removed
, Table2.dbo.E15.TIME
, Table2.dbo.E15.TSO
, data.dbo.EE18.Allowable_Time
, CASE WHEN (data.dbo.EE18.Allowable_Time > 0) THEN data.dbo.EE18.Allowable_Time - Table2.dbo.E15.TSO END AS CAL
FROM Table1
INNER JOIN Table2.dbo.E15 ON Table1.SEQ_ID = Table2.dbo.E15.SEQ_ID
AND Table1.Part = Table2.dbo.E15.Part
AND Table1.Serial = Table2.dbo.E15.Serial
AND Table1.DATE_REMOVED_DESCENDING = Table2.dbo.E15.DATE_REMOVED_DESCENDING
INNER JOIN data.dbo.EE18 ON Table2.dbo.E15.Part = data.dbo.EE18.PART_NUMBER
AND Table2.dbo.E15.TIME = data.dbo.EE18.TIME
WHERE Table1.Part LIKE '18%'
AND data.dbo.EE18.Allowable_Time > 0
AND
(
  (
  Table2.dbo.E15.TIME = 'I'
  AND
  Table2.dbo.E15.TSO <= 2
  )
  OR
  (
  Table2.dbo.E15.TIME = 'T'
  AND
  Table2.dbo.E15.TSO <= 20
  )
  OR
  (
  Table2.dbo.E15.TIME = 'L'
  AND
  Table2.dbo.E15.TSO <= 8
  )
)
ORDER BY Table1.date_removed DESC