SELECT *
FROM A
WHERE A.key1 NOT IN (
SELECT B.key1
FROM B
)
AND A.key2 NOT IN (
SELECT B.key2
FROM B
)
This query performs very poorly in Spark, so I would like to replace it with a different query. Any ideas? (For example, a left anti join.)
Answer 0 (score: 0)
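Try NOT EXISTS instead. It matches the NOT IN version only when the keys contain no NULLs: a NULL in B.key1 or B.key2 makes the corresponding NOT IN return no rows, whereas NOT EXISTS ignores it.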
SELECT *
FROM A
WHERE NOT EXISTS (
SELECT 1
FROM B
WHERE B.key1 = A.key1
)
AND NOT EXISTS (
SELECT 1
FROM B
WHERE B.key2 = A.key2
)
Answer 1 (score: 0)
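If you want to do this with a left anti join, run the code below. A left anti join keeps only the rows of the left table that have no match in the right table, and it returns only the left table's columns. The original query applies two independent NOT IN conditions, so it needs two anti joins, one per key; a single join on A.key1 = B.key1 AND A.key2 = B.key2 would drop a row only when one row of B matches both keys.
sqlContext.sql(
"""SELECT A.* FROM A
| LEFT ANTI JOIN B b1
| ON A.key1 = b1.key1
| LEFT ANTI JOIN B b2
| ON A.key2 = b2.key2
""".stripMargin)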
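With the DataFrame API, the two NOT IN conditions from the question can be sketched as two chained left anti joins, one per key. This is only a minimal sketch: the object name AntiJoinSketch, the helper excludeMatches, the local SparkSession settings and the sample rows are all made up for illustration, and it assumes the key columns contain no NULLs (NOT IN treats NULLs differently from an anti join).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object AntiJoinSketch {
  // Hypothetical helper: drops every row of `a` whose key1 appears in b.key1
  // or whose key2 appears in b.key2, using one left anti join per key.
  def excludeMatches(a: DataFrame, b: DataFrame): DataFrame =
    a.join(b, a("key1") === b("key1"), "left_anti")
      .join(b, a("key2") === b("key2"), "left_anti")

  def main(args: Array[String]): Unit = {
    // Assumed local session, for demonstration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("anti-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for tables A and B.
    val a = Seq((1, 10), (2, 20), (3, 30), (4, 40)).toDF("key1", "key2")
    val b = Seq((1, 99), (98, 20)).toDF("key1", "key2")

    // (1, 10) is dropped because key1 = 1 occurs in b;
    // (2, 20) is dropped because key2 = 20 occurs in b;
    // (3, 30) and (4, 40) remain.
    excludeMatches(a, b).show()

    spark.stop()
  }
}
```

Each join uses a plain equality condition, so Spark is free to plan it as a hash or sort-merge anti join. Checking the physical plan with explain() on your own data is still worthwhile before relying on this.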