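I have a query that fetches duplicate rows subject to some extra conditions, but I feel it is not fast enough. Is there any way to make this query faster?
v_listing contains the important information.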
SELECT DISTINCT code, name, comm, address, area
FROM v_listing t1
WHERE EXISTS (SELECT NULL
FROM v_listing t2
WHERE t1.comm = t2.comm
AND t1.address = t2.address
AND t1.area = t2.area
AND (t1.code > t2.code OR t1.code < t2.code))
ORDER BY comm, address, area
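Answer 0 (score: 3)
The EXISTS clause performs a semi-join, which is not the best way to compare two very large tables. Here it is a single table compared against itself, but the point still applies. What you want is an inner join: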
SELECT DISTINCT
t1.code,
t1.name,
t1.comm,
t1.address,
t1.area
FROM
v_listing t1
inner join v_listing t2 on
t1.comm = t2.comm
AND t1.address = t2.address
AND t1.area = t2.area
AND t1.code <> t2.code
ORDER BY t1.comm, t1.address, t1.area
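Also make sure that all of the join columns (comm, address, area, code) are indexed. That should improve speed considerably as well.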
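As a rough sketch of the indexing advice: the name v_listing suggests a view, so the index would have to go on whatever base table sits behind it. The table name `listing` and the index name below are hypothetical, and whether a single composite index or separate indexes work better depends on the database and the data.

```sql
-- Hypothetical base table "listing" behind the v_listing view;
-- substitute the real table name.
-- The three equality columns come first; code is last so the
-- t1.code <> t2.code check can be answered from the index as well.
CREATE INDEX idx_listing_dup
    ON listing (comm, address, area, code);
```

It is worth comparing the execution plan before and after to confirm that the index is actually used.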
Answer 1 (score: 0)
This one change alone should help a lot:
SELECT DISTINCT code, name, comm, address, area
FROM v_listing t1
WHERE EXISTS ( SELECT NULL
FROM v_listing t2
WHERE t1.comm = t2.comm
AND t1.address = t2.address
AND t1.area = t2.area
AND t1.code <> t2.code)
ORDER BY comm, address, area
Alternatively, you could do this:
SELECT comm, address, area, MIN(code), MAX(code), MIN(name), COUNT(*)
FROM v_listing t1
GROUP BY comm, address, area
HAVING COUNT(*) > 1
ORDER BY comm, address, area
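The GROUP BY query returns one summary row per group rather than the individual rows. If every row is needed, as in the original query, a window-function variant might be worth trying. This is only a sketch: it assumes the database supports window functions, and that code, comm, address and area are never NULL (NULLs are treated differently by PARTITION BY than by the equality joins above).

```sql
-- Scan v_listing once and keep the rows whose (comm, address, area)
-- group contains at least two different codes.
SELECT DISTINCT code, name, comm, address, area
FROM (
    SELECT code, name, comm, address, area,
           MIN(code) OVER (PARTITION BY comm, address, area) AS min_code,
           MAX(code) OVER (PARTITION BY comm, address, area) AS max_code
    FROM v_listing
) t
WHERE min_code <> max_code
ORDER BY comm, address, area;
```

Since this avoids the self-join, it may be cheaper on large inputs, but that should be checked against the actual execution plan.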