Suppose I have this query, which returns 50,000 rows:
SELECT photoID FROM photoSearch WHERE photoID BETWEEN 1 AND 50000;
I then want to run the following query against the photoIDs that were just returned.
SELECT COUNT(people) AS totalPeople, people
FROM people
INNER JOIN photopeople ON photoPeople.peopleID = people.PeopleID
WHERE photoid IN ('ID's from results')
GROUP BY people
ORDER BY totalPeople DESC
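But I have learned from other people and resources that the IN clause performs poorly, especially since I could have 100,000+ photoIDs.
Is it a good idea to store the photoIDs from the top query in another table (resultsTbl) or in a very long string? If so, should I use a join or a sub-select to query those IDs (in the bottom query), rather than using IN? Or... is there another way to do this that keeps performance in mind?
Any help would be gratefully received.
Answer 0 (score: 13)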
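Is it a good idea to store the photoIDs from the top query in another table (resultsTbl) or in a very long string?
In another table: in general, no. If there are a lot of IDs and you also run the top query elsewhere, storing its results in a cache table might be fine. In this case, though, the "top query" will most likely stay in memory, so you should use a sub-select.
In a very long string: no. String operations are generally very CPU-intensive.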
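For the case where a cache table does make sense, here is a minimal sketch of what it might look like. It uses Python's sqlite3 module with an in-memory database purely for illustration; the table layouts and the sample rows (Ann, Bob) are assumptions and not taken from the question, and only the names photoSearch, people, photoPeople and resultsTbl come from it.

```python
import sqlite3

# In-memory database with a tiny, made-up data set (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photoSearch (photoID INTEGER PRIMARY KEY);
    CREATE TABLE people (PeopleID INTEGER PRIMARY KEY, people TEXT);
    CREATE TABLE photoPeople (photoID INTEGER, peopleID INTEGER);
    INSERT INTO photoSearch VALUES (1), (2), (3), (60000);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO photoPeople VALUES (1, 1), (2, 1), (3, 2), (60000, 2);
""")

# Materialise the top query once into a temporary cache table.
conn.execute("""
    CREATE TEMP TABLE resultsTbl AS
    SELECT photoID FROM photoSearch WHERE photoID BETWEEN 1 AND 50000
""")

# Join against the cached IDs instead of passing a long IN list.
cached_rows = conn.execute("""
    SELECT COUNT(people) AS totalPeople, people
    FROM people
    INNER JOIN photoPeople ON photoPeople.peopleID = people.PeopleID
    INNER JOIN resultsTbl ON resultsTbl.photoID = photoPeople.photoID
    GROUP BY people
    ORDER BY totalPeople DESC
""").fetchall()
print(cached_rows)
```

This only pays off if the cached IDs are reused by several queries; for a single query the sub-select avoids the extra write.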
If so, should I use a join or a sub-select to query those IDs (in the bottom query), rather than using IN?
With a sub-select, IN (select * from foo):
SELECT count(people) AS totalPeople
, people
FROM people
INNER JOIN photopeople ON photoPeople.peopleID = people.PeopleID
WHERE photoid IN (SELECT photoID
                  FROM photoSearch
                  WHERE photoID BETWEEN 1 AND 50000)
GROUP BY people
ORDER BY totalPeople DESC
With a join:
SELECT count(people) AS totalPeople
, people
FROM people
INNER JOIN photopeople ON photoPeople.peopleID = people.PeopleID
INNER JOIN photoSearch ON photopeople.photoid = photoSearch.photoID
WHERE photoSearch.photoID BETWEEN 1 AND 50000
GROUP BY people
ORDER BY totalPeople DESC
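To check that the sub-select and the join return the same thing, and to see how a particular engine plans each one, something like the following sketch can help. It uses Python's sqlite3 module with an assumed schema and made-up rows (Ann, Bob), so the plans it prints apply to SQLite only and say nothing about other databases. Note that the two forms are only guaranteed to match when photoSearch.photoID is unique, as it is assumed to be here; with duplicates, the join would count rows more than once.

```python
import sqlite3

# In-memory database with a tiny, made-up data set (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photoSearch (photoID INTEGER PRIMARY KEY);
    CREATE TABLE people (PeopleID INTEGER PRIMARY KEY, people TEXT);
    CREATE TABLE photoPeople (photoID INTEGER, peopleID INTEGER);
    INSERT INTO photoSearch VALUES (1), (2), (3), (60000);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO photoPeople VALUES (1, 1), (2, 1), (3, 2), (60000, 2);
""")

# Variant 1: filter with IN (sub-select).
in_query = """
    SELECT COUNT(people) AS totalPeople, people
    FROM people
    INNER JOIN photoPeople ON photoPeople.peopleID = people.PeopleID
    WHERE photoPeople.photoID IN (SELECT photoID
                                  FROM photoSearch
                                  WHERE photoID BETWEEN 1 AND 50000)
    GROUP BY people
    ORDER BY totalPeople DESC
"""

# Variant 2: filter with an extra INNER JOIN.
join_query = """
    SELECT COUNT(people) AS totalPeople, people
    FROM people
    INNER JOIN photoPeople ON photoPeople.peopleID = people.PeopleID
    INNER JOIN photoSearch ON photoPeople.photoID = photoSearch.photoID
    WHERE photoSearch.photoID BETWEEN 1 AND 50000
    GROUP BY people
    ORDER BY totalPeople DESC
"""

in_rows = conn.execute(in_query).fetchall()
join_rows = conn.execute(join_query).fetchall()
print(in_rows == join_rows)

# Print SQLite's plan for each variant so they can be compared by eye.
for label, query in (("IN", in_query), ("JOIN", join_query)):
    print(label)
    for step in conn.execute("EXPLAIN QUERY PLAN " + query):
        print("  ", step)
```

On a real database, run the equivalent plan command (for example EXPLAIN) against realistic data volumes before choosing between the two forms.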