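I'm trying to retrieve data from a very large Audits table (millions of rows), so I need the query to run as efficiently as possible. First, I use a subquery to return the ObjectTypeID, and then use that value to restrict the query on the Audits table.
This query takes 4 minutes to run: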
select distinct Audits.ObjectTypeID, COUNT(*) as Count
from Audits as Audits
where Audits.ObjectTypeID =
(select distinct ObjectType.ObjectTypeID from ObjectType where ObjectName = 'Data')
group by Audits.ObjectTypeID
If I hard-code the ObjectTypeID, the query runs in 42 seconds:
select distinct(Audits.ObjectTypeID), COUNT(*) as Count
from Audits
where Audits.ObjectTypeID = 1
group by Audits.ObjectTypeID
However, the subquery takes only one second when run in isolation. So why does the first query take so long?
Answer 0: (score: 1)
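I can see three things that might help:
1. Pull ObjectTypeID into a variable, since it should only have one value.
2. Remove the DISTINCTs, since they should be unnecessary (the subquery should return only one value, and you are grouping by that value in the outer query).
3. ObjectTypeID ...
So the final query would be: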
DECLARE @ObjectTypeID INT
SELECT @ObjectTypeID = (select ObjectType.ObjectTypeID
from ObjectType
where ObjectName = 'Data')
select Audits.ObjectTypeID, COUNT(*) as Count
from Audits as Audits
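where Audits.ObjectTypeID = @ObjectTypeID
-- GROUP BY is needed because ObjectTypeID is selected alongside COUNT(*)
group by Audits.ObjectTypeID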
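If you are executing this as a single statement rather than as a batch or stored procedure (meaning you can't use variables), you can keep the subquery: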
select Audits.ObjectTypeID, COUNT(*) as Count
from Audits as Audits
where Audits.ObjectTypeID =
(select ObjectType.ObjectTypeID
from ObjectType
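where ObjectName = 'Data')
-- GROUP BY is needed because ObjectTypeID is selected alongside COUNT(*)
group by Audits.ObjectTypeID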
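One way to check whether either rewrite actually helps is to time the variants side by side. The following is only a sketch: it assumes SQL Server (the DECLARE @... syntax above suggests T-SQL) and the table and column names from the question. The timings and read counts it reports will depend on your data and indexes.

```sql
-- Sketch only: assumes SQL Server and the Audits / ObjectType tables from the question.
SET STATISTICS TIME ON;  -- report compile and execution time for each statement
SET STATISTICS IO ON;    -- report logical and physical reads per table

-- Variant 1: look the ID up once, then filter on the variable
DECLARE @ObjectTypeID INT;

SELECT @ObjectTypeID = ObjectType.ObjectTypeID
FROM ObjectType
WHERE ObjectName = 'Data';

SELECT Audits.ObjectTypeID, COUNT(*) AS Count
FROM Audits
WHERE Audits.ObjectTypeID = @ObjectTypeID
GROUP BY Audits.ObjectTypeID;

-- Variant 2: keep the scalar subquery, without DISTINCT
SELECT Audits.ObjectTypeID, COUNT(*) AS Count
FROM Audits
WHERE Audits.ObjectTypeID =
    (SELECT ObjectType.ObjectTypeID
     FROM ObjectType
     WHERE ObjectName = 'Data')
GROUP BY Audits.ObjectTypeID;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The output appears in the Messages tab in SSMS. Comparing the logical reads on Audits between the two variants should show whether the optimizer treats them differently on your data.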
Answer 1: (score: 1)
The part where you're taking the biggest performance hit is probably this line:
where Audits.ObjectTypeID =
(select distinct ObjectType.ObjectTypeID from ObjectType where ObjectName = 'Data')
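In effect, you may be running that same subquery for every row of the Audits table, and each time it searches the entire ObjectType table and returns the whole result of the subquery. If your ObjectType table is large, that would be a huge performance hit. You can use EXISTS to speed up that part of the query, so that it returns as soon as a match is found. Here is an example: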
SELECT a.ObjectTypeID, COUNT(*) as Count
FROM Audits a
WHERE EXISTS
(
SELECT ot.ObjectTypeID
FROM ObjectType ot
WHERE ot.ObjectName = 'Data' AND ot.ObjectTypeID = a.ObjectTypeID
)
GROUP BY a.ObjectTypeID
Answer 2: (score: 0)
Can you try this?
SELECT DISTINCT Audits.ObjectTypeID, COUNT(*) as Count
FROM Audits as Audits
INNER JOIN
(SELECT DISTINCT ObjectTypeId, ObjectName FROM ObjectType
WHERE ObjectName = 'Data') as ObjectType ON Audits.ObjectTypeID = ObjectType.ObjectTypeID
GROUP BY Audits.ObjectTypeID