There are two tables: empl, with 545,405 rows, and pam, with 1,466,320 rows. The task is to get the count of pID for each aID. To do that, I wrote the following query.
Select pa.aID, count(pa.pID) from
empl join pam pa
ON empl.pID = pa.pID
Group by pa.aID
The indexes on pam are as follows:
IX_pam_Unique: nonclustered, unique, unique key located on PRIMARY; keys: pID, aID
IX_pam_aID: nonclustered located on PRIMARY; keys: aID
PK_paID: clustered, unique, primary key located on PRIMARY; keys: paID
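The actual execution plan shows an Index Scan.
As far as I can tell, the estimated data size of 15 MB is what causes the problem.
Is there any way to tune this count query for a large volume of data?
Edit
The query with the filters on empl: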
Select pa.aID, count(pa.pID) from
empl join pam pa
ON empl.pID = pa.pID
where
empl.del = 0 AND
empl.pub = 1 AND
empl.sID = 2 AND
empl.md = 0
Group by pa.aID
There is nothing fancy in the structure: only the basic data types int, bit, varchar and datetime are used. empl has 65 columns and pam has 4 columns.
Answer 0 (score: 1)
This may help you -
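-- The derived table has to expose pID for the join to empl,
-- so pre-aggregate per (pID, aID) and sum the counts per aID outside.
SELECT pa.aID, cnt = SUM(pa.cnt)
FROM empl
JOIN (
SELECT
pa.pID
, pa.aID
, cnt = COUNT(pa.pID)
FROM pam pa
GROUP BY pa.pID, pa.aID
) pa ON empl.pID = pa.pID
GROUP BY pa.aID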
Or this -
SELECT pa.aID, COUNT(pa.pid)
FROM pam pa
WHERE EXISTS(
SELECT 1
FROM empl
WHERE empl.pid = pa.pid
)
GROUP BY pa.aID
Or even this -
SELECT
pa.aID
, cnt = COUNT(pa.pid)
FROM pam pa
GROUP BY pa.aID
Answer 1 (score: 1)
Keep the query as it is and add an index on empl that contains only the columns del, pub, sID, md and pID. Make sure pID is the last column in the index.
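The index described above could look like the sketch below. The index name IX_empl_filter_pID is hypothetical, and the dbo schema is assumed from the other answer; the four filter columns come first so the WHERE clause can seek on them, and pID comes last so the join column is available without touching the 65-column base table.

```sql
-- Hypothetical index name; dbo schema assumed.
-- Equality-filter columns lead the key, the join column pID is last.
CREATE NONCLUSTERED INDEX IX_empl_filter_pID
    ON dbo.empl (del, pub, sID, md, pID);
```

Whether the optimizer actually picks this index depends on the data distribution, so it is worth checking the actual execution plan after creating it.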
Edit: an alternative query to try might be
SELECT DISTINCT pa.aID, COUNT(pa.pID) OVER (PARTITION BY pa.aID) AS cnt
FROM empl JOIN pam pa
ON empl.pID = pa.pID
WHERE
empl.del = 0 AND
empl.pub = 1 AND
empl.sID = 2 AND
empl.md = 0
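Note that this one does not need a GROUP BY. I am not sure whether it will be faster or slower. Its query plan will differ from the plan of the GROUP BY query.
Edit:
You are right. I have added DISTINCT.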
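One way to check whether the window-function version is faster or slower than the GROUP BY version is to run both with I/O and timing statistics enabled and compare the logical reads and CPU/elapsed times reported in the messages output. This is only a measurement sketch using the two queries from this thread, not a claim about which one wins:

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: GROUP BY
SELECT pa.aID, COUNT(pa.pID) AS cnt
FROM empl
JOIN pam pa ON empl.pID = pa.pID
WHERE empl.del = 0 AND empl.pub = 1 AND empl.sID = 2 AND empl.md = 0
GROUP BY pa.aID;

-- Variant 2: windowed COUNT with DISTINCT
SELECT DISTINCT pa.aID, COUNT(pa.pID) OVER (PARTITION BY pa.aID) AS cnt
FROM empl
JOIN pam pa ON empl.pID = pa.pID
WHERE empl.del = 0 AND empl.pub = 1 AND empl.sID = 2 AND empl.md = 0;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```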
Answer 2 (score: 0)
For static conditions, you can try an indexed view:
create view vPamCnt
with schemabinding
as
Select pa.aID, count_big(*) cnt
from dbo.pam pa
join dbo.empl empl ON empl.pID = pa.pID
where
empl.del = 0 AND
empl.pub = 1 AND
empl.sID = 2 AND
empl.md = 0
Group by pa.aID
GO
create unique clustered index CI_vPamCnt on vPamCnt (aID)
GO
And change your query to:
select aID, cast(cnt as int)
from vPamCnt
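Depending on the SQL Server edition, the optimizer may expand the view back into its base tables instead of reading the view's clustered index. The NOEXPAND table hint asks it to use the indexed view directly. A minimal variant of the query above, assuming the view was created in the dbo schema:

```sql
-- NOEXPAND tells the optimizer to read the indexed view itself
-- rather than expanding it into the underlying empl/pam join.
SELECT aID, CAST(cnt AS int) AS cnt
FROM dbo.vPamCnt WITH (NOEXPAND);
```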