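I have this query that takes forever to run. The table contains about 7 million rows. Everything else I do with it (it is a "temporary" permanent table) goes relatively quickly (an hour or so), but this one UPDATE takes 7 hours! We are on SQL Server 2014.
DOI is an NVARCHAR(72) with a non-unique CLUSTERED index on it. Affiliations is a VARCHAR(8000). I am really not allowed to change these data types. Affiliations is covered by an index as an INCLUDE column; the field is so large that we could not make a "regular" index on it.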
CREATE NONCLUSTERED INDEX IX_Affiliations
ON TempSourceTable (DOI) INCLUDE (Affiliations);
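The following statement sets a bit field to 1 if all records for a DOI have the same value in the Affiliations column. The table has multiple records per DOI value, and we want to know whether the Affiliations column is identical across all records with the same DOI.
Is there any way to speed this up, such as a different query or a different index, or am I missing something?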
UPDATE S
SET AffiliationsSameForAllDOI = 1
FROM TempSourceTable S
WHERE NOT EXISTS (SELECT 1
FROM TempSourceTable S2
WHERE S2.DOI = S.DOI
AND S2.Affiliations <> S.Affiliations)
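To make the intended result concrete, here is a minimal sketch that runs the same statement against a tiny temp table. The table #Demo and its rows are hypothetical and only illustrate the rule; the column types mirror the ones described above, and the nullability and default are assumptions.

```sql
-- Hypothetical sample data; #Demo and its rows are illustrative only.
CREATE TABLE #Demo
(
    DOI                       NVARCHAR(72)  NOT NULL,
    Affiliations              VARCHAR(8000) NULL,          -- nullability is an assumption
    AffiliationsSameForAllDOI BIT           NOT NULL DEFAULT 0
);

INSERT INTO #Demo (DOI, Affiliations)
VALUES (N'10.1000/a', 'Univ X'),
       (N'10.1000/a', 'Univ X'),   -- same value for every row of this DOI
       (N'10.1000/b', 'Univ X'),
       (N'10.1000/b', 'Univ Y'),   -- two different values for this DOI
       (N'10.1000/c', 'Univ Z');   -- a single row trivially agrees with itself

-- Same shape as the slow UPDATE, pointed at the demo table.
UPDATE S
SET AffiliationsSameForAllDOI = 1
FROM #Demo S
WHERE NOT EXISTS (SELECT 1
                  FROM #Demo S2
                  WHERE S2.DOI = S.DOI
                    AND S2.Affiliations <> S.Affiliations);

SELECT DOI, Affiliations, AffiliationsSameForAllDOI
FROM #Demo
ORDER BY DOI;
-- Rows for 10.1000/a and 10.1000/c are set to 1; rows for 10.1000/b stay 0.

DROP TABLE #Demo;
```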
Answer 0 (score: 6)
Here is another way.
SUB-QUERY version
UPDATE TempSourceTable
SET AffiliationsSameForAllDOI = 1
WHERE doi IN (SELECT doi
FROM TempSourceTable S
GROUP BY DOI
HAVING COUNT(DISTINCT Affiliations) = 1)
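EXISTS version
-- T-SQL does not accept an alias directly after UPDATE, so the alias is declared in FROM.
UPDATE S
SET AffiliationsSameForAllDOI = 1
FROM TempSourceTable S
WHERE EXISTS (SELECT 1
FROM TempSourceTable S1
WHERE S1.DOI = S.DOI
HAVING COUNT(DISTINCT S1.Affiliations) = 1)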
INNER JOIN version
UPDATE S
SET AffiliationsSameForAllDOI = 1
FROM TempSourceTable S
INNER JOIN (SELECT doi
FROM TempSourceTable
GROUP BY DOI
HAVING COUNT(DISTINCT Affiliations) = 1) S1
ON S.DOI = S1.DOI
Answer 1 (score: 4)
update TempSourceTable
set AffiliationsSameForAllDOI = 1
where DOI in (
select DOI
from TempSourceTable
group by DOI
having count(distinct Affiliations) = 1
)
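Depending on what your data looks like, you may be able to improve performance by creating a computed column that holds, say, the first 16 characters of Affiliations, or simply its checksum(), and then indexing on that column instead. Maybe it would look something like this: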
update TempSourceTable
set AffiliationsSameForAllDOI = 1
where DOI in (
select DOI
from TempSourceTable
where DOI in (
select DOI
from TempSourceTable
group by DOI
having count(distinct AffiliationsChecksum) = 1
)
group by DOI
having count(distinct Affiliations) = 1
)
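The query above assumes an AffiliationsChecksum column already exists. A minimal sketch of how it might be created follows; the index name is hypothetical, and this assumes adding a column to TempSourceTable is allowed even though the existing data types cannot change. Because different strings can share a checksum, the outer COUNT(DISTINCT Affiliations) check in the query is still needed.

```sql
-- Assumption: AffiliationsChecksum does not exist yet and may be added to the table.
-- CHECKSUM is deterministic, so the computed column can be persisted and indexed.
ALTER TABLE TempSourceTable
    ADD AffiliationsChecksum AS CHECKSUM(Affiliations) PERSISTED;

-- Hypothetical index name. Both columns are small, so they fit in the index key,
-- unlike the VARCHAR(8000) column itself.
CREATE NONCLUSTERED INDEX IX_DOI_AffiliationsChecksum
    ON TempSourceTable (DOI, AffiliationsChecksum);
```

With DOI as the leading key, the inner GROUP BY DOI can read the narrow index in order instead of touching the wide Affiliations values for every row.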
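Answer 2 (score: 0)
I would expect this to perform better than the others, because it should execute in a single scan over the index. Also, the MIN/MAX "trick" avoids having to collect and maintain every distinct value.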
WITH X AS
(
SELECT *,
AffiliationsSameForAllDOI_New =
CASE WHEN MAX(Affiliations) OVER (PARTITION BY DOI)
= MIN(Affiliations) OVER (PARTITION BY DOI)
THEN 1
ELSE 0
END
FROM TempSourceTable
)
UPDATE X
SET AffiliationsSameForAllDOI = AffiliationsSameForAllDOI_New
WHERE AffiliationsSameForAllDOI_New = 1
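The same MIN/MAX comparison can also be written in grouped form, in the style of the INNER JOIN version from the first answer. This is only a sketch and has not been benchmarked against the window-function version; the alias G is illustrative.

```sql
UPDATE S
SET AffiliationsSameForAllDOI = 1
FROM TempSourceTable S
INNER JOIN (SELECT DOI
            FROM TempSourceTable
            GROUP BY DOI
            -- A DOI has a single Affiliations value exactly when
            -- its smallest and largest values coincide.
            HAVING MIN(Affiliations) = MAX(Affiliations)) G
    ON G.DOI = S.DOI;
```

One behavioral difference to keep in mind: like the COUNT(DISTINCT ...) versions, this does not flag a DOI whose Affiliations are all NULL, whereas the original NOT EXISTS query would.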