I have a large table that I need to update. It is defined by the following example (but mine is truly huge, with 1M+ rows and more columns)...
CREATE TABLE T
([Errors] varchar(4), [MRN] int, [EPI] varchar(13), [WD] varchar(4));
INSERT INTO T
([Errors], [MRN], [EPI], [WD])
VALUES
(NULL, 107, 'IP00001070001', 'AMUM'),
(NULL, 107, 'IP00001070001', 'AMUM'),
(NULL, 107, 'IP00001070001', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 381, 'IP00003810001', 'EAUS'),
(NULL, 381, 'IP00003810001', 'EAUS'),
(NULL, 381, 'IP00003810003', 'DOCK'),
(NULL, 381, 'IP00003810003', 'DOCK'),
(NULL, 45, 'IP00000450001', 'ASES'),
('__', 381, 'IP00003810003', NULL),
('__', 45, 'IP00000450002', NULL),
('__', 381, 'IP00003810002', NULL);
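I need to update the WD column of the records that have a NULL WD value so that it matches the WD value of the first entry (when the rows are ordered by [MRN] and [EPI]). For example, the desired output would be: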
Errors MRN EPI WD
NULL 107 IP00001070001 AMUM
NULL 107 IP00001070001 AMUM
NULL 107 IP00001070001 KNAP
NULL 107 IP00001070002 KNAP
NULL 107 IP00001070002 KNAP
NULL 107 IP00001070002 KNAP
NULL 107 IP00001070002 KNAP
NULL 381 IP00003810001 EAUS
NULL 381 IP00003810001 EAUS
NULL 381 IP00003810003 EAUS
NULL 381 IP00003810003 EAUS
NULL 45 IP00000450001 ASES
__ 381 IP00003810003 EAUS
__ 45 IP00000450002 ASES
__ 381 IP00003810002 EAUS
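The records at the bottom are the ones that were edited. This is what I want. However, this method is SLLLLOOOOWWW... very slow, and for good reason: I am looping through the entire set. I have already indexed the target table: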
Here is the complete test script, for anyone willing to help:
IF EXISTS (
SELECT name
FROM sys.tables
WHERE name = N'T')
DROP TABLE [T]
GO
CREATE TABLE T
([Errors] varchar(4), [MRN] int, [EPI] varchar(13), [WD] varchar(4));
INSERT INTO T
([Errors], [MRN], [EPI], [WD])
VALUES
(NULL, 107, 'IP00001070001', 'AMUM'),
(NULL, 107, 'IP00001070001', 'AMUM'),
(NULL, 107, 'IP00001070001', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 107, 'IP00001070002', 'KNAP'),
(NULL, 381, 'IP00003810001', 'EAUS'),
(NULL, 381, 'IP00003810001', 'EAUS'),
(NULL, 381, 'IP00003810003', 'DOCK'),
(NULL, 381, 'IP00003810003', 'DOCK'),
(NULL, 45, 'IP00000450001', 'ASES'),
('__', 381, 'IP00003810003', NULL),
('__', 45, 'IP00000450002', NULL),
('__', 381, 'IP00003810002', NULL);
IF EXISTS (SELECT *
FROM sys.indexes
WHERE name='idxEETEST' AND object_id = OBJECT_ID('T'))
DROP INDEX [idxEETEST] ON [T];
GO
CREATE NONCLUSTERED INDEX [idxEETEST]
ON [T] ([MRN], [EPI])
GO
DECLARE @sql NVARCHAR(MAX)
DECLARE @mrn INT
DECLARE @epi NVARCHAR(16)
DECLARE @get_rec CURSOR
SET @get_rec = CURSOR FOR
SELECT MRN, EPI
FROM T
WHERE Errors IS NOT NULL
OPEN @get_rec
FETCH NEXT
FROM @get_rec INTO @mrn, @epi
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql =
'DECLARE @wd VARCHAR(4); ' +
'SELECT TOP 1 @wd = WD ' +
'FROM T ' +
'WHERE MRN = ' + Convert(VARCHAR, @mrn) + ';' +
'UPDATE T ' +
'SET WD = @wd ' +
'WHERE MRN = ' + Convert(VARCHAR, @mrn) + ' AND EPI = ''' + @epi + ''''
EXEC(@sql);
FETCH NEXT
FROM @get_rec INTO @mrn, @epi
END
CLOSE @get_rec
DEALLOCATE @get_rec
GO
IF EXISTS (SELECT *
FROM sys.indexes
WHERE name='idxEETEST' AND object_id = OBJECT_ID('T'))
DROP INDEX [idxEETEST] ON [T];
GO
Thank you for your time.
Answer 0: (score: 0)
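Your query is still not clear to me. However, if you want to keep the same query, you could consider the following improvements.
1) Why convert the data to varchar? The conversion is done in two places, and removing it should improve performance. (I don't think the conversion is needed.)
2) Indexes: since you filter on 'MRN' and on 'MRN' plus 'EPI', it would help to have an index for each filter:
i) CREATE INDEX index_name ON table_name(MRN)
ii) CREATE INDEX index_name ON table_name(MRN,EPI)
This may also improve your performance.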
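One way to act on point 1 is to drop the dynamic SQL entirely, since the CONVERT calls exist only to build the statement string. The following is a minimal sketch under stated assumptions, not the asker's code: the cursor name get_rec and the ORDER BY EPI (taken from the question's "ordered by [MRN] and [EPI]" wording) are additions made here for illustration.

```sql
-- Sketch: the question's loop with static SQL instead of EXEC(@sql).
-- @mrn and @epi are used directly, so no CONVERT or concatenation is needed.
DECLARE @mrn INT, @epi VARCHAR(13), @wd VARCHAR(4);

-- STATIC snapshots the key list, so updates to T do not affect the iteration.
DECLARE get_rec CURSOR LOCAL STATIC FOR
    SELECT MRN, EPI
    FROM T
    WHERE Errors IS NOT NULL;

OPEN get_rec;
FETCH NEXT FROM get_rec INTO @mrn, @epi;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @wd = NULL;  -- reset so a value from the previous iteration is never reused

    -- Assumption: "first entry" means the row with the lowest EPI for this MRN.
    SELECT TOP (1) @wd = WD
    FROM T
    WHERE MRN = @mrn
    ORDER BY EPI;

    UPDATE T
    SET WD = @wd
    WHERE MRN = @mrn AND EPI = @epi;

    FETCH NEXT FROM get_rec INTO @mrn, @epi;
END

CLOSE get_rec;
DEALLOCATE get_rec;
```

This still visits one flagged row at a time, so it is likely to remain slow on a table with 1M+ rows; it only removes the per-row string building and statement compilation.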
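Answer 1: (score: 0)
I think I understand what you are trying to do. It would help a lot if you posted the output you expect, so that I could confirm. But given the rules and data posted, this should work.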
update t1 set WD = u.NewWD
from T t1
cross apply
(
    select top 1 WD as NewWD
    from T t2
    where t2.MRN = t1.MRN
    order by t2.EPI
) u
where t1.Errors is not null
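Note that the cursor in the question updates every row that shares the MRN and EPI of a flagged row, whereas the statement above touches only the rows whose Errors value is not NULL. If the cursor's behavior is the one wanted, a variant along the following lines might be closer. This is a sketch, assuming that "first entry" means the lowest EPI per MRN; the alias f is introduced here for illustration.

```sql
-- Sketch: set-based update that mirrors the cursor's WHERE MRN = ... AND EPI = ...
UPDATE t1
SET WD = u.NewWD
FROM T AS t1
CROSS APPLY
(
    -- Assumption: the "first" WD is the one on the lowest EPI for the same MRN.
    SELECT TOP (1) t2.WD AS NewWD
    FROM T AS t2
    WHERE t2.MRN = t1.MRN
    ORDER BY t2.EPI
) AS u
WHERE EXISTS
(
    -- Update every row whose (MRN, EPI) pair has at least one flagged row.
    SELECT 1
    FROM T AS f
    WHERE f.MRN = t1.MRN
      AND f.EPI = t1.EPI
      AND f.Errors IS NOT NULL
);
```

On the 15-row test script from the question, this should also change the two 'DOCK' rows for EPI IP00003810003 to 'EAUS', as the desired output shows. When several rows tie on the lowest EPI with different WD values, TOP (1) picks one of them arbitrarily, so a tie-breaking column would be needed for a deterministic result.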