My SQL loop takes a long time to complete. Is there a better way to do what I'm trying to achieve?

Time: 2016-06-14 05:55:32

Tags: sql sql-server tsql while-loop

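I am trying to deduplicate a set of data based on certain columns. It is not just a SELECT DISTINCT.

I want to select all rows from my set where a column is unique. I have already sorted my set a certain way, and I just want the loop to grab the first occurrence of the "surrogate" key column. I say surrogate because it is not the actual primary key of the table.

I use a while loop with a counter variable based on the number of rows in a temp table. I delete each row from the temp table after processing it, so this should reduce the records left to process in the base table, along with any duplicate rows.

Although my code works, it seems a bit "cowboy", and I would like your input on how to do it "cleaner". Thanks.

Here is my code: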

declare @cnt int

set @cnt = (select COUNT(*) from #temp)

while @cnt > 0
begin       
    select top 1 * into #temp2 from #temp

    insert into #temp3 (Member_ID, email, meeting_status,member_type,firstname,lastname,address1, Match_Method, Match_Score)
        select #temp2.* 
        from #temp2
        left outer join #temp3 on #temp2.Member_ID = #temp3.Member_ID
        where #temp3.Member_ID is null

    delete #temp 
    from #temp
    inner join #temp2 on #temp.Member_ID = #temp2.Member_ID

    drop table #temp2

    set @cnt = (select COUNT(*) from #temp)
end

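3 answers:

Answer 0: (score: 1)

If I understand your requirement correctly, you want to insert rows from #temp into #temp3, but make sure that only one row per Member_ID is inserted. In that case, you can use ROW_NUMBER and filter on ROW_NUMBER = 1 so that only one row from each set of duplicate Member_ID values is inserted. Then add a NOT EXISTS filter to avoid inserting rows that already exist: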

INSERT INTO #temp3 (Member_ID, email, meeting_status, member_type, firstname, lastname, address1, Match_Method, Match_Score)
    SELECT
        Member_ID,
        email,
        meeting_status,
        member_type,
        firstname,
        lastname,
        address1,
        Match_Method,
        Match_Score
    FROM (
        SELECT *,
            Rn = ROW_NUMBER() OVER (PARTITION BY Member_ID ORDER BY (SELECT NULL))
        FROM #temp
    ) t
    WHERE
        t.Rn = 1
        AND NOT EXISTS (
            SELECT 1
            FROM #temp3 t3
            WHERE t3.Member_ID = t.Member_ID
        )
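
Note that ORDER BY (SELECT NULL) lets SQL Server pick an arbitrary row for each Member_ID. If the "first occurrence" has to follow the sort order mentioned in the question, the window's ORDER BY can name the sort columns instead. The following is only a minimal sketch: the #demo table, its column types, the sample data, and the choice of "highest Match_Score wins" are all assumptions made for illustration, since the question does not say how the set was sorted.

```sql
-- Hypothetical sample data; table name and column types are assumptions.
CREATE TABLE #demo (Member_ID int, email varchar(100), Match_Score int);

INSERT INTO #demo (Member_ID, email, Match_Score)
VALUES (1, 'a@example.com', 90),
       (1, 'a.alt@example.com', 70),
       (2, 'b@example.com', 80);

-- Keep one row per Member_ID: the one with the highest Match_Score.
-- email is only a tie-breaker so that the choice is deterministic.
SELECT Member_ID, email, Match_Score
FROM (
    SELECT *,
        Rn = ROW_NUMBER() OVER (PARTITION BY Member_ID
                                ORDER BY Match_Score DESC, email)
    FROM #demo
) d
WHERE d.Rn = 1;

DROP TABLE #demo;
```

With this sample data the query keeps the a@example.com row for Member_ID 1 and the single row for Member_ID 2.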

Answer 1: (score: 1)

Option 1:

Use the following code:

WITH uniqueRecords AS(
    SELECT  *,ROW_NUMBER()OVER(PARTITION BY T.Member_ID ORDER BY (SELECT 1)) AS RowNum
    FROM #Temp AS T
)
INSERT INTO #temp3(Member_ID, email, meeting_status,member_type,firstname,lastname,address1, Match_Method, Match_Score)
SELECT U.Member_ID, U.email, U.meeting_status,U.member_type,U.firstname,U.lastname,U.address1, U.Match_Method, U.Match_Score
FROM uniqueRecords AS U
LEFT OUTER JOIN #temp3 T3 on U.Member_ID = T3.Member_ID
WHERE U.RowNum=1
AND T3.Member_ID is null;

Option 2:

i) Create a UNIQUE INDEX on the Member_ID column of #temp3 WITH IGNORE_DUP_KEY = ON

CREATE UNIQUE INDEX UX_temp3 ON #temp3 (Member_ID) WITH (IGNORE_DUP_KEY=ON); 

ii) Insert the result of the left join of #temp and #temp3. The IGNORE_DUP_KEY option will discard the duplicates.

INSERT INTO #temp3(Member_ID, email, meeting_status,member_type,firstname,lastname,address1, Match_Method, Match_Score)
SELECT T.Member_ID, T.email, T.meeting_status,T.member_type,T.firstname,T.lastname,T.address1, T.Match_Method, T.Match_Score
FROM #temp AS T
LEFT OUTER JOIN #temp3 T3 on T.Member_ID = T3.Member_ID
WHERE T3.Member_ID is null;
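
To see how IGNORE_DUP_KEY behaves in isolation, here is a small sketch using a hypothetical #target table with made-up data (names and column types are assumptions). The duplicate key is dropped with a warning rather than failing the whole INSERT. This option does not let you choose which of the duplicate rows is kept, so it only fits when any one row per Member_ID is acceptable.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE #target (Member_ID int, email varchar(100));

CREATE UNIQUE INDEX UX_target ON #target (Member_ID) WITH (IGNORE_DUP_KEY = ON);

-- Two of the three rows share Member_ID = 1.
INSERT INTO #target (Member_ID, email)
VALUES (1, 'a@example.com'),
       (1, 'a.alt@example.com'),
       (2, 'b@example.com');
-- SQL Server raises the warning "Duplicate key was ignored." instead of an error.

-- Only one row per Member_ID was stored, so this returns 2.
SELECT COUNT(*) AS row_count FROM #target;

DROP TABLE #target;
```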

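Answer 2: (score: 0)

  1. You don't have to recount the table on every pass through the loop. You can simply decrement @cnt.
  2. You shouldn't recreate #temp2 on every iteration. You can reuse it.
  3. You don't need #temp2 at all. You can use select top 1 * from #temp instead.
  4. You don't need a loop at all; just use the following script: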

    insert into #temp3(Member_ID, email, meeting_status, member_type, firstname, lastname, address1, Match_Method, Match_Score)
        select * 
        from #temp
        where Member_ID not in (select Member_ID from #temp3)
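
One caveat with the NOT IN form: if the subquery can return a NULL Member_ID, NOT IN matches nothing and no rows are inserted. A NOT EXISTS filter does not have this problem. The sketch below uses hypothetical #src and #dst tables with made-up data to show the difference; whether Member_ID can actually be NULL in #temp3 is not stated in the question.

```sql
-- Hypothetical tables and data, for illustration only.
CREATE TABLE #src (Member_ID int NULL);
CREATE TABLE #dst (Member_ID int NULL);

INSERT INTO #src (Member_ID) VALUES (1), (2);
INSERT INTO #dst (Member_ID) VALUES (1), (NULL);

-- NOT IN: returns no rows, because comparing 2 with NULL yields UNKNOWN.
SELECT Member_ID
FROM #src
WHERE Member_ID NOT IN (SELECT Member_ID FROM #dst);

-- NOT EXISTS: returns Member_ID 2, the row missing from #dst.
SELECT s.Member_ID
FROM #src s
WHERE NOT EXISTS (SELECT 1 FROM #dst d WHERE d.Member_ID = s.Member_ID);

DROP TABLE #src;
DROP TABLE #dst;
```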