Find sets with duplicated fields and merge them into the same table

Asked: 2014-04-11 08:45:20

Tags: c# sql sql-server sql-server-2008 tsql

I'm sure the answer is out there somewhere, but I've been searching for a month and still can't find how to do this.

I have a translation table with these columns: ID, Resource_Object, Resource_Name, Resource_Value, Culture_name, idMinCopy.

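When a page is changed, new objects appear automatically, and a function checks whether another object has the same translation so that it can be copied to the new object.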

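What I need is a SQL statement that lets me check the existing objects and copy the languages between objects that have the same Resource_Value when Culture_name is 'ES', 'EN' or 'Neutral'.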

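Neutral is an in-page ID, unique for each culture_name; its value is the same as either the 'EN' or the 'ES' value. It is the only Culture_Name guaranteed to exist, because it is generated automatically.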

I've added a SQL Fiddle with a test table: http://sqlfiddle.com/#!3/d1f52/1

Example:

ID  Resource_Object  Resource_Name  Resource_Value Culture_Name idMinCopy
--------------------------------------------------------------------------
1  |home.aspx       |label1.text   |Blue          |Neutral     |Null
2  |home.aspx       |label1.text   |Blue          |EN          |Null
3  |home.aspx       |label1.text   |Azul          |ES          |Null
4  |home.aspx       |label1.text   |Bleu          |FR          |Null
5  |page1.aspx      |lblColor.text |Azul          |Neutral     |Null
6  |page1.aspx      |lblColor.text |Blue          |EN          |Null
7  |page1.aspx      |lblColor.text |Azlu          |ES          |Null
8  |page1.aspx      |lblColor.text |Blau          |CAT         |Null
9  |page1.aspx      |lblTitle.text |Color         |Neutral     |Null
10 |page1.aspx      |lblTitle.text |Color         |ES          |Null
11 |page1.aspx      |lblTitle.text |Colour        |EN          |Null

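This is a small sample; there can be many duplicates and many different languages. The value that should prevail is always the one with the lower [id]. Since the only [culture_name] guaranteed to exist for any object is 'Neutral', what I do is compare the 'Neutral' [Resource_Value] against every other 'Neutral', 'ES' and 'EN' [Resource_Value]. I managed to get the 'CAT' value into the first object, but I'm not sure how to put 'FR' into the second object.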
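The comparison described above could be sketched as a single self-join. This is only an illustration, not a full solution: `@t` is a hypothetical table variable standing in for the real table, loaded with the sample rows. For each 'Neutral' row it lists the lowest id of a row in a different object whose 'Neutral', 'EN' or 'ES' value is identical.

```sql
-- Hypothetical stand-in for the real translation table
DECLARE @t TABLE (
    id              INT PRIMARY KEY,
    Resource_Object NVARCHAR(255),
    Resource_Name   NVARCHAR(128),
    Resource_Value  NVARCHAR(1000),
    Culture_Name    NVARCHAR(50)
);

INSERT INTO @t (id, Resource_Object, Resource_Name, Resource_Value, Culture_Name)
VALUES
    (1,  'home.aspx',  'label1.text',   'Blue',   'Neutral'),
    (2,  'home.aspx',  'label1.text',   'Blue',   'EN'),
    (3,  'home.aspx',  'label1.text',   'Azul',   'ES'),
    (4,  'home.aspx',  'label1.text',   'Bleu',   'FR'),
    (5,  'page1.aspx', 'lblColor.text', 'Azul',   'Neutral'),
    (6,  'page1.aspx', 'lblColor.text', 'Blue',   'EN'),
    (7,  'page1.aspx', 'lblColor.text', 'Azlu',   'ES'),
    (8,  'page1.aspx', 'lblColor.text', 'Blau',   'CAT'),
    (9,  'page1.aspx', 'lblTitle.text', 'Color',  'Neutral'),
    (10, 'page1.aspx', 'lblTitle.text', 'Color',  'ES'),
    (11, 'page1.aspx', 'lblTitle.text', 'Colour', 'EN');

-- For each 'Neutral' row, find the lowest-id row of ANOTHER object
-- whose 'Neutral', 'EN' or 'ES' value matches it.
SELECT n.id      AS neutral_id,
       n.Resource_Object,
       n.Resource_Name,
       MIN(m.id) AS lowest_matching_id
FROM @t AS n
JOIN @t AS m
    ON  m.Resource_Value = n.Resource_Value
    AND m.Culture_Name IN ('Neutral', 'EN', 'ES')
    AND (m.Resource_Object <> n.Resource_Object
         OR m.Resource_Name <> n.Resource_Name)
WHERE n.Culture_Name = 'Neutral'
GROUP BY n.id, n.Resource_Object, n.Resource_Name;
```

With the sample rows, 'Neutral' row 1 matches row 6 and 'Neutral' row 5 matches row 3. Row 9 has no match in another object, so it does not appear in the result.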

Desired result:

ID  Resource_Object  Resource_Name  Resource_Value Culture_Name idMinCopy
--------------------------------------------------------------------------
1  |home.aspx       |label1.text   |Blue          |Neutral     |1
2  |home.aspx       |label1.text   |Blue          |EN          |2
3  |home.aspx       |label1.text   |Azul          |ES          |3
4  |home.aspx       |label1.text   |Bleu          |FR          |4
5  |page1.aspx      |lblColor.text |Blue          |Neutral     |1
6  |page1.aspx      |lblColor.text |Blue          |EN          |2
7  |page1.aspx      |lblColor.text |Azul          |ES          |3
8  |page1.aspx      |lblColor.text |Blau          |CAT         |12
9  |page1.aspx      |lblTitle.text |Color         |Neutral     |Null
10 |page1.aspx      |lblTitle.text |Color         |ES          |Null
11 |page1.aspx      |lblTitle.text |Colour        |EN          |Null
12 |home.aspx       |label1.text   |Blau          |CAT         |12
13 |page1.aspx      |lblColor.text |Bleu          |FR          |4

In the [idMinCopy] field I want to store the ID of the lowest identical object. Later, if any 'Resource_value' is changed, I can easily update every row that shares the same 'idMinCopy'.
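As an illustration of that later update, here is a minimal sketch that assumes idMinCopy has already been filled in as in the desired result. The names `@t`, `@changedId` and `@newValue` are hypothetical and exist only for this example.

```sql
-- Hypothetical stand-in holding a few rows from the desired result
DECLARE @t TABLE (
    id              INT PRIMARY KEY,
    Resource_Object NVARCHAR(255),
    Resource_Name   NVARCHAR(128),
    Resource_Value  NVARCHAR(1000),
    Culture_Name    NVARCHAR(50),
    idMinCopy       INT NULL
);

INSERT INTO @t (id, Resource_Object, Resource_Name, Resource_Value, Culture_Name, idMinCopy)
VALUES
    (3,  'home.aspx',  'label1.text',   'Azul', 'ES', 3),
    (4,  'home.aspx',  'label1.text',   'Bleu', 'FR', 4),
    (7,  'page1.aspx', 'lblColor.text', 'Azul', 'ES', 3),
    (13, 'page1.aspx', 'lblColor.text', 'Bleu', 'FR', 4);

-- Assumed inputs: the row that was edited and its new text
DECLARE @changedId INT, @newValue NVARCHAR(1000);
SET @changedId = 7;
SET @newValue  = N'Azul marino';

-- Propagate the change to every row that shares the edited row's idMinCopy
UPDATE t
SET Resource_Value = @newValue
FROM @t AS t
WHERE t.idMinCopy = (SELECT idMinCopy FROM @t WHERE id = @changedId);

SELECT id, Resource_Value, idMinCopy FROM @t ORDER BY id;
```

Here rows 3 and 7 share idMinCopy 3, so both receive the new value, while the 'FR' rows are left alone.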

I think I need a recursive CTE and a MERGE statement to achieve this, but I can't get it to work and don't know what else to try.
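For the MERGE part, a sketch along the following lines might be a starting point. It assumes the two matching objects are already known, and it only copies in one direction, from the lower-id object to the higher-id one. The table variable `@t` and the four `@src...`/`@tgt...` variables are hypothetical names introduced for this example.

```sql
-- Hypothetical stand-in for the real translation table
DECLARE @t TABLE (
    id              INT IDENTITY(1,1) PRIMARY KEY,
    Resource_Object NVARCHAR(255),
    Resource_Name   NVARCHAR(128),
    Resource_Value  NVARCHAR(1000),
    Culture_Name    NVARCHAR(50),
    idMinCopy       INT NULL
);

INSERT INTO @t (Resource_Object, Resource_Name, Resource_Value, Culture_Name)
VALUES
    ('home.aspx',  'label1.text',   'Blue', 'Neutral'),
    ('home.aspx',  'label1.text',   'Blue', 'EN'),
    ('home.aspx',  'label1.text',   'Azul', 'ES'),
    ('home.aspx',  'label1.text',   'Bleu', 'FR'),
    ('page1.aspx', 'lblColor.text', 'Azul', 'Neutral'),
    ('page1.aspx', 'lblColor.text', 'Blue', 'EN'),
    ('page1.aspx', 'lblColor.text', 'Azlu', 'ES'),
    ('page1.aspx', 'lblColor.text', 'Blau', 'CAT');

-- Assumed inputs: the object to copy from (lower ids) and the object to copy to
DECLARE @srcObject NVARCHAR(255), @srcName NVARCHAR(128),
        @tgtObject NVARCHAR(255), @tgtName NVARCHAR(128);
SET @srcObject = 'home.aspx';   SET @srcName = 'label1.text';
SET @tgtObject = 'page1.aspx';  SET @tgtName = 'lblColor.text';

-- The source rows point at themselves
UPDATE @t
SET idMinCopy = id
WHERE Resource_Object = @srcObject AND Resource_Name = @srcName;

-- Overwrite cultures the target already has, insert the ones it lacks
MERGE @t AS tgt
USING (
    SELECT @tgtObject AS Resource_Object,
           @tgtName   AS Resource_Name,
           s.Resource_Value,
           s.Culture_Name,
           s.id       AS idMinCopy
    FROM @t AS s
    WHERE s.Resource_Object = @srcObject
      AND s.Resource_Name   = @srcName
) AS src
    ON  tgt.Resource_Object = src.Resource_Object
    AND tgt.Resource_Name   = src.Resource_Name
    AND tgt.Culture_Name    = src.Culture_Name
WHEN MATCHED THEN
    UPDATE SET tgt.Resource_Value = src.Resource_Value,
               tgt.idMinCopy      = src.idMinCopy
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Resource_Object, Resource_Name, Resource_Value, Culture_Name, idMinCopy)
    VALUES (src.Resource_Object, src.Resource_Name, src.Resource_Value,
            src.Culture_Name, src.idMinCopy);

SELECT * FROM @t ORDER BY id;
```

This brings 'FR' into the second object and lets the lower-id values prevail there. It does not copy 'CAT' back into the first object; that would need a second pass in the opposite direction that uses only the insert branch.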

Resources I've checked that may be relevant (shown as code because I don't have enough reputation):

http://stackoverflow.com/questions/21216534/sql-query-result-from-multiple-tables-without-duplicates

http://stackoverflow.com/questions/12910287/merging-tables-with-duplicate-data

http://stackoverflow.com/questions/14274942/sql-server-cte-and-recursion-example

http://www.codeproject.com/Articles/683011/How-to-use-recursive-CTE-calls-in-T-SQL

http://stackoverflow.com/questions/1222581/how-get-the-t-sql-code-to-find-duplicates

http://pratchev.blogspot.com.es/2008/03/upsert-and-more-with-merge.html

http://www.sergeyv.com/blog/archive/2010/09/10/sql-server-upsert-equivalent.aspx

The SQL I'm using:

--- Search for similar 'Resource_Values' 
--- returns identical object with lowest ID
DECLARE @textoprueba as nvarchar(1000)
set @textoprueba='Blue'
select top 1
    case when t2.id is not null then t1.id end as t2id
from tbtraducciones t1
left join tbtraducciones t2
    on t1.resource_value = t2.resource_value
    and (t1.culture_name = 'neutral' or t1.culture_name = 'en' or t1.culture_name = 'es')
    and t2.culture_name = 'neutral'
    and (t1.Resource_name != t2.Resource_NAME or t1.RESOURCE_OBJECT != t2.RESOURCE_OBJECT)
where case when t2.id is not null then t2.id end is not null
    and t2.resource_value = @textoprueba
order by t1.id

--- Search for the ID of the 'Neutral' value for that object
DECLARE @minID as int
set @minID=[Result from previous SQL Statement]
select id from tbTraducciones 
where resource_name = (select resource_name from tbtraducciones where id=@minId) 
and resource_object= (select resource_object from tbtraducciones where id=@minId)
and culture_name = 'neutral'


---UPSERT DUPLICATES
DECLARE @idNeutral as int, @newId as Int
set @newID=[ID of the object with same 'Resource_Value' with higher ID]
set @idNeutral= [ID of the lowest id 'neutral' that has the same 'Resource_Value' in 'Neutral', 'ES' or 'EN']
;WITH T AS 
(
select 
t1.Resource_object, t1.Resource_name as Resource_Name, t3.Resource_Value AS RESOURCE_VALUE,
t3.Culture_Name AS CULTURE_NAME, t3.Id AS idMinCopy
from tbTraducciones t1 
JOIN tbTraducciones t2 on t2.id=@idNeutral 
right join tbTraducciones t3 on t3.Resource_Object = t2.Resource_Object AND t3.Resource_Name = t2.Resource_Name 
where t1.id=@newID and 
t3.culture_NAME <> 'Neutral'
)
UPDATE tbTraducciones SET
[RESOURCE_VALUE]=T.Resource_Value ,
[idMinCopy] = T.idMinCopy
FROM T
WHERE T.Resource_Object = tbTraducciones.Resource_Object
AND
tbTraducciones.Resource_Name = T.Resource_Name
AND
tbTraducciones.Culture_Name = T.Culture_Name
IF @@ROWCOUNT=0
INSERT INTO [tbTraducciones] (Resource_Object, Resource_Name, Resource_Value, Culture_Name, idMinCopy)
select t1.Resource_object, t1.Resource_name, t3.Resource_Value, t3.Culture_Name, t3.Id 
from tbTraducciones t1 
JOIN tbTraducciones t2 on t2.id=@idneutral 
join tbTraducciones t3 on t3.Resource_Object = t2.Resource_Object AND t3.Resource_Name = t2.Resource_Name
where t1.id=@newID and t3.Culture_Name !='Neutral' 

I'm absolutely sure this has been solved somewhere before, but I haven't been able to find a solution, or to adapt the ones I did find.

Thanks for your help.

1 answer:

Answer 0 (score: 0)

This is ugly. My solution uses as many as three loops and may perform badly, but I was able to get the expected result.

I'm also not sure you'll be able to follow my solution, but I've tried to add comments wherever I could.

--simulation of actual table
DECLARE @tbTraducciones TABLE (
    [id] [int] IDENTITY(1,1) NOT NULL,
    [Resource_Object] [nvarchar](255) NULL,
    [Resource_Name] [nvarchar](128) NULL,
    [Resource_Value] [nvarchar](1000) NULL,
    [Culture_Name] [nvarchar](50) NULL,
    [idMinCopy] [int] NULL
    )

-- values for testing
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('home.aspx','label1.text','Blue','Neutral',NULL);

INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('home.aspx','label1.text','Blue','EN',NULL);
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('home.aspx','label1.text','Azul','ES',NULL);
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('home.aspx','label1.text','Bleu','FR',NULL);
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('page1.aspx','lblColor.text','Azul','Neutral',NULL);
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('page1.aspx','lblColor.text','Blue','EN',NULL);
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('page1.aspx','lblColor.text','Azul','ES',NULL);
INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name,idMinCopy) 
VALUES ('page1.aspx','lblColor.text','Blau','CAT',NULL);


--Solution starts from here
DECLARE @StartCount INT, @MaxCount INT
DECLARE @Resource_Value NVARCHAR(1000), @Culture_Name NVARCHAR(128), @previdMinCopy INT, -- @Resource_Value sized to match the column so long values are not truncated
@Resource_Object NVARCHAR(255), @Resource_Name NVARCHAR(128)

DECLARE @NewStartCount INT
DECLARE @NewResource_Object NVARCHAR(255),  @NewResource_Name NVARCHAR(128)

DECLARE @FirstResource_Object NVARCHAR(255),    @FirstResource_Name NVARCHAR(128)

SELECT @FirstResource_Object = [Resource_Object],   
        @FirstResource_Name = [Resource_Name]
    FROM @tbTraducciones
    WHERE ID = 1

SELECT @StartCount = 1, @MaxCount = MAX(ID) FROM @tbTraducciones

--This loop will insert all the records which are present in one [Resource_Object] -- [Resource_Name] combination 
--but not present in the other
--excluding Culture_Name as 'neutral'
WHILE(@StartCount <= @MaxCount)
BEGIN
    SELECT @Resource_Object = [Resource_Object],    
                @Resource_Name = [Resource_Name],
                @Resource_Value = [Resource_Value],
                @Culture_Name = [Culture_Name]
    FROM @tbTraducciones
    WHERE ID = @StartCount

    IF(@Culture_Name != 'neutral')
    BEGIN
        SET @NewStartCount = 1

        WHILE (@NewStartCount <= @MaxCount)
        BEGIN
            IF(@NewStartCount != @StartCount)
            BEGIN
                IF NOT EXISTS(SELECT 0 FROM @tbTraducciones
                                WHERE [Resource_Object] = @Resource_Object
                                AND [Resource_Name] = @Resource_Name
                                AND ID = @NewStartCount)
                BEGIN
                    SELECT @NewResource_Object = [Resource_Object], 
                        @NewResource_Name = [Resource_Name]
                    FROM @tbTraducciones
                    WHERE ID = @NewStartCount



                    IF NOT EXISTS(SELECT 0 FROM @tbTraducciones
                                WHERE [Resource_Value] = @Resource_Value
                                AND [Culture_Name] = @Culture_Name
                                AND [Resource_Object] = @NewResource_Object
                                AND [Resource_Name] = @NewResource_Name)
                    BEGIN
                        INSERT INTO @tbTraducciones (Resource_Object,Resource_Name,Resource_Value,Culture_Name)
                        VALUES (@NewResource_Object, @NewResource_Name, @Resource_Value, @Culture_Name)

                        UPDATE @tbTraducciones
                        SET [idMinCopy] = ID
                        WHERE ID IN (SELECT MAX(ID) FROM @tbTraducciones)
                        AND [Resource_Object] = @FirstResource_Object
                        AND [Resource_Name] = @FirstResource_Name
                    END
                END
            END

            SET @NewStartCount = @NewStartCount + 1
        END
    END

    SET @StartCount = @StartCount + 1
END

SELECT @StartCount = 1, @MaxCount = MAX(ID) FROM @tbTraducciones

--This loop will start updating "idMinCopy" for each row based on the logic
WHILE(@StartCount <= @MaxCount)
BEGIN
    SELECT @Resource_Value = [Resource_Value],
        @Culture_Name = [Culture_Name]
    FROM @tbTraducciones
    WHERE ID = @StartCount

    IF(@Culture_Name = 'neutral')
    BEGIN
        IF NOT EXISTS(SELECT 0 FROM @tbTraducciones 
                        WHERE [Culture_Name] = @Culture_Name
                        AND ID < @StartCount
                        AND idMinCopy IS NOT NULL)
        BEGIN
            UPDATE @tbTraducciones
            SET idMinCopy = @StartCount
            WHERE ID = @StartCount
        END
        ELSE
        BEGIN
            SELECT @previdMinCopy = MIN(idMinCopy)
            FROM @tbTraducciones
            WHERE [Culture_Name] = @Culture_Name
            AND ID < @StartCount
            AND idMinCopy IS NOT NULL

            UPDATE @tbTraducciones
            SET idMinCopy = @previdMinCopy
            WHERE ID = @StartCount
        END
    END
    ELSE
    BEGIN
        IF NOT EXISTS(SELECT 0 FROM @tbTraducciones 
                        WHERE [Resource_Value] = @Resource_Value
                        AND [Culture_Name] = @Culture_Name
                        AND idMinCopy IS NOT NULL)
        BEGIN
            UPDATE @tbTraducciones
            SET idMinCopy = @StartCount
            WHERE ID = @StartCount

        END
        ELSE
        BEGIN
            SELECT @previdMinCopy = MIN(idMinCopy)
            FROM @tbTraducciones
            WHERE [Resource_Value] = @Resource_Value
            AND [Culture_Name] = @Culture_Name
            AND idMinCopy IS NOT NULL

            UPDATE @tbTraducciones
            SET idMinCopy = @previdMinCopy
            WHERE ID = @StartCount

        END
    END

    SET @StartCount = @StartCount + 1

END


SELECT * FROM @tbTraducciones

Hope this helps.