现在在集合中首先重复值

时间:2015-06-27 03:33:15

标签: sql sql-server sql-server-2008 tsql

我从XML文件中检索了一组数据,现在我想将它们解析为不同的记录,以插入到我已经由不同列定义的表中。

在这个例子中,我发现源XML文件中有11个不同的列。我已经找到了列的相应排序顺序,以便它们始终以相同的顺序插入到表中。

不同的表格列示例

 1  ad1_pk
 2  ad1_address1
 3  ad1_address2
 4  ad1_address3
 5  ad1_ctyfk
 6  ad1_postalcode
 7  ad1_active
 8  ad1_irstat
 9  ad1_irdata
 10 ad1_at1fk
 11 ad1_mk

使用以下代码,我已经准备好了数据来定义记录":

SELECT TOP 100 t.drn,
z.ID,
null RecNum,
z.Name,
z.Value
--INTO zTempXMLDataConverted2
FROM dbo.zTempXMLDataConverted z
JOIN #tempSortOrder t ON t.node = z.Name
ORDER BY z.id,t.drn

...但是,我只有一个问题:我可以直观地看到"看到"不同的记录,但我不知道如何以编程方式确定每条记录(也就是将" RecNum"列设置为适当的记录值。

所以,这最终是我想要看到的:

drn ------- ID ------ RecNum ------ Name -------------- Value
  1 ------   1 ------      1 ------ ad1_pk          --- 1
  2 ------   2 ------      1 ------ ad1_address1    --- P.O. Box 5036
  3 ------   3 ------      1 ------ ad1_address2    --- NULL
  4 ------   4 ------      1 ------ ad1_address3    --- NULL
  5 ------   5 ------      1 ------ ad1_ctyfk       --- 56
  6 ------   6 ------      1 ------ ad1_postalcode  --- 80155-5036
  7 ------   7 ------      1 ------ ad1_active      --- Y
  8 ------   8 ------      1 ------ ad1_irstat      --- A
  9 ------   9 ------      1 ------ ad1_irdata      --- NULL
 10 ------  10 ------      1 ------ ad1_at1fk       --- 1
  1 ------  11 ------      2 ------ ad1_pk          --- 2
  2 ------  12 ------      2 ------ ad1_address1    --- 1871 S. Broadway
  3 ------  13 ------      2 ------ ad1_address2    --- NULL
  4 ------  14 ------      2 ------ ad1_address3    --- NULL
  5 ------  15 ------      2 ------ ad1_ctyfk       --- 1
  6 ------  16 ------      2 ------ ad1_postalcode  --- 80210
  7 ------  17 ------      2 ------ ad1_active      --- Y
  8 ------  18 ------      2 ------ ad1_irstat      --- A
  9 ------  19 ------      2 ------ ad1_irdata      --- NULL
 10 ------  20 ------      2 ------ ad1_at1fk       --- 1
  1 ------  21 ------      3 ------ ad1_pk          --- 3
  2 ------  22 ------      3 ------ ad1_address1    --- 1575 W. 124th Avenue
  3 ------  23 ------      3 ------ ad1_address2    --- NULL
  4 ------  24 ------      3 ------ ad1_address3    --- NULL
  5 ------  25 ------      3 ------ ad1_ctyfk       --- 1
  6 ------  26 ------      3 ------ ad1_postalcode  --- 80234
  7 ------  27 ------      3 ------ ad1_active      --- Y
  8 ------  28 ------      3 ------ ad1_irstat      --- A
  9 ------  29 ------      3 ------ ad1_irdata      --- NULL
 10 ------  30 ------      3 ------ ad1_at1fk       --- 1
  1 ------  31 ------      4 ------ ad1_pk          --- 4
  2 ------  32 ------      4 ------ ad1_address1    --- 6175 S. Forest Court
  3 ------  33 ------      4 ------ ad1_address2    --- NULL
  4 ------  34 ------      4 ------ ad1_address3    --- NULL
  5 ------  35 ------      4 ------ ad1_ctyfk       --- 101
  6 ------  36 ------      4 ------ ad1_postalcode  --- 80121
  7 ------  37 ------      4 ------ ad1_active      --- Y
  8 ------  38 ------      4 ------ ad1_irstat      --- A
  9 ------  39 ------      4 ------ ad1_irdata      --- NULL
 10 ------  40 ------      4 ------ ad1_at1fk       --- 1
  1 ------  41 ------      5 ------ ad1_pk          --- 5
  2 ------  42 ------      5 ------ ad1_address1    --- 2000 E. County Line Rd., C-127
  3 ------  43 ------      5 ------ ad1_address2    --- NULL
  4 ------  44 ------      5 ------ ad1_address3    --- NULL
  5 ------  45 ------      5 ------ ad1_ctyfk       --- 87
  6 ------  46 ------      5 ------ ad1_postalcode  --- 80126
  7 ------  47 ------      5 ------ ad1_active      --- Y
  8 ------  48 ------      5 ------ ad1_irstat      --- A
  9 ------  49 ------      5 ------ ad1_irdata      --- NULL
 10 ------  50 ------      5 ------ ad1_at1fk       --- 1
  1 ------  51 ------      6 ------ ad1_pk          --- 6
  2 ------  52 ------      6 ------ ad1_address1    --- 6911 S. University Blvd
  3 ------  53 ------      6 ------ ad1_address2    --- NULL
  4 ------  54 ------      6 ------ ad1_address3    --- NULL
  5 ------  55 ------      6 ------ ad1_ctyfk       --- 101
  6 ------  56 ------      6 ------ ad1_postalcode  --- 80122
  7 ------  57 ------      6 ------ ad1_active      --- Y
  8 ------  58 ------      6 ------ ad1_irstat      --- A
  9 ------  59 ------      6 ------ ad1_irdata      --- NULL
 10 ------  60 ------      6 ------ ad1_at1fk       --- 1
  1 ------  61 ------      7 ------ ad1_pk          --- 7
  2 ------  62 ------      7 ------ ad1_address1    --- 215 Union Blvd., Suite 350
  3 ------  63 ------      7 ------ ad1_address2    --- NULL
  4 ------  64 ------      7 ------ ad1_address3    --- NULL
  5 ------  65 ------      7 ------ ad1_ctyfk       --- 96
  6 ------  66 ------      7 ------ ad1_postalcode  --- 80228
  7 ------  67 ------      7 ------ ad1_active      --- Y
  8 ------  68 ------      7 ------ ad1_irstat      --- A
  9 ------  69 ------      7 ------ ad1_irdata      --- NULL
 10 ------  70 ------      7 ------ ad1_at1fk       --- 1
  1 ------  71 ------      8 ------ ad1_pk          --- 8
  2 ------  72 ------      8 ------ ad1_address1    --- 9700 E. Powers Avenue
  3 ------  73 ------      8 ------ ad1_address2    --- NULL
  4 ------  74 ------      8 ------ ad1_address3    --- NULL
  5 ------  75 ------      8 ------ ad1_ctyfk       --- 56
  6 ------  76 ------      8 ------ ad1_postalcode  --- 80111
  7 ------  77 ------      8 ------ ad1_active      --- Y
  8 ------  78 ------      8 ------ ad1_irstat      --- A
  9 ------  79 ------      8 ------ ad1_irdata      --- NULL
 10 ------  80 ------      8 ------ ad1_at1fk       --- 1
  1 ------  81 ------      9 ------ ad1_pk          --- 9
  2 ------  82 ------      9 ------ ad1_address1    --- 650 S. Cherry Street, #420
  3 ------  83 ------      9 ------ ad1_address2    --- NULL
  4 ------  84 ------      9 ------ ad1_address3    --- NULL
  5 ------  85 ------      9 ------ ad1_ctyfk       --- 1
  6 ------  86 ------      9 ------ ad1_postalcode  --- 80222
  7 ------  87 ------      9 ------ ad1_active      --- Y
  8 ------  88 ------      9 ------ ad1_irstat      --- A
  9 ------  89 ------      9 ------ ad1_irdata      --- NULL
 10 ------  90 ------      9 ------ ad1_at1fk       --- 1
  1 ------  91 ------     10 ------ ad1_pk          --- 10
  2 ------  92 ------     10 ------ ad1_address1    --- 1001 17th St.
  3 ------  93 ------     10 ------ ad1_address2    --- Suite 1800
  4 ------  94 ------     10 ------ ad1_address3    --- NULL
  5 ------  95 ------     10 ------ ad1_ctyfk       --- 1
  6 ------  96 ------     10 ------ ad1_postalcode  --- 80202
  7 ------  97 ------     10 ------ ad1_active      --- Y
  8 ------  98 ------     10 ------ ad1_irstat      --- A
  9 ------  99 ------     10 ------ ad1_irdata      --- NULL
 10 ------ 100 ------     10 ------ ad1_at1fk       --- 1

而这最终是我想要的:

ad1_pk --- ad1_address1 ------------------------ ad1_address2 ---- ad1_address3 ---- ad1_ctyfk ---- ad1_postalcode ---- ad1_active ---- ad1_irstat ---- ad1_irdata ---- ad1_at1fk
     1 --- P.O. Box 5036                     --- NULL         ---- NULL         ---- 56        ---- 80155-5036     ---- Y          ---- A          ---- NULL       ---- 1         
     2 --- 1871 S. Broadway                  --- NULL         ---- NULL         ---- 1         ---- 80210          ---- Y          ---- A          ---- NULL       ---- 1         
     3 --- 1575 W. 124th Avenue              --- NULL         ---- NULL         ---- 1         ---- 80234          ---- Y          ---- A          ---- NULL       ---- 1         
     4 --- 6175 S. Forest Court              --- NULL         ---- NULL         ---- 101       ---- 80121          ---- Y          ---- A          ---- NULL       ---- 1         
     5 --- 2000 E. County Line Rd., C-127    --- NULL         ---- NULL         ---- 87        ---- 80126          ---- Y          ---- A          ---- NULL       ---- 1         
     6 --- 6911 S. University Blvd           --- NULL         ---- NULL         ---- 101       ---- 80122          ---- Y          ---- A          ---- NULL       ---- 1         
     7 --- 215 Union Blvd., Suite 350        --- NULL         ---- NULL         ---- 96        ---- 80228          ---- Y          ---- A          ---- NULL       ---- 1         
     8 --- 9700 E. Powers Avenue             --- NULL         ---- NULL         ---- 56        ---- 80111          ---- Y          ---- A          ---- NULL       ---- 1         
     9 --- 650 S. Cherry Street, #420        --- NULL         ---- NULL         ---- 1         ---- 80222          ---- Y          ---- A          ---- NULL       ---- 1         
    10 --- 1001 17th St.                     --- Suite 1800   ---- NULL         ---- 1         ---- 80202          ---- Y          ---- A          ---- NULL       ---- 1         

这可能吗?

=============================================== =========================

**编辑:透视表的可能解决方案**

DECLARE @thisPK int
DECLARE @thisDRN INT
DECLARE @thisDRN2 INT
DECLARE @thisDRN3 int
DECLARE @thisIDStart INT
DECLARE @thisIDEnd INT
DECLARE @thisNumIDsAffected int

DECLARE @thisID int
DECLARE @thisColumn VARCHAR(255)
DECLARE @thisValue VARCHAR(255)
--DECLARE @thisRecNum VARCHAR(MAX) = 1 -- the record indicator can be a char(1) or an int

DECLARE @distinctColumnsCount INT = (SELECT COUNT(*) [Cnt] FROM (SELECT DISTINCT Name FROM dbo.zTempXMLDataConverted2) Step1 )
DECLARE @currentImportTable VARCHAR(MAX)='imaddad1'
DECLARE @SQL VARCHAR(MAX)
DECLARE @SQLcolumnNames VARCHAR(MAX)
DECLARE @SQLcolumnValues VARCHAR(MAX)

DECLARE recordGroupCursor CURSOR FOR (
    SELECT DENSE_RANK() OVER (ORDER BY NumIDsAffected) drn,NumIDsAffected FROM (SELECT DISTINCT NumIDsAffected FROM #tempUpdateRanges) t1
    WHERE NumIDsAffected=9
)
OPEN recordGroupCursor
FETCH NEXT FROM recordGroupCursor INTO @thisDRN,@thisNumIDsAffected
WHILE @@FETCH_STATUS=0
BEGIN
    SET @SQL='insert into '+@currentImportTable+' '
    SET @SQLcolumnNames='('
    SET @SQLcolumnValues=''

    DECLARE recordPullCursor CURSOR FOR (
        SELECT DENSE_RANK() OVER (ORDER BY PK) drn,PK,IDStart,IDEnd FROM #tempUpdateRanges
        WHERE NumIDsAffected=@thisNumIDsAffected
    )
    OPEN recordPullCursor
    FETCH NEXT FROM recordPullCursor INTO @thisDRN2,@thisPK,@thisIDStart,@thisIDEnd--,@thisNumIDsAffected
    WHILE @@FETCH_STATUS=0 --AND @thisDRN2<10
    BEGIN
        --if there happens to be any anomalies in the data, where there are more columns present in a record
        --then the distinct column count, then only get the number of columns that can fit into the table
        IF @thisNumIDsAffected>@distinctColumnsCount BEGIN
            SET @thisIDEnd = @thisIDStart+@distinctColumnsCount
        END
        SET @SQLcolumnValues=@SQLcolumnValues
        DECLARE recordDetailsCursor CURSOR FOR (
            SELECT DENSE_RANK() OVER (ORDER BY ID) drn,ID,Name,Value FROM dbo.zTempXMLDataConverted2 WHERE ID between @thisIDStart AND @thisIDEnd
        )
        OPEN recordDetailsCursor
        FETCH NEXT FROM recordDetailsCursor INTO @thisDRN3,@thisID,@thisColumn,@thisValue
        WHILE @@FETCH_STATUS=0
        BEGIN
            IF @thisDRN2=1 BEGIN
                SET @SQLcolumnNames += @thisColumn+','+CHAR(10)
            END
            SET @SQLcolumnValues += ''''+ISNULL(@thisValue,'')+''','
            FETCH NEXT FROM recordDetailsCursor INTO @thisDRN3,@thisID,@thisColumn,@thisValue
        END
        CLOSE recordDetailsCursor
        DEALLOCATE recordDetailsCursor
        SET @SQLcolumnValues = LEFT(@SQLcolumnValues,LEN(@SQLcolumnValues)-1)+' UNION ALL '+CHAR(10)+'SELECT '
        --PRINT @SQLcolumnValues
        FETCH NEXT FROM recordPullCursor INTO @thisDRN2,@thisPK,@thisIDStart,@thisIDEnd--,@thisNumIDsAffected
    END
    CLOSE recordPullCursor
    DEALLOCATE recordPullCursor

    SET @SQL += LEFT(@SQLcolumnNames,LEN(@SQLcolumnNames)-2)+') '+CHAR(10)+'SELECT '+LEFT(@SQLcolumnValues,LEN(@SQLcolumnValues)-11)
    BEGIN TRY
        --SET @SQLerrors=''
        --PRINT @SQL
        EXEC (@SQL)
        --RAISERROR('  Record %10i Processed',0,1,@thisPK) WITH NOWAIT
    END TRY
    BEGIN CATCH
        --SET @SQLerrors = 'ERROR!: Creating table '+@currentImportTable+CHAR(10)+ERROR_MESSAGE()
        PRINT @SQL+CHAR(10)+ERROR_MESSAGE()
        --GOTO EARLY_END
    END CATCH
    FETCH NEXT FROM recordGroupCursor INTO @thisDRN,@thisNumIDsAffected
END
CLOSE recordGroupCursor
DEALLOCATE recordGroupCursor

SET NOCOUNT OFF;

1 个答案:

答案 0 :(得分:1)

为了让drn重新开始时更改RecNum,你可以使用这种逻辑:

select
  drn, id,
  dense_rank() over (order by GRP) as RecNum
from (
  select drn, id, id-drn as GRP from table1
) X
order by id

这将使用drn和id之间的差异来确定哪个&#34;组&#34;该行属于并使用密集排名来对行进行编号。您可以在SQL Fiddle中查看此内容。假设数据中没有跳跃/间隙或任何奇怪的内容,否则RecNum也会跳跃。

根据编辑,我会假设您的最终专栏未修复,因此您需要进行动态调整。在SO中有很多例子,例如here