每列的每个不同值

时间:2016-06-08 10:59:23

标签: sql sql-server

我想创建一个动态选择,它返回宽表中每列的每个不同值。即

select distinct @mycolumn 
from @mytable

表示每个列,结果合并为一个表。

Edit1

实施例: Example

Edit2 :返回数据的顺序无关紧要,源表可以包含各种数据类型。

任何建议表示赞赏,谢谢!

2 个答案:

答案 0 :(得分:2)

我能想到的唯一方法是非常麻烦,可能非常慢: 使用Tally表(为了这个答案,我使用递归cte生成了一个表,但这也不是一个非常好的方法...)和多个派生表连接到该计数表我能够想出一些可以产生所需输出的东西 然而,正如我在顶部写的那样 - 它非常繁琐且可能非常慢(我只在一个有5列6行的表上测试过,所以我不知道执行速度)。

DECLARE @Count int
select  @Count = COUNT(1) 
FROM YourTable

;with tally as (
    select 1 as n
    union all 
    select n + 1
    from tally 
    where n < @Count 
)

SELECT Column1, Column2, Column3, Column4, Column5
FROM tally 
LEFT JOIN 
(
    SELECT Column1, ROW_NUMBER() OVER (ORDER BY Column1) rn
    FROM 
    (
        SELECT DISTINCT Column1
        FROM YourTable 
    ) t1
) d1 ON(n = d1.rn)
LEFT JOIN 
(
    SELECT Column2, ROW_NUMBER() OVER (ORDER BY Column2) rn
    FROM 
    (
        SELECT DISTINCT Column2
        FROM YourTable 
    ) t1
) d2 ON(n = d2.rn)
LEFT JOIN 
(
    SELECT Column3, ROW_NUMBER() OVER (ORDER BY Column3) rn
    FROM 
    (
        SELECT DISTINCT Column3
        FROM YourTable 
    ) t1
) d3 ON(n = d3.rn)
LEFT JOIN 
(
    SELECT Column4, ROW_NUMBER() OVER (ORDER BY Column4) rn
    FROM 
    (
        SELECT DISTINCT Column4
        FROM YourTable 
    ) t1
) d4 ON(n = d4.rn)
LEFT JOIN 
(
    SELECT Column5, ROW_NUMBER() OVER (ORDER BY Column5) rn
    FROM 
    (
        SELECT DISTINCT Column5
        FROM YourTable 
    ) t1
) d5 ON(n = d5.rn)

动态版本:

DECLARE @TableName sysname = 'YourTableName'

DECLARE @Sql nvarchar(max) = 
'
DECLARE @Count int
select  @Count = COUNT(1) 
FROM '+  @TableName +'

;with tally as (
    select 1 as n
    union all 
    select n + 1
    from tally 
    where n < @Count 
)

SELECT '

SELECT @Sql = @Sql + Column_Name +','
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME = @TableName

SELECT @Sql = LEFT(@Sql, LEN(@Sql) - 1) + ' FROM tally t'

SELECT @Sql = @Sql + ' LEFT JOIN (SELECT '+ Column_Name +', ROW_NUMBER() OVER (ORDER BY ' + Column_Name +') rn
  FROM 
  (
    SELECT DISTINCT '+ Column_Name +' FROM '+ @TableName +') t
  ) c_'+ Column_Name + ' ON(n =  c_'+ Column_Name + '.rn)'
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME = @TableName

EXEC(@Sql)

<强>更新

在一个包含22列和47,000行的表上测试过,在Sql server 2014上使用proper tally table.时,我的建议花了46秒。 我很惊讶 - 我认为至少需要2-3分钟。

答案 1 :(得分:1)

这是我正在制作的动态集。我已经没时间了,所以它没有被清理掉,并且它确定了动态行数作为整个表中的最大行数,这意味着如果你在任何列中都有任何重复,那么你将是留下行,其中每一列都为空。

但除此之外,这应该完全正常,并且脚本包含必要的信息,向您展示如何连接最终的“WHERE S1.COLNAME IS NOT NULL AND S2.COLNAME IS NOT NULL AND ..”过滤器到结果表,以消除那些完整的空行。

除此之外,这是脚本。显然,它会很重,所以我在其中包含一个(nolock)提示,并且“WHERE ColName不为null”以删除无用的结果。

在较小的桌子上试一试,看看它是否有效。

/*
Set your table and schema on @MYTABLE and @MYSCHEMA variables.
*/
SET NOCOUNT ON

DECLARE @MYTABLE SYSNAME = 'Mytablename here'
    , @MYSCHEMA sysname = 'dbo'

DECLARE @SQL NVARCHAR(MAX) = '', @COLNAME sysname = '', @MYCOLS NVARCHAR(max) = ''

DECLARE @COL_NOW INT = 1, @COL_MAX INT = 
    (SELECT COUNT(*) 
    FROM sys.columns
    WHERE object_id = (SELECT object_id FROM sys.tables where name = @MYTABLE and SCHEMA_NAME(schema_id) = @MYSCHEMA))

SELECT @COLNAME = name 
    FROM sys.columns 
    WHERE column_id = 1
    and object_id = (SELECT object_id FROM sys.tables where name = @MYTABLE and SCHEMA_NAME(schema_id) = @MYSCHEMA)

SET @SQL = 'FROM 
    (SELECT ROW_NUMBER() OVER (ORDER BY '+@COLNAME+' ASC) RN
    FROM '+@MYSCHEMA+'.'+@MYTABLE+' (nolock)) S'

WHILE @COL_NOW <= @COL_MAX
BEGIN

    SELECT @COLNAME = name 
    FROM sys.columns 
    WHERE column_id = @COL_NOW 
    and object_id = (SELECT object_id FROM sys.tables where name = @MYTABLE and SCHEMA_NAME(schema_id) = @MYSCHEMA)

    SELECT @SQL = @SQL+'
FULL JOIN 
    (SELECT DISTINCT DENSE_RANK() OVER (ORDER BY '+@COLNAME+' ASC) RN, '+@COLNAME+'
    FROM '+@MYSCHEMA+'.'+@MYTABLE+' (nolock) 
    WHERE '+@COLNAME+' IS NOT NULL) S'+CAST(@COL_NOW AS NVARCHAR(25))+' ON S'+CAST(@COL_NOW AS NVARCHAR(25))+'.RN = S.RN'

    IF @COL_NOW = 1
        SELECT @MYCOLS = @MYCOLS+' S'+CAST(@COL_NOW AS NVARCHAR(25))+'.'+@COLNAME           
    ELSE
        SELECT @MYCOLS = @MYCOLS+', S'+CAST(@COL_NOW AS NVARCHAR(25))+'.'+@COLNAME

    SET @COL_NOW = @COL_NOW+1

END

SELECT @SQL = 'SELECT'+@MYCOLS+'
'+@SQL+'
ORDER BY S1.RN ASC';

--PRINT(@SQL); -- To check resulting dynamic SQL without executing it (Warning, print will only show first 8k characters)

EXEC sp_executesql @SQL;
GO