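How to calculate data completeness across multiple tables based on null values in columns?

Time: 2015-08-21 20:19:16

Tags: sql sql-server database-design

The query below calculates what we need, but only for one specific column. How can we do this for all columns in the table without copying the case statement many times? This needs to run against hundreds of tables, so copying the case statement is not ideal.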

 Select SUM(cast(case when column is null then 0 else 1 end as float))/count(*) from [Table]

So the output would look like

Column name: data completeness

Customer name: 88%

5 answers:

Answer 0: (score: 0)

First, you can simplify the logic to:

Select AVG(case when column is null then 0.0 else 1.0 end)
from [Table]

Then you can generate the code. The following generates the select expressions, one per column. You can copy them into a query:

select replace('      avg(case when [@col] is null then 0.0 else 1.0 end) as [@col],',
               '@col', column_name)
from information_schema.columns
where table_name = @TableName and table_schema = @SchemaName

Note: quotename() is more correct, but the above works for reasonable column names (I never use column names that need to be quoted).
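As a rough sketch of the quotename() variant mentioned in the note, the generated expressions can also be assembled into a complete statement and executed directly instead of being copied by hand. The dbo.Customers target and the @cols and @sql variable names are placeholders for illustration, not part of the original answer. It relies on the same multi-row variable concatenation used elsewhere in this thread, which works in practice but has no guaranteed ordering.

```sql
-- Hypothetical target table; substitute your own schema and table names.
DECLARE @SchemaName sysname = N'dbo';
DECLARE @TableName  sysname = N'Customers';
DECLARE @cols nvarchar(max) = N'';
DECLARE @sql  nvarchar(max);

-- Build one AVG(...) expression per column, quoting each name with QUOTENAME.
SELECT @cols = @cols
    + N', AVG(CASE WHEN ' + QUOTENAME(column_name)
    + N' IS NULL THEN 0.0 ELSE 1.0 END) AS ' + QUOTENAME(column_name)
FROM information_schema.columns
WHERE table_name = @TableName AND table_schema = @SchemaName;

IF @cols = N''
    RAISERROR('No columns found for %s.%s.', 16, 1, @SchemaName, @TableName);
ELSE
BEGIN
    -- STUFF removes the leading ', ' in front of the first expression.
    SET @sql = N'SELECT ' + STUFF(@cols, 1, 2, N'')
             + N' FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N';';
    EXEC sys.sp_executesql @sql;
END
```

The result is a single row with one completeness fraction (0 to 1) per column. For an empty table every AVG returns NULL rather than raising a divide-by-zero error.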

Answer 1: (score: 0)

Jens Suessmeyer's solution, from "Finding the percentage of NULL values for each column in a table":

SET NOCOUNT ON
DECLARE @Statement NVARCHAR(MAX) = ''
DECLARE @Statement2 NVARCHAR(MAX) = ''
DECLARE @FinalStatement NVARCHAR(MAX) = ''

DECLARE @TABLE_SCHEMA SYSNAME = <SCHEMA_NAME>
DECLARE @TABLE_NAME SYSNAME = <TABLE_NAME>

SELECT
        @Statement = @Statement + 'SUM(CASE WHEN ' + COLUMN_NAME + ' IS NULL THEN 1 ELSE 0 END) AS ' + COLUMN_NAME + ',' + CHAR(13) ,
        @Statement2 = @Statement2 + COLUMN_NAME + '*100 / OverallCount AS ' + COLUMN_NAME + ',' + CHAR(13)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TABLE_NAME 
    AND TABLE_SCHEMA = @TABLE_SCHEMA

IF @@ROWCOUNT = 0
    RAISERROR('TABLE OR VIEW with schema "%s" and name "%s" does not exist or you do not have appropriate permissions.',16,1, @TABLE_SCHEMA, @TABLE_NAME)
ELSE
BEGIN
    SELECT @FinalStatement =
            'SELECT ' + LEFT(@Statement2, LEN(@Statement2) -2) + ' FROM (SELECT ' + LEFT(@Statement, LEN(@Statement) -2) +
            ', COUNT(*) AS OverallCount FROM ' + @TABLE_SCHEMA + '.' + @TABLE_NAME + ') SubQuery'
    EXEC(@FinalStatement)
END

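Answer 2: (score: 0)

Something like this should work. Basically, build a statement that selects the count of every column in every table, using sys.tables and sys.columns, and then execute that statement.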

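-- Assumes a stats table mystats (TableName, ColumnName, TotCount) already exists
DECLARE @sqlcmd NVARCHAR(MAX) = N'';

-- Append one INSERT ... SELECT per column of every user table
SELECT @sqlcmd = @sqlcmd
    + N'insert into mystats (TableName, ColumnName, TotCount) select '''
    + REPLACE(t.name, '''', '''''') + N''', ''' + REPLACE(c.name, '''', '''''')
    + N''', count(' + QUOTENAME(c.name) + N') from '
    + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name) + N';' + CHAR(13)
FROM sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id;

-- Parentheses are required to execute a string rather than a stored procedure
EXEC (@sqlcmd);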
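The counts collected this way are non-null counts only, so turning them into a completeness percentage also needs each table's total row count. A minimal sketch, assuming mystats is given an extra TotRows column and the generated statement is extended to select count(*) into it (both are assumptions, not part of the answer above):

```sql
-- Hypothetical variant of mystats that also stores the table's total row count.
CREATE TABLE mystats (
    TableName  sysname NOT NULL,
    ColumnName sysname NOT NULL,
    TotCount   int     NOT NULL,  -- COUNT(column): rows where the column is not NULL
    TotRows    int     NOT NULL   -- COUNT(*): all rows in the table
);

-- Once the generated INSERT statements have filled the table:
SELECT TableName,
       ColumnName,
       -- NULLIF avoids a divide-by-zero error for empty tables (result is NULL)
       CAST(100.0 * TotCount / NULLIF(TotRows, 0) AS decimal(5, 2)) AS CompletenessPct
FROM mystats
ORDER BY TableName, ColumnName;
```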

Answer 3: (score: 0)

You can use an UNPIVOT query to do this for you. In the following query I have assumed you have 3 columns, Column1, Column2 and Column3; the query can be extended to accommodate as many columns as needed.

Query

SELECT ColumnName 
      , SUM(cast(case when Vals = '' then 0.0 else 1.0 end as DECIMAL(10,2))) * 100
      / COUNT(*)  AS [Percentage]
FROM (
SELECT CAST(ISNULL(Column1, '') AS VARCHAR(100)) AS Column1
      ,CAST(ISNULL(Column2, '') AS VARCHAR(100)) AS Column2
      ,CAST(ISNULL(Column3, '') AS VARCHAR(100)) AS Column3
FROM TableName
  )c
  UNPIVOT (Vals FOR ColumnName IN (Column1,Column2,Column3))up
GROUP BY ColumnName

Result set

╔════════════╦════════════╗
║ ColumnName ║ Percentage ║
╠════════════╬════════════╣
║ Column1    ║ 100.000000 ║
║ Column2    ║ 100.000000 ║
║ Column3    ║ 34.065934  ║
╚════════════╩════════════╝

Important note

Make sure all columns used in the UNPIVOT IN clause are converted to a uniform data type.

Using ISNULL() is also very important, because UNPIVOT eliminates any null values.
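To illustrate why the NULL replacement matters, here is a small sketch on a made-up three-row table variable (the @t name and its sample values are assumptions for the demo). Without ISNULL, Column1 should contribute only one row to the unpivoted set; with it, both columns contribute three.

```sql
-- Hypothetical three-row sample; Column1 is NULL in two of the rows.
DECLARE @t TABLE (Column1 varchar(10), Column2 varchar(10));
INSERT INTO @t (Column1, Column2)
VALUES ('a', 'x'), (NULL, 'y'), (NULL, 'z');

-- Without ISNULL: UNPIVOT drops the NULL cells before COUNT(*) sees them.
SELECT ColumnName, COUNT(*) AS RowsSeen
FROM @t
UNPIVOT (Vals FOR ColumnName IN (Column1, Column2)) AS up
GROUP BY ColumnName;

-- With ISNULL: every cell survives, so COUNT(*) equals the table's row count.
SELECT ColumnName, COUNT(*) AS RowsSeen
FROM (SELECT ISNULL(Column1, '') AS Column1,
             ISNULL(Column2, '') AS Column2
      FROM @t) AS c
UNPIVOT (Vals FOR ColumnName IN (Column1, Column2)) AS up
GROUP BY ColumnName;
```

If the NULL cells were dropped, the denominator in the percentage would shrink along with the numerator and every column would report 100%.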

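Answer 4: (score: 0)

My answer combines the sample from lad2025's answer with the UNPIVOT from M.Ali's answer to give you a result set with one row per column, containing the column name and its percentage of null values. The rows are displayed in descending order of null percentage.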

SET NOCOUNT ON
DECLARE @Statement NVARCHAR(MAX) = ''
DECLARE @Statement2 NVARCHAR(MAX) = ''
DECLARE @Statement3 NVARCHAR(MAX) = ''
DECLARE @FinalStatement NVARCHAR(MAX) = ''

DECLARE @TABLE_SCHEMA SYSNAME = <SCHEMA_Name>
DECLARE @TABLE_NAME SYSNAME = <TABLE_Name>

SELECT
        @Statement = @Statement + 'SUM(CASE WHEN ' + COLUMN_NAME + 
            ' IS NULL THEN 1 ELSE 0 END) AS ' + COLUMN_NAME + ',' + CHAR(13) ,
        @Statement2 = @Statement2 + COLUMN_NAME + 
            '*100 / OverallCount AS ' + COLUMN_NAME + ',' + CHAR(13),
        @Statement3 = @Statement3 + COLUMN_NAME + ','
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TABLE_NAME 
    AND TABLE_SCHEMA = @TABLE_SCHEMA

IF @@ROWCOUNT = 0
    RAISERROR('TABLE OR VIEW with schema "%s" and name "%s" does not exist or you do not have appropriate permissions.',16,1, @TABLE_SCHEMA, @TABLE_NAME)
ELSE
BEGIN
    SELECT @FinalStatement =
            'SELECT u.ColumnName, u.NullPercentage FROM (SELECT ' + 
            LEFT(@Statement2, LEN(@Statement2) -2) + 
            ' FROM (SELECT ' + LEFT(@Statement, LEN(@Statement) -2) +
            ', COUNT(*) AS OverallCount FROM ' + @TABLE_SCHEMA + '.' + @TABLE_NAME + 
            ') SubQuery) PercentageQuery unpivot (NullPercentage for ColumnName in (' + 
            LEFT(@Statement3, LEN(@Statement3) - 1) + 
            ')) u ORDER BY NullPercentage DESC'
    EXEC(@FinalStatement)
END
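Since the question asks about hundreds of tables, the per-table approach can be wrapped in a loop. The following is only a sketch under some assumptions: the #Completeness temp table, the cursor and the variable names are invented for illustration, and it reports completeness (non-null percentage) rather than null percentage. It uses CROSS APPLY (VALUES ...) instead of UNPIVOT to turn the wide result into one row per column, so each table is scanned once and NULL results for empty tables are kept.

```sql
-- Collects one row per (table, column); the temp table name is an assumption.
CREATE TABLE #Completeness (
    SchemaName      sysname NOT NULL,
    TableName       sysname NOT NULL,
    ColumnName      sysname NOT NULL,
    CompletenessPct decimal(5, 2) NULL   -- NULL when the table has no rows
);

DECLARE @schema sysname, @table sysname;
DECLARE @aggs nvarchar(max), @vals nvarchar(max), @sql nvarchar(max);

DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT TABLE_SCHEMA, TABLE_NAME
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_TYPE = 'BASE TABLE';

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @schema, @table;

WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @aggs = N'', @vals = N'';

    -- @aggs: one AVG(...) per column; @vals: one ('name', s.[name]) pair per column.
    SELECT
        @aggs = @aggs + N', AVG(CASE WHEN ' + QUOTENAME(COLUMN_NAME)
              + N' IS NULL THEN 0.0 ELSE 100.0 END) AS ' + QUOTENAME(COLUMN_NAME),
        @vals = @vals + N', (N' + QUOTENAME(COLUMN_NAME, '''')
              + N', s.' + QUOTENAME(COLUMN_NAME) + N')'
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = @schema AND TABLE_NAME = @table;

    -- STUFF strips the leading ', ' from each generated list.
    SET @sql = N'INSERT INTO #Completeness (SchemaName, TableName, ColumnName, CompletenessPct) '
             + N'SELECT @s, @t, v.ColumnName, v.Pct FROM (SELECT '
             + STUFF(@aggs, 1, 2, N'')
             + N' FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table)
             + N') AS s CROSS APPLY (VALUES ' + STUFF(@vals, 1, 2, N'')
             + N') AS v (ColumnName, Pct);';

    EXEC sys.sp_executesql @sql, N'@s sysname, @t sysname', @s = @schema, @t = @table;

    FETCH NEXT FROM table_cursor INTO @schema, @table;
END

CLOSE table_cursor;
DEALLOCATE table_cursor;

SELECT SchemaName, TableName, ColumnName, CompletenessPct
FROM #Completeness
ORDER BY SchemaName, TableName, CompletenessPct;
```

Every table is read in full, so on large databases it may be worth restricting the cursor query to the schemas or tables of interest.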