TSQL: Check whether a column contains at least one non-null value

Time: 2015-07-16 19:41:47

Tags: sql sql-server tsql

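I want to write a T-SQL query that independently checks a set of columns in a table to see which of them contain at least one non-null value. The check for each column should return true or false (1/0) accordingly.

The first thing that came to mind was the COUNT aggregate function. Because COUNT(expression) excludes nulls from its total, a count greater than 0 means the column has non-null data.

This seems a bit clumsy, though, because it has to count all of the data. I really only need to know whether each column has at least one non-null value: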

    SELECT 
        CAST(CASE WHEN COUNT(t.Column1) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
        CAST(CASE WHEN COUNT(t.Column2) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data,
        CAST(CASE WHEN COUNT(t.Column3) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn3Data,
        CAST(CASE WHEN COUNT(t.Column4) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn4Data
    FROM dbo.Table AS t
    WHERE t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
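
As a quick check of the COUNT behavior the query above relies on, here is a minimal sketch. The table variable @Sample and its values are made up for illustration and are not part of the original question.

```sql
-- Hypothetical sample data: ColA has one non-NULL value, ColB is entirely NULL.
DECLARE @Sample TABLE (ColA INT NULL, ColB INT NULL);

INSERT INTO @Sample (ColA, ColB)
VALUES (NULL, NULL),
       (7,    NULL),
       (NULL, NULL);

SELECT
    COUNT(*)      AS TotalRows,    -- counts every row, NULLs included
    COUNT(s.ColA) AS NonNullColA,  -- counts only rows where ColA IS NOT NULL
    COUNT(s.ColB) AS NonNullColB   -- no ColB value is non-NULL
FROM @Sample AS s;
```

With this data the single result row is TotalRows = 3, NonNullColA = 1, NonNullColB = 0, which is why comparing COUNT(column) against 0 works as an "any data?" test.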

Any ideas for something more efficient?

5 answers:

Answer 0: (score: 0)

If there are indexes on the columns, the following may be faster:

    select (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column1 is not null
                             )
                 then 1 else 0 end) as HasColumn1Data,
           (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column2 is not null
                             )
                 then 1 else 0 end) as HasColumn2Data,
           (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column3 is not null
                             )
                 then 1 else 0 end) as HasColumn3Data,
           (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column4 is not null
                             )
                 then 1 else 0 end) as HasColumn4Data;

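Without indexes, this amounts to roughly four full table scans (admittedly, each one stops at the first non-NULL value), so it may well be slower than the group by.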
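
As a sketch of what a supporting index could look like for one of these EXISTS probes: one option, assuming SQL Server 2008 or later, is a filtered index keyed on the timestamp. The temp table #Demo, the index name, and the date values are hypothetical stand-ins, and whether the optimizer actually picks the index depends on your data, so check the execution plan.

```sql
-- Hypothetical stand-in for the question's table, reduced to the columns that matter here.
CREATE TABLE #Demo (
    Id          INT IDENTITY(1,1) PRIMARY KEY,
    [TimeStamp] DATETIME2 NOT NULL,
    Column1     INT NULL
);

-- Filtered index: only rows where Column1 has a value are stored,
-- so a range seek on TimeStamp can stop at the first entry inside the window.
CREATE NONCLUSTERED INDEX IX_Demo_TimeStamp_Column1NotNull
    ON #Demo ([TimeStamp])
    WHERE Column1 IS NOT NULL;

DECLARE @StartTimeStamp DATETIME2 = '2015-07-01',
        @EndTimeStamp   DATETIME2 = '2015-07-31';

-- Same shape as the probe above; the IS NOT NULL predicate matches the index filter.
SELECT CAST(CASE WHEN EXISTS (SELECT 1
                              FROM #Demo AS t
                              WHERE t.[TimeStamp] BETWEEN @StartTimeStamp AND @EndTimeStamp
                                AND t.Column1 IS NOT NULL)
                 THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data;

DROP TABLE #Demo;
```

One such index per checked column would be needed, which costs storage and write overhead, so this trade-off is only worth it if the check runs often.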

Answer 1: (score: 0)

This may end up being more cumbersome to write, but using EXISTS instead of COUNT is probably the better fit:

    SELECT  CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column1 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
            CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column2 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data,
            CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column3 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn3Data,
            CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column4 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn4Data

Answer 2: (score: 0)

I would pivot the query so that it returns just the field names and whether each one is non-null:

    SELECT
      'COL1' AS column_name,
      CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null
    FROM
      foo
    WHERE
      column1 IS NOT NULL
    UNION
    SELECT
      'COL2' AS column_name,
      CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null
    FROM
      foo
    WHERE
      column2 IS NOT NULL

...

By the way, you should be able to generate the above query automatically.
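
A minimal sketch of how a UNION query of this shape could be generated from the catalog views. This is an illustration rather than the answerer's own code; it assumes the table is dbo.foo (matching the placeholder `foo` used in this answer), and the variable names are hypothetical.

```sql
-- Assumption: the target table is dbo.foo.
DECLARE @TableName SYSNAME = N'dbo.foo';
DECLARE @Sql NVARCHAR(MAX);

-- Build one "SELECT ... WHERE <col> IS NOT NULL" branch per column, each prefixed
-- with ' UNION ', then strip the leading ' UNION ' (7 characters) with STUFF.
SELECT @Sql = STUFF((
    SELECT N' UNION SELECT ' + QUOTENAME(c.name, '''') + N' AS column_name, '
         + N'CONVERT(BIT, COUNT(1)) AS is_not_entirely_null FROM ' + @TableName
         + N' WHERE ' + QUOTENAME(c.name) + N' IS NOT NULL'
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(@TableName)
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 7, N'');

PRINT @Sql;                      -- inspect the generated text first
-- EXEC sys.sp_executesql @Sql;  -- then run it once it looks right
```

The FOR XML PATH concatenation is used here because it works on older SQL Server versions. If the table does not exist, @Sql stays NULL and nothing is printed.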

Answer 3: (score: 0)

You can use this query:

    SELECT
        MAX(CASE WHEN t.Column1 IS NULL THEN 0 ELSE 1 END) AS HasColumn1Data,
        MAX(CASE WHEN t.Column2 IS NULL THEN 0 ELSE 1 END) AS HasColumn2Data,
        MAX(CASE WHEN t.Column3 IS NULL THEN 0 ELSE 1 END) AS HasColumn3Data,
        MAX(CASE WHEN t.Column4 IS NULL THEN 0 ELSE 1 END) AS HasColumn4Data
    FROM dbo.Table AS t
    WHERE t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
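
One difference from the COUNT version is worth checking: when no rows fall inside the time window, COUNT returns 0 but MAX returns NULL. A minimal sketch with a made-up table variable, wrapping the aggregate in ISNULL so the flag is always 0 or 1:

```sql
-- Hypothetical sample data: a single row from July 2015.
DECLARE @Sample TABLE ([TimeStamp] DATETIME2 NOT NULL, Column1 INT NULL);
INSERT INTO @Sample ([TimeStamp], Column1) VALUES ('2015-07-10', 42);

-- A window that matches no rows at all.
DECLARE @StartTimeStamp DATETIME2 = '2016-01-01',
        @EndTimeStamp   DATETIME2 = '2016-01-31';

SELECT
    -- NULL here, because there are no rows to aggregate
    MAX(CASE WHEN s.Column1 IS NULL THEN 0 ELSE 1 END) AS RawMax,
    -- 0 here, because ISNULL replaces the NULL result
    ISNULL(MAX(CASE WHEN s.Column1 IS NULL THEN 0 ELSE 1 END), 0) AS HasColumn1Data
FROM @Sample AS s
WHERE s.[TimeStamp] BETWEEN @StartTimeStamp AND @EndTimeStamp;
```

Whether NULL or 0 is the right answer for an empty window depends on what the caller expects, so the ISNULL wrapper is optional.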

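Answer 4: (score: 0)

You could try something like this:

    ;WITH cte AS (
      SELECT * FROM dbo.Table WHERE TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
    )
    -- Each probe is its own scalar subquery, so an all-NULL column yields 0 for that
    -- column only. CROSS JOINing the four probes would return no rows, and therefore
    -- all zeros, as soon as any one of them came back empty.
    SELECT
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col1 FROM cte WHERE Col1 IS NOT NULL) AS s1) AS Col1,
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col2 FROM cte WHERE Col2 IS NOT NULL) AS s2) AS Col2,
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col3 FROM cte WHERE Col3 IS NOT NULL) AS s3) AS Col3,
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col4 FROM cte WHERE Col4 IS NOT NULL) AS s4) AS Col4

This has a potential advantage when none of the columns is entirely null. In that case the table is only scanned up to the first non-null row (although it does so four times...). If any (or worse, all) of the columns are null in every row, you get a full scan for each such column. All in all, this can be useful if you expect your data to actually contain values.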
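
Since the winner depends on how the data is distributed, one way to see the early-exit effect is to compare logical reads. A sketch, assuming a throwaway temp table #Demo with made-up contents in which Col1 is always populated and Col2 is entirely NULL:

```sql
-- Hypothetical test table: Col1 always has a value, Col2 never does.
CREATE TABLE #Demo (Id INT IDENTITY(1,1) PRIMARY KEY, Col1 INT NULL, Col2 INT NULL);

-- 10,000 rows, using a cross join of a system view purely as a row source.
INSERT INTO #Demo (Col1, Col2)
SELECT TOP (10000) 1, NULL
FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;

SET STATISTICS IO ON;

-- Can stop at the first row it reads, because Col1 has a value immediately.
SELECT CASE WHEN EXISTS (SELECT 1 FROM #Demo WHERE Col1 IS NOT NULL)
            THEN 1 ELSE 0 END AS HasCol1Data;

-- Has to read the whole table, because Col2 never has a value.
SELECT CASE WHEN EXISTS (SELECT 1 FROM #Demo WHERE Col2 IS NOT NULL)
            THEN 1 ELSE 0 END AS HasCol2Data;

SET STATISTICS IO OFF;
DROP TABLE #Demo;
```

The logical reads reported for each statement should show the second probe touching far more pages than the first. Running the same comparison against a copy of your real data is the more reliable guide.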