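I want to write a T-SQL query that independently checks a set of columns in a table to see which of them contain at least one non-null value. The check for each column should return true or false (1/0) accordingly.
My first thought was to use the COUNT aggregate function. Because COUNT(expression) excludes nulls from its total, a COUNT greater than 0 means the column has non-null data.
That seems a bit clumsy, because it has to count all of the data, when all I really need to know is whether each column has at least one non-null value: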
SELECT
CAST(CASE WHEN COUNT(t.Column1) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
CAST(CASE WHEN COUNT(t.Column2) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data,
CAST(CASE WHEN COUNT(t.Column3) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn3Data,
CAST(CASE WHEN COUNT(t.Column4) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn4Data
FROM dbo.Table AS t
WHERE t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
Any ideas for something more optimal?
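To make the behaviour the question relies on concrete, here is a small sketch. The table variable `@Demo` and its rows are hypothetical and not part of the original question; the point is only that COUNT(column) skips NULLs while COUNT(*) counts every row.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Demo TABLE (Column1 INT NULL, Column2 INT NULL);

INSERT INTO @Demo (Column1, Column2)
VALUES (NULL, NULL),
       (42,   NULL),
       (NULL, NULL);

SELECT
    COUNT(*)         AS TotalRows,        -- counts all three rows
    COUNT(d.Column1) AS Column1NonNulls,  -- counts only the row holding 42
    COUNT(d.Column2) AS Column2NonNulls,  -- no non-NULL values to count
    CAST(CASE WHEN COUNT(d.Column1) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
    CAST(CASE WHEN COUNT(d.Column2) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data
FROM @Demo AS d;
```

With this sample data the single result row is 3, 1, 0, 1, 0. Every row is still read to produce it, which is the cost the question is asking about.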
Answer 0 (score: 0)
If there are indexes on the columns, the following might be faster:
select (case when exists (select 1
from table t
where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
column1 is not null
)
then 1 else 0 end) as HasColumn1Data,
(case when exists (select 1
from table t
where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
column2 is not null
)
then 1 else 0 end) as HasColumn2Data,
(case when exists (select 1
from table t
where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
column3 is not null
)
then 1 else 0 end) as HasColumn3Data,
(case when exists (select 1
from table t
where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
column4 is not null
)
then 1 else 0 end) as HasColumn4Data;
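Without indexes, this would be about 4 full table scans (admittedly, each one stops at the first non-NULL value), so it may well be slower than the group by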
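If adding indexes is an option, one possible shape for them is a filtered index per column, keyed on the timestamp. This is only a sketch under assumptions: the index names are made up, `dbo.[Table]`, `[TimeStamp]` and the `ColumnN` names are the placeholders used in the question (bracketed here because TABLE is a reserved word), and filtered indexes require SQL Server 2008 or later.

```sql
-- Hypothetical index names; one filtered index per checked column.
-- Each index contains only the rows where its column is non-NULL, ordered by
-- TimeStamp, so an EXISTS probe can seek to @StartTimeStamp and stop at the
-- first entry that falls inside the range.
CREATE NONCLUSTERED INDEX IX_Table_TimeStamp_Column1NotNull
    ON dbo.[Table] ([TimeStamp])
    WHERE Column1 IS NOT NULL;

CREATE NONCLUSTERED INDEX IX_Table_TimeStamp_Column2NotNull
    ON dbo.[Table] ([TimeStamp])
    WHERE Column2 IS NOT NULL;

CREATE NONCLUSTERED INDEX IX_Table_TimeStamp_Column3NotNull
    ON dbo.[Table] ([TimeStamp])
    WHERE Column3 IS NOT NULL;

CREATE NONCLUSTERED INDEX IX_Table_TimeStamp_Column4NotNull
    ON dbo.[Table] ([TimeStamp])
    WHERE Column4 IS NOT NULL;
```

Whether the optimizer actually picks these indexes depends on the data and the server version, so it is worth checking the execution plan. Four extra indexes also add write overhead, which may not pay off if the check runs rarely.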
Answer 1 (score: 0)
This might end up being more cumbersome to write, but using EXISTS instead of COUNT may be more optimal:
SELECT CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column1 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column2 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data,
CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column3 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn3Data,
CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column4 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn4Data
Answer 2 (score: 0)
I would pivot the query so that it returns only the field names and whether each one is non-null:
SELECT
'COL1' AS column_name,
CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null
FROM
foo
WHERE
column1 IS NOT NULL
UNION
SELECT
'COL2' AS column_name,
CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null
FROM
foo
WHERE
column2 IS NOT NULL
...
By the way, you should be able to auto-generate the above query.
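One way a query of this shape could be generated, as a rough sketch rather than the answer's own code. It assumes the table is `dbo.foo` (the schema name is a guess; the answer only says `foo`) and that `STRING_AGG` is available, which means SQL Server 2017 or later. It joins the branches with UNION ALL instead of UNION, since every branch already returns a distinct column name.

```sql
-- Assumptions: the table is dbo.foo and STRING_AGG exists (SQL Server 2017+).
DECLARE @sql NVARCHAR(MAX);

SELECT @sql = STRING_AGG(
           CAST(
               -- QUOTENAME(name, '''') yields a safely quoted string literal,
               -- QUOTENAME(name) yields a bracketed identifier.
               N'SELECT ' + QUOTENAME(c.name, '''') + N' AS column_name, '
             + N'CONVERT(BIT, COUNT(1)) AS is_not_entirely_null '
             + N'FROM dbo.foo WHERE ' + QUOTENAME(c.name) + N' IS NOT NULL'
           AS NVARCHAR(MAX)),   -- cast avoids the 8000-byte STRING_AGG limit
           N' UNION ALL ')
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.foo');

PRINT @sql;                     -- inspect the generated statement first
EXEC sys.sp_executesql @sql;    -- then run it
```

On older versions the same string could be assembled with a `FOR XML PATH('')` concatenation instead of `STRING_AGG`.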
Answer 3 (score: 0)
You can use this query:
SELECT
max(CASE WHEN t.Column1 IS NULL THEN 0 ELSE 1 END ) AS HasColumn1Data,
max(CASE WHEN t.Column2 IS NULL THEN 0 ELSE 1 END ) AS HasColumn2Data,
max(CASE WHEN t.Column3 IS NULL THEN 0 ELSE 1 END ) AS HasColumn3Data,
max(CASE WHEN t.Column4 IS NULL THEN 0 ELSE 1 END ) AS HasColumn4Data
FROM dbo.Table AS t
WHERE t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
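One difference from the COUNT version is worth keeping in mind: when no rows fall inside the timestamp range, MAX has nothing to aggregate and returns NULL instead of 0. The sketch below uses a hypothetical, deliberately empty table variable `@Demo` to show this, along with one possible way to get a 1/0 BIT back.

```sql
-- Hypothetical table variable, intentionally left empty.
DECLARE @Demo TABLE (Column1 INT NULL);

SELECT
    -- NULL here, because there are no rows to aggregate
    MAX(CASE WHEN d.Column1 IS NULL THEN 0 ELSE 1 END) AS RawMax,
    -- ISNULL maps the empty case to 0, CAST returns a BIT like the question's query
    CAST(ISNULL(MAX(CASE WHEN d.Column1 IS NULL THEN 0 ELSE 1 END), 0) AS BIT) AS HasColumn1Data
FROM @Demo AS d;
```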
Answer 4 (score: 0)
You could try something like this:
;WITH cte AS (
SELECT * FROM dbo.Table WHERE TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
)
-- Each probe is a scalar subquery rather than a CROSS JOIN operand: a cross join
-- returns no rows as soon as one probe is empty, which would zero every count.
SELECT
(SELECT COUNT(*) FROM (SELECT TOP 1 Col1
FROM cte
WHERE Col1 IS NOT NULL) s1) AS Col1,
(SELECT COUNT(*) FROM (SELECT TOP 1 Col2
FROM cte
WHERE Col2 IS NOT NULL) s2) AS Col2,
(SELECT COUNT(*) FROM (SELECT TOP 1 Col3
FROM cte
WHERE Col3 IS NOT NULL) s3) AS Col3,
(SELECT COUNT(*) FROM (SELECT TOP 1 Col4
FROM cte
WHERE Col4 IS NOT NULL) s4) AS Col4
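This has a potential advantage if none of the columns is entirely null. In that case the table is only scanned up to the first non-null row (although that happens 4 times...). If any (or worse, all) of the columns are null in every row, you get a full scan for each of those columns. All in all, this may be useful if you expect the data to actually contain values.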