How can I find out how many rows in a table have a specific number of null values? I would like to get something like this:
Number of nulls | Number of rows
0 | 10
1 | 4
2 | 11
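Motivation:
I need this for data mining. For example, if an observation has nulls in almost every column, I need to discard it. There may also be many observations with only a few nulls, and those are acceptable.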
Answer 0 (score: 4)
With SQL, you will have to resort to unpleasant code such as:
SELECT CASE WHEN column1 IS NULL THEN 1 ELSE 0 END
     + CASE WHEN column2 IS NULL THEN 1 ELSE 0 END
     + ... AS num_nulls,   -- one CASE term per column of the table
       COUNT(*) AS num_rows
FROM my_table              -- placeholder table name
GROUP BY num_nulls;
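Also note that not all SQL dialects support referencing a computed column by its alias in the GROUP BY clause, so you may end up with an even uglier query (repeating the whole expression in GROUP BY, or wrapping it in a subquery). Needless to say, you will also need a different query for each table. You could of course use some INFORMATION_SCHEMA voodoo ...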
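As a rough illustration of the "different query per table" problem, the CASE expression can be generated from the table's column metadata instead of being typed by hand. The sketch below is a minimal one under stated assumptions: it uses Python's sqlite3 module, with SQLite's PRAGMA table_info standing in for INFORMATION_SCHEMA.COLUMNS (which SQLite does not provide), and the function name null_histogram and the table observations are hypothetical. It wraps the expression in a subquery so that GROUP BY never has to reference a select-list alias.

```python
import sqlite3


def null_histogram(conn, table):
    """Return [(num_nulls, num_rows), ...] computed over all columns of `table`."""
    # Quote identifiers so unusual table/column names do not break the query.
    quoted_table = '"' + table.replace('"', '""') + '"'
    # PRAGMA table_info plays the role of INFORMATION_SCHEMA.COLUMNS here;
    # element 1 of each result row is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted_table})")]
    if not columns:
        raise ValueError(f"no such table (or no columns): {table}")
    # One CASE term per column, summed into a per-row null count.
    terms = " + ".join(
        '(CASE WHEN "{}" IS NULL THEN 1 ELSE 0 END)'.format(c.replace('"', '""'))
        for c in columns
    )
    # The subquery turns num_nulls into a real column of the derived table,
    # which sidesteps dialects that reject aliases in GROUP BY.
    sql = (
        "SELECT num_nulls, COUNT(*) AS num_rows "
        f"FROM (SELECT {terms} AS num_nulls FROM {quoted_table}) AS t "
        "GROUP BY num_nulls ORDER BY num_nulls"
    )
    return conn.execute(sql).fetchall()


# Hypothetical demo data: four observations with 0, 1, 3 and 3 nulls.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (a INTEGER, b INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [(1, 1, "foo"), (1, None, "bar"), (None, None, None), (None, None, None)],
)
histogram = null_histogram(conn, "observations")
for num_nulls, num_rows in histogram:
    print(num_nulls, "|", num_rows)
```

On a server database the same idea would apply with the column list read from INFORMATION_SCHEMA.COLUMNS and the identifier quoting adapted to that dialect.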
Answer 1 (score: 1)
Something like this:
select MyCountCol, count(*) as NumRows from
(select 0 +
case when Col1 is null then 1 else 0 end
+ case when Col2 is null then 1 else 0 end
-- + whatever other col names are in your table
as MyCountCol
from MyTable) as t -- many dialects require an alias on a derived table
group by MyCountCol
Answer 2 (score: 1)
For SQL Server 2008, you can do:
DECLARE @T TABLE
(
pk INT PRIMARY KEY,
c1 INT,
c2 INT,
c3 VARCHAR(10)
)
INSERT INTO @T
SELECT 1,1,1,'foo'
UNION ALL
SELECT 2,1,NULL,'bar'
UNION ALL
SELECT 3,NULL,NULL,NULL
UNION ALL
SELECT 4,NULL,NULL,NULL
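-- Unpivot each row's columns into a one-column derived table T(c);
-- COUNT(*) counts every value, COUNT(c) only the non-NULL ones.
SELECT Num AS [Number of Nulls],
       COUNT(*) AS [Number of rows]
FROM @T
CROSS APPLY (SELECT COUNT(*) - COUNT(c)
             -- the SQL_VARIANT cast lets INT and VARCHAR values share one column
             FROM (VALUES(CAST(c1 AS SQL_VARIANT)),
                         (c2),
                         (c3)) T (c)) CA(Num)
GROUP BY Num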
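The query above relies on COUNT(*) counting every value while COUNT(c) skips NULLs, so their difference is the number of nulls in a row. The following is only a sketch of that same idea outside SQL Server, assuming Python's sqlite3 module: each row's values are passed as parameters to a small one-column UNION ALL subquery, and the per-row results are tallied in Python. The helper name count_nulls is hypothetical, the sample rows mirror the @T data, and the helper assumes at least one value per row.

```python
import sqlite3
from collections import Counter

# Same sample data as the @T table variable: (pk, c1, c2, c3).
rows = [
    (1, 1, 1, "foo"),
    (2, 1, None, "bar"),
    (3, None, None, None),
    (4, None, None, None),
]

conn = sqlite3.connect(":memory:")


def count_nulls(values):
    """Count NULLs among `values` using COUNT(*) - COUNT(c) on unpivoted values."""
    # Builds "SELECT ? AS c UNION ALL SELECT ? AS c ...", one branch per value.
    unpivoted = " UNION ALL ".join(["SELECT ? AS c"] * len(values))
    sql = f"SELECT COUNT(*) - COUNT(c) FROM ({unpivoted})"
    return conn.execute(sql, values).fetchone()[0]


# Skip the primary key (index 0), as the original query does.
null_counts = Counter(count_nulls(row[1:]) for row in rows)
result = sorted(null_counts.items())
for num_nulls, num_rows in result:
    print(num_nulls, "|", num_rows)
```

Issuing one query per row is far slower than the single CROSS APPLY statement; the sketch is meant to show the counting trick, not to replace it.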