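I have a table in SQL Server with 490 columns, and today I need to add even more columns. An API populates this table from an external system, and because the table currently holds ~550,000 rows, a sync takes roughly 16 hours. I need to count how many rows actually use each column, to see whether there are any columns I can drop.
I've been researching this for a while and this is a last-ditch effort. I've tried a few different approaches, but nothing gives me what I need. I know I can run COUNT(column_name), but with 490 columns that isn't really practical.
So I'm currently using the sys.columns table to get the list of columns in the table, and then an OUTER APPLY containing a COUNT(*) from the table. This sort of works, but obviously it just returns the table's total row count again on every row.
I think I need to replace COUNT(*) with COUNT(sys.columns.name), but that doesn't work either; it fails with the error "Aggregates on the right side of an APPLY cannot reference columns from the left side."
The closest I've got so far is the code below, but I'm a million miles off.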
SELECT
name as 'Column',
Counter.total
FROM sys.columns WITH (NOLOCK)
OUTER APPLY
(
SELECT TOP 1
COUNT(*) as total
FROM lead WITH (nolock)
) as Counter
WHERE sys.columns.object_id = 544720993
This returns the following:
Column | total
______________________
Column1 | 512345
Column2 | 512345
Column3 | 512345
Column4 | 512345
Column5 | 512345
In an ideal world, however, I'd like the following:
Column | total
______________________
Column1 | 512345 --(meaning no nulls in this column)
Column2 | 435765 --(meaning some nulls in this column)
Column3 | 123423
Column4 | 76 --(meaning only 76 non nulls in this column)
Column5 | 0 --(meaning every row is null in this column)
Thank you for your time!
Answer 0 (score: 3)
Sample data
CREATE TABLE [dbo].[Tp](
[a] [char](2) NULL,
[b] [char](2) NULL,
[c] [char](2) NULL
) ON [PRIMARY]
GO
INSERT INTO [Tp] ([a],[b],[c])VALUES('a','a','a')
INSERT INTO [Tp] ([a],[b],[c])VALUES('1','1','1')
INSERT INTO [Tp] ([a],[b],[c])VALUES('2','2','2')
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,'9',NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES('3','3','3')
INSERT INTO [Tp] ([a],[b],[c])VALUES('4','4','4')
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,NULL,NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,'7',NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,NULL,NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES('8','8','8')
INSERT INTO [Tp] ([a],[b],[c])VALUES('9','9','9')
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,NULL,NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES('','','')
INSERT INTO [Tp] ([a],[b],[c])VALUES('','','')
INSERT INTO [Tp] ([a],[b],[c])VALUES('','5','')
INSERT INTO [Tp] ([a],[b],[c])VALUES('2','','')
SELECT * FROM [Tp]
Dynamic SQL script to get the expected result
-- Builds one SELECT per column and joins them with UNION ALL;
-- STUFF(..., 1, 10, '') strips the leading ' UNION ALL' from the combined string.
-- Note: CountOf_EmptySpace counts NULLs as well as empty strings, and because '1'
-- is used as the ISNULL placeholder it also counts rows whose value is '1'.
DECLARE @Sql nvarchar(max)
SELECT @Sql = STUFF((SELECT ' UNION ALL '+ ' '+'SELECT '''+TABLE_NAME+''' AS TABLE_NAME,'+''''+COLUMN_NAME+''''+' AS ColumName'+',SUM(CASE WHEN '+COLUMN_NAME+' IS NULL THEN 1 ELSE 0 END) As Countof_nulls
,SUM(CASE WHEN ISNULL(NULLIF('+COLUMN_NAME+',''''),''1'')=''1'' THEN 1 ELSE 0 END) As CountOf_EmptySpace
,COUNT('+COLUMN_NAME+') As Count_not_nulls
FROM '+TABLE_NAME
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME ='Tp' --Enter your table in the query
FOR XML PATH (''), TYPE).value('.', 'VARCHAR(MAX)'),1,10,'')
EXEC (@Sql)
Result
TABLE_NAME ColumName Countof_nulls CountOf_EmptySpace Count_not_nulls
***************************************************************************
Tp a 5 9 11
Tp b 3 7 13
Tp c 5 10 11
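To make the string building easier to follow, here is roughly what @Sql contains for the sample table Tp. This is a hand-expanded sketch with the whitespace tidied, not captured output, and it assumes the Tp table from the sample data above exists.

```sql
-- One SELECT per column of Tp, combined with UNION ALL.
SELECT 'Tp' AS TABLE_NAME, 'a' AS ColumName,
       SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) AS Countof_nulls,
       SUM(CASE WHEN ISNULL(NULLIF(a, ''), '1') = '1' THEN 1 ELSE 0 END) AS CountOf_EmptySpace,
       COUNT(a) AS Count_not_nulls   -- COUNT(column) skips NULLs
FROM Tp
UNION ALL
SELECT 'Tp' AS TABLE_NAME, 'b' AS ColumName,
       SUM(CASE WHEN b IS NULL THEN 1 ELSE 0 END) AS Countof_nulls,
       SUM(CASE WHEN ISNULL(NULLIF(b, ''), '1') = '1' THEN 1 ELSE 0 END) AS CountOf_EmptySpace,
       COUNT(b) AS Count_not_nulls
FROM Tp
UNION ALL
SELECT 'Tp' AS TABLE_NAME, 'c' AS ColumName,
       SUM(CASE WHEN c IS NULL THEN 1 ELSE 0 END) AS Countof_nulls,
       SUM(CASE WHEN ISNULL(NULLIF(c, ''), '1') = '1' THEN 1 ELSE 0 END) AS CountOf_EmptySpace,
       COUNT(c) AS Count_not_nulls
FROM Tp;
```

Each branch scans the table once, so a 490-column table means 490 scans in a single statement.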
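Answer 1 (score: 1)
You can use a cursor with dynamic SQL that inserts the result of each COUNT check into a temporary table.
You control which schemas, tables and columns are checked through the cursor's SELECT.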
IF OBJECT_ID('tempdb..#ColumnResults') IS NOT NULL
DROP TABLE #ColumnResults
CREATE TABLE #ColumnResults (
SchemaName VARCHAR(100),
TableName VARCHAR(100),
ColumnName VARCHAR(100),
TotalRows INT,
NotNullAmount INT)
DECLARE @SchemaName VARCHAR(100)
DECLARE @TableName VARCHAR(100)
DECLARE @ColumnName VARCHAR(100)
DECLARE ColumnCursor CURSOR FOR
SELECT
QUOTENAME(T.TABLE_SCHEMA),
QUOTENAME(T.TABLE_NAME),
QUOTENAME(T.COLUMN_NAME)
FROM
INFORMATION_SCHEMA.COLUMNS AS T
WHERE
T.TABLE_NAME = 'YourTableName' AND -- Filter here the table you want to check
T.TABLE_SCHEMA = 'YourTableSchema' -- Filter here the schema you want to check
ORDER BY
T.TABLE_SCHEMA,
T.TABLE_NAME,
T.COLUMN_NAME
OPEN ColumnCursor
FETCH NEXT FROM ColumnCursor INTO
@SchemaName,
@TableName,
@ColumnName
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @DynamicSQL VARCHAR(MAX) = '
INSERT INTO #ColumnResults (
SchemaName,
TableName,
ColumnName,
TotalRows,
NotNullAmount)
SELECT
SchemaName = ''' + @SchemaName + ''',
TableName = ''' + @TableName + ''',
ColumnName = ''' + @ColumnName + ''',
TotalRows = COUNT(1),
NotNullAmount = COUNT(' + @ColumnName + ')
FROM
' + @SchemaName + '.' + @TableName + ' AS T'
-- PRINT (@DynamicSQL)
EXEC (@DynamicSQL)
FETCH NEXT FROM ColumnCursor INTO
@SchemaName,
@TableName,
@ColumnName
END
CLOSE ColumnCursor
DEALLOCATE ColumnCursor
SELECT
C.*
FROM
#ColumnResults AS C
ORDER BY
C.SchemaName,
C.TableName,
C.ColumnName
You can comment out the EXEC and uncomment the PRINT to inspect the generated dynamic SQL before running it.
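As an illustration, this is approximately what the PRINT would show for a single cursor iteration. It is a sketch that assumes the cursor is filtered to a table dbo.Tp with a column a (hypothetical values standing in for YourTableSchema and YourTableName); the temp table is recreated here only so the statement can run on its own.

```sql
-- Same temp table definition as in the script above.
IF OBJECT_ID('tempdb..#ColumnResults') IS NOT NULL
    DROP TABLE #ColumnResults;
CREATE TABLE #ColumnResults (
    SchemaName VARCHAR(100),
    TableName VARCHAR(100),
    ColumnName VARCHAR(100),
    TotalRows INT,
    NotNullAmount INT);

-- Generated statement for schema dbo, table Tp, column a (assumed names).
INSERT INTO #ColumnResults (
    SchemaName,
    TableName,
    ColumnName,
    TotalRows,
    NotNullAmount)
SELECT
    SchemaName = '[dbo]',
    TableName = '[Tp]',
    ColumnName = '[a]',
    TotalRows = COUNT(1),
    NotNullAmount = COUNT([a])
FROM
    [dbo].[Tp] AS T;
```

Because the cursor applies QUOTENAME to the names, the values stored in #ColumnResults include the square brackets.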
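Note that this executes one SELECT per column rather than a single SELECT covering all of the table's columns. You could tweak the dynamic SQL so that it runs once per table and checks every column in one pass, but I find this approach tidier, and it works the same way across schemas and tables.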
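For a table with hundreds of columns and several hundred thousand rows, the once-per-table variant may be worth the extra string building, since the table is scanned only once. The following is a minimal sketch of that idea, not part of the original answer: it assumes the schema and table names dbo and Tp, builds a single SELECT with one COUNT per column, and then turns the resulting wide row back into one row per column with CROSS APPLY (VALUES ...). The variable names and the output column names are made up for illustration.

```sql
DECLARE @Schema sysname = N'dbo';  -- assumption: replace with your schema
DECLARE @Table  sysname = N'Tp';   -- assumption: replace with your table
DECLARE @Counts nvarchar(max), @Rows nvarchar(max), @Sql nvarchar(max);

-- ", COUNT([a]) AS [a], COUNT([b]) AS [b], ..." with the leading ", " stripped.
SELECT @Counts = STUFF((
    SELECT N', COUNT(' + QUOTENAME(c.name) + N') AS ' + QUOTENAME(c.name)
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(QUOTENAME(@Schema) + N'.' + QUOTENAME(@Table))
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, N'');

-- ", (N'a', s.[a]), (N'b', s.[b]), ..." with the leading ", " stripped.
SELECT @Rows = STUFF((
    SELECT N', (N' + QUOTENAME(c.name, '''') + N', s.' + QUOTENAME(c.name) + N')'
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(QUOTENAME(@Schema) + N'.' + QUOTENAME(@Table))
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, N'');

-- One scan of the table produces every count; VALUES unpivots the single wide row.
-- Caveat: a column literally named TotalRows would clash with the alias used here.
SET @Sql = N'SELECT v.ColumnName, s.TotalRows, v.NotNullCount
FROM (SELECT COUNT(*) AS TotalRows, ' + @Counts + N'
      FROM ' + QUOTENAME(@Schema) + N'.' + QUOTENAME(@Table) + N') AS s
CROSS APPLY (VALUES ' + @Rows + N') AS v (ColumnName, NotNullCount)
ORDER BY v.NotNullCount, v.ColumnName;';

-- PRINT @Sql  -- uncomment to inspect the generated statement first
EXEC sys.sp_executesql @Sql;
```

Sorting by NotNullCount puts the least-used columns first, which is the list you would review when deciding what to drop.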