COUNT(*) differs from rows in sys.partitions

Asked: 2014-08-13 07:18:01

Tags: sql sql-server database-schema

I am using the following query to get information about all tables in a database:

SELECT 
    t.NAME AS TableName,
    i.name as indexName,
    sum(p.rows) as RowCounts,
    sum(a.total_pages) as TotalPages, 
    sum(a.used_pages) as UsedPages, 
    sum(a.data_pages) as DataPages,
    (sum(a.total_pages) * 8) / 1024 as TotalSpaceMB, 
    (sum(a.used_pages) * 8) / 1024 as UsedSpaceMB, 
    (sum(a.data_pages) * 8) / 1024 as DataSpaceMB
FROM 
    sys.tables t
INNER JOIN      
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE 
    t.NAME NOT LIKE 'dt%' AND
    i.OBJECT_ID > 255 AND   
    i.index_id <= 1
GROUP BY 
    t.NAME, i.object_id, i.index_id, i.name 
ORDER BY 
    object_name(i.object_id)

The problem is that, for some tables, it reports a different row count than I get from:

select count(*) FROM someTable

Why?

Edit

The first query returns the higher count:

First: 1 240 464
Second:  413 496

5 answers:

Answer 0 (score: 2)

From the sys.partitions documentation:

rows bigint: Approximate number of rows in this partition.

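(The emphasis on "approximate" is mine.) The system view is not going to keep an up-to-the-minute count of the rows in the table. Think about what that would entail and how much overhead it would add to every insert/delete statement. If I were a betting man, I would say that it derives the figure from the number of pages in the clustered index or heap, which is a much cheaper operation. That is purely speculation, though.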
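To see how far the stored figure is from the real one for a single table, the two numbers can be read side by side. This is a minimal sketch, not part of the original answer: dbo.someTable stands in for the table from the question, and the dbo schema is an assumption. It reads sys.partitions directly, without joining sys.allocation_units, and looks only at the heap or clustered index (index_id 0 or 1).

```sql
-- Row count as recorded in the catalog view (documented as approximate).
SELECT SUM(p.rows) AS metadata_rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.someTable')  -- assumed schema and table name
  AND p.index_id IN (0, 1);                      -- heap (0) or clustered index (1)

-- Actual row count at the time the query runs.
SELECT COUNT(*) AS actual_rows
FROM dbo.someTable;
```

If these two values are close while the query from the question is far off, the gap is more likely caused by the joins in that query than by the approximation itself.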

Answer 1 (score: 2)

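The problem is that a partition can have more than one allocation_unit, so the same partition can appear several times in the join. As a result, sum(p.rows) counts the same partition more than once, and you end up with a multiple of the correct row count.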

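Here is how I solved the problem. (Note that my query is not identical to yours: my columns are slightly different and I use KB instead of MB, but the idea is the same.)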

    SELECT 
        s.Name + '.' + t.name AS table_name,
        (select sum(p2.rows)
            from sys.indexes i2 inner join sys.partitions p2 ON i2.object_id = p2.OBJECT_ID AND i2.index_id = p2.index_id
            where i2.object_id = t.object_id and i2.object_id > 255 and (i2.index_id = 0 or i2.index_id = 1)
        ) as total_rows,
        SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN a.total_pages * 8 ELSE 0 END) AS data_size_kb,
        SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN a.used_pages * 8 ELSE 0 END) AS data_used_kb,
        SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN 0 ELSE a.total_pages * 8 END) AS index_size_kb,
        SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN 0 ELSE a.used_pages * 8 END) AS index_used_kb,
        SUM(a.total_pages) * 8 AS total_size_kb, 
        SUM(a.used_pages) * 8 AS total_used_kb,
        SUM(a.used_pages) * 100 / CASE WHEN SUM(a.total_pages) = 0 THEN 1 ELSE SUM(a.total_pages) END AS percent_full
    FROM 
        sys.tables t
    INNER JOIN 
        sys.schemas s ON s.schema_id = t.schema_id
    INNER JOIN      
        sys.indexes i ON t.OBJECT_ID = i.object_id
    INNER JOIN 
        sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
    INNER JOIN 
        sys.allocation_units a ON p.partition_id = a.container_id
    WHERE 
        t.is_ms_shipped = 0 AND i.OBJECT_ID > 255 
    GROUP BY 
        t.object_id, t.Name, s.Name
    ORDER BY SUM(a.total_pages) DESC
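The fan-out described in this answer can be checked for one table before rewriting the whole report. The following is a hedged sketch: dbo.someTable is the placeholder table name from the question with an assumed dbo schema, and the join condition is copied from the question's query. The counts quoted in the question (1 240 464 versus 413 496) differ by roughly a factor of three, which would be consistent with up to three allocation units per partition.

```sql
-- For each partition of the heap/clustered index, show how many
-- allocation units join to it and what SUM(p.rows) would add up to.
SELECT
    p.partition_id,
    p.index_id,
    p.rows                AS rows_in_partition,
    COUNT(*)              AS allocation_unit_count,
    p.rows * COUNT(*)     AS rows_counted_by_join
FROM sys.partitions AS p
INNER JOIN sys.allocation_units AS a
    ON p.partition_id = a.container_id           -- same join as in the question
WHERE p.object_id = OBJECT_ID(N'dbo.someTable')  -- assumed schema and table name
  AND p.index_id IN (0, 1)
GROUP BY p.partition_id, p.index_id, p.rows;
```

Any row with allocation_unit_count greater than 1 is a partition whose row count gets multiplied when p.rows is summed after the join.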

Answer 2 (score: 1)

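Have you looked at the help article for the sys.allocation_units view? Apparently there is a bit more to the container_id field than meets the eye. Try adding this to the WHERE clause: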

and a.type = 2
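Before settling on a type filter, it may help to list which allocation unit types a table actually has. The sketch below is an illustration rather than the answerer's query; dbo.someTable and the dbo schema are assumptions. According to the sys.allocation_units documentation, type 1 is IN_ROW_DATA, type 2 is LOB_DATA and type 3 is ROW_OVERFLOW_DATA, and container_id refers to sys.partitions.hobt_id for types 1 and 3 and to sys.partitions.partition_id for type 2, which is what the join condition here follows.

```sql
-- List every allocation unit behind the heap/clustered index of one table.
SELECT
    p.partition_id,
    a.allocation_unit_id,
    a.type,
    a.type_desc,        -- IN_ROW_DATA, LOB_DATA or ROW_OVERFLOW_DATA
    a.total_pages,
    a.used_pages,
    a.data_pages
FROM sys.partitions AS p
INNER JOIN sys.allocation_units AS a
    ON (a.type IN (1, 3) AND a.container_id = p.hobt_id)
    OR (a.type = 2       AND a.container_id = p.partition_id)
WHERE p.object_id = OBJECT_ID(N'dbo.someTable')  -- assumed schema and table name
  AND p.index_id IN (0, 1)
ORDER BY p.partition_id, a.type;
```

A filter on a single type keeps at most one allocation unit per partition, so the row count is no longer multiplied, but a table that has no allocation unit of the chosen type drops out of the result and the page totals then cover only that type.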

Answer 3 (score: 0)

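In SQL Server 2016, to resolve the mismatch between count(*) and sys.partitions, I rebuilt the index on the primary key. Fortunately the table only had 2.4 million rows, so it did not take too long; I have Standard Edition, so I cannot do online rebuilds.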
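A rebuild of that kind might look like the sketch below. The index name PK_someTable, the table name and the dbo schema are all hypothetical, since the answer does not give them. The second statement is a different technique from the one the answer used: DBCC UPDATEUSAGE with COUNT_ROWS is the documented command for correcting row count inaccuracies in the catalog views, shown here only as an alternative to consider.

```sql
-- Offline rebuild of the primary key index (hypothetical index name).
-- WITH (ONLINE = ON) is not used here because it requires Enterprise edition.
ALTER INDEX PK_someTable ON dbo.someTable REBUILD;

-- Alternative, not from the answer: recompute page and row counts
-- for one table in the current database (0 = current database).
DBCC UPDATEUSAGE (0, N'dbo.someTable') WITH COUNT_ROWS;
```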

Answer 4 (score: -1)

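Inner joins cause non-matching rows to be filtered out. Grouping also affects your row count, because it can combine rows. Together, these two conditions make the aggregate query return fewer rows than a simple count(*).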

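I see that you are asking specifically about the sys.partitions table. A likely explanation is that not every row in it has a match in the sys.indexes table under your join conditions i.object_id = p.OBJECT_ID and i.index_id = p.index_id. Try this: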

Select 
  count(*) 
from 
  sys.partitions p
LEFT JOIN
  sys.indexes i ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id

Then you will probably see the count you expect. Remove the count function and use Select * ... instead to find the rows that do not match.