Question

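I also need help with paging when using UNION ALL across multiple tables:

How do I implement optimized paging when combining multiple tables with UNION ALL and returning only a specific number of rows?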

declare @startRow int
declare @PageCount int

set @startRow = 0
set @PageCount = 20

set rowcount @PageCount

-- @dateFrom and @dateTo are assumed to be declared elsewhere
select RowNumber, col1, col2
from
(
    select Row_Number() OVER(Order by col1) as RowNumber, col1, col2
    from
    (
        select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
        union all
        select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
        union all
        select col1, col2 from table3 where datetimeCol between @dateFrom and @dateTo
        union all
        select col1, col2 from table4 where datetimeCol between @dateFrom and @dateTo
        union all
        select col1, col2 from table5 where datetimeCol between @dateFrom and @dateTo
    ) as tmpTable
) as numbered
where RowNumber > @startRow

table3, table4, and table5 have a huge number of rows (millions of rows), whereas table1 and table2 may have only a few thousand rows.

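If startRow is "0", I only expect data for rows 1 to 20 (from table1). I get the correct result, but with a high overhead on the remaining tables, because SQL Server reads all the data and then filters it.

The longer the interval between @dateFrom and @dateTo, the slower my query becomes, even though it retrieves only a few rows from the whole result set.

Please help me implement a simple but better approach with similar logic. :(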

Answer 1

Consider using the OFFSET FETCH clause (available starting with SQL Server 2012):

declare @startRow int
declare @PageCount int

set @startRow = 0
set @PageCount = 20


select col1, col2
from
(
    select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from table3 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from table4 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from table5 where datetimeCol between @dateFrom and @dateTo
) as tmpTable
order by col1
offset @startRow rows
fetch next @PageCount rows only

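I'd also like to mention why this query always takes O(n * log(n)) time. To execute it, the database needs to:

Combine the multiple lists into one list. This takes O(n) time for each table, where n is the total number of rows in that table.
Sort the list by col1. This takes O(n * log(n)), where n is the total number of rows.
Walk the list in sorted order, skip @startRow rows, and take the next @PageCount rows.

If the performance of this query is still poor and you want to improve it, try one of the following:

Create a clustered index on col1 in all tables.
Create a nonclustered index on col1 in all tables, **including all the other columns you want to output in the select list**.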
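
As a rough sketch of the two index options above, the definitions might look like the following. The index names are made up for illustration, the statements would be repeated for table2 through table5, and the two options are alternatives rather than a pair. datetimeCol is included in the second index on the assumption that the date filter should also be answerable from the index alone.

```sql
-- Option 1: clustered index keyed on the sort column.
-- A table can have only one clustered index, so this assumes none exists yet.
create clustered index CIX_table1_col1
    on table1 (col1);

-- Option 2: nonclustered covering index keyed on the sort column.
-- INCLUDE carries the other selected and filtered columns,
-- so the query does not need to look up the base table.
create nonclustered index IX_table1_col1_covering
    on table1 (col1)
    include (col2, datetimeCol);
```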

Answer 2

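Rather than applying classic OFFSET-based paging (note that SQL Server 2012 now natively supports it), I think your particular use case could benefit from a method usually called the "seek method", as described in this blog post here. Your query would then look like this: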

select top 20 col1, col2
from
(
    select col1, col2 from t1 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from t2 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from t3 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from t4 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from t5 where datetimeCol between @dateFrom and @dateTo
) as tmpTable
where (col1 > @lastValueForCol1)
   or (col1 = @lastValueForCol1 and col2 > @lastValueForCol2)
order by col1, col2

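The @lastValueForCol1 and @lastValueForCol2 values are the respective values of the last record on the previous page. This lets you fetch the "next" page. If the ORDER BY direction is DESC, simply use < instead. If (col1, col2) is not globally unique in tmpTable, you may need to add another column to the query and to the WHERE and ORDER BY clauses to avoid losing records between pages.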
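
To illustrate the tie-breaker case, here is a minimal sketch that assumes a hypothetical column id which is unique across all the tables. The column, the variable @lastValueForId, the data types, and the sample values are all assumptions made for the example. Only t1 and t2 are shown; t3 to t5 would follow the same pattern.

```sql
-- Assumptions: id is a hypothetical column, unique across the whole union;
-- the int/datetime types and the sample values are placeholders.
declare @dateFrom datetime = '2024-01-01', @dateTo datetime = '2024-02-01';
declare @lastValueForCol1 int = 100,
        @lastValueForCol2 int = 7,
        @lastValueForId   int = 4211;

select top (20) col1, col2, id
from
(
    select col1, col2, id from t1 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2, id from t2 where datetimeCol between @dateFrom and @dateTo
) as tmpTable
-- Row comparison (col1, col2, id) > (last values), spelled out column by column.
where (col1 > @lastValueForCol1)
   or (col1 = @lastValueForCol1 and col2 > @lastValueForCol2)
   or (col1 = @lastValueForCol1 and col2 = @lastValueForCol2 and id > @lastValueForId)
order by col1, col2, id;
```

The application would remember col1, col2, and id of the last row it displayed and pass them in for the next page.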

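With the above method, you cannot jump directly to page 3 without first having fetched the previous 40 records. But often you don't want to jump that far anyway. Instead, you get a much faster query that may be able to fetch data in constant time, depending on your indexing. Also, your pages remain "stable" no matter whether the underlying data changes (for example on page 1 while you are on page 4).

Note that the "seek method" is also referred to as keyset paging.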

Indexing

While paging with the "seek method" is always faster than paging with OFFSET, you should still make sure that (col1, col2) is indexed in each table!
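
One way to let such an index do the work is to push the seek predicate and a TOP into each branch, so that each table contributes at most one page of rows before the final merge. This is a variation on the query above rather than something the answer spells out, so treat it as a sketch. The index name, data types, and sample values are assumptions, and only t1 and t2 are shown.

```sql
-- Hypothetical index name; repeat for t2..t5.
create nonclustered index IX_t1_col1_col2
    on t1 (col1, col2)
    include (datetimeCol);

declare @dateFrom datetime = '2024-01-01', @dateTo datetime = '2024-02-01';
declare @lastValueForCol1 int = 100, @lastValueForCol2 int = 7;

-- The overall top 20 must be among the top 20 of each branch,
-- so limiting every branch first does not change the result.
select top (20) col1, col2
from
(
    select col1, col2 from
    (
        select top (20) col1, col2 from t1
        where datetimeCol between @dateFrom and @dateTo
          and (col1 > @lastValueForCol1
               or (col1 = @lastValueForCol1 and col2 > @lastValueForCol2))
        order by col1, col2
    ) as p1
    union all
    select col1, col2 from
    (
        select top (20) col1, col2 from t2
        where datetimeCol between @dateFrom and @dateTo
          and (col1 > @lastValueForCol1
               or (col1 = @lastValueForCol1 and col2 > @lastValueForCol2))
        order by col1, col2
    ) as p2
) as tmpTable
order by col1, col2;
```

Each branch is wrapped in a derived table because T-SQL does not accept ORDER BY directly on an individual branch of a UNION ALL.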

Answer 3

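Since the tables are ordered in the paged result set (UNION ALL does not sort), there is no reason to select from all 5 tables. You should change your code to:

Query table 1. See whether you have enough records.
If not, query table 2, and so on.

Manage the offset count based on the number of records returned by each query. That way you only query the tables you actually need.

You can even optimize further by selecting the count of records in a table that match the filter, to know whether you need to query any data from it at all. So if you want records 30-50 and only 20 records match in table1, you can skip it completely.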
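
The steps above could be sketched in T-SQL roughly as follows. This assumes the page order is "all of table1, then all of table2, and so on" (ordered by col1 within each table), that col1 and col2 are int, and that OFFSET/FETCH is available (SQL Server 2012 or later). The variable names @skip, @need, and @matching and the table variable @page are made up for the example.

```sql
declare @dateFrom datetime = '2024-01-01', @dateTo datetime = '2024-02-01';
declare @startRow int = 30, @PageCount int = 20;

declare @page table (src int, col1 int, col2 int);
declare @skip int = @startRow;   -- rows still to skip
declare @need int = @PageCount;  -- rows still to collect
declare @matching int;

-- table1: count the matching rows first to see whether it can be skipped.
select @matching = count(*) from table1
where datetimeCol between @dateFrom and @dateTo;

if @skip >= @matching
begin
    set @skip = @skip - @matching;   -- the page starts after table1
end
else
begin
    insert into @page (src, col1, col2)
    select 1, col1, col2 from table1
    where datetimeCol between @dateFrom and @dateTo
    order by col1
    offset @skip rows fetch next @need rows only;

    set @need = @need - @@rowcount;  -- rows just inserted
    set @skip = 0;
end

-- table2: touched only if the page is not full yet.
if @need > 0
begin
    select @matching = count(*) from table2
    where datetimeCol between @dateFrom and @dateTo;

    if @skip >= @matching
    begin
        set @skip = @skip - @matching;
    end
    else
    begin
        insert into @page (src, col1, col2)
        select 2, col1, col2 from table2
        where datetimeCol between @dateFrom and @dateTo
        order by col1
        offset @skip rows fetch next @need rows only;

        set @need = @need - @@rowcount;
        set @skip = 0;
    end
end

-- table3..table5 would repeat the table2 block.

select col1, col2 from @page order by src, col1;
```

With the sample values, if table1 has only 20 matching rows it is skipped after the count, and table2 is read with an offset of 10.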

Answer 4

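select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table3 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table4 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table5 where datetimeCol between @dateFrom and @dateTo

Paging over the query above is basically as efficient as paging over a normal table, provided there is an index on the sort key used for paging. This usually produces a query plan in which all the tables are merge concatenated. Merge concatenation is a streaming operation: its cost is proportional to the number of rows drawn, not to the number of rows in the tables.

Paging by row number in SQL Server always works by enumerating rows from start to end until the desired window is reached. Whether the rows are drawn from a single table or from multiple merged tables makes no fundamental difference.

So a good chance of making this fast is to create a covering index keyed on col1. Sadly, it is not possible to index for between @dateFrom and @dateTo at the same time. You therefore have to try both indexing strategies and pick whichever works best.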
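
A minimal sketch of comparing the two strategies on one of the large tables might look like this. The index names are hypothetical, the date values are placeholders, and the index hints are used only to force each candidate in turn while measuring.

```sql
-- Strategy A: keyed on the paging sort key. Rows stream out in col1 order
-- and the date range is checked row by row.
create nonclustered index IX_table3_col1
    on table3 (col1) include (col2, datetimeCol);

-- Strategy B: keyed on the filter column. The date range becomes a seek,
-- but the matching rows then have to be sorted by col1.
create nonclustered index IX_table3_datetimeCol
    on table3 (datetimeCol) include (col1, col2);

declare @dateFrom datetime = '2024-01-01', @dateTo datetime = '2024-02-01';

-- Report logical reads and elapsed time for each run.
set statistics io on;
set statistics time on;

select col1, col2
from table3 with (index (IX_table3_col1))
where datetimeCol between @dateFrom and @dateTo
order by col1
offset 0 rows fetch next 20 rows only;

select col1, col2
from table3 with (index (IX_table3_datetimeCol))
where datetimeCol between @dateFrom and @dateTo
order by col1
offset 0 rows fetch next 20 rows only;

set statistics io off;
set statistics time off;
```

Which strategy wins is likely to depend on how selective the date range is, so it is worth measuring with realistic values of @dateFrom and @dateTo.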

Answer 5

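You may have a database design issue, since you have 5 similar tables. Apart from that, you could materialize your UNION ALL query into a permanent table or a temp #-table with appropriate indexes on it, and finally paginate over the materialized data set using ROW_NUMBER().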
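
A minimal sketch of that idea, assuming a temp table named #allRows (a made-up name) and placeholder date and paging values, could look like this. Only table1 and table2 are shown; table3 to table5 would be added to the union in the same way.

```sql
declare @dateFrom datetime = '2024-01-01', @dateTo datetime = '2024-02-01';
declare @startRow int = 0, @PageCount int = 20;

-- Materialize the filtered union once.
select col1, col2
into #allRows
from
(
    select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
    union all
    select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
) as u;

-- Index the materialized set on the paging sort key.
create clustered index IX_allRows_col1 on #allRows (col1);

-- Page over the materialized rows with ROW_NUMBER().
select col1, col2
from
(
    select row_number() over (order by col1) as RowNumber, col1, col2
    from #allRows
) as numbered
where RowNumber > @startRow
  and RowNumber <= @startRow + @PageCount
order by RowNumber;

drop table #allRows;
```

The materialization cost is paid once, so this tends to make sense when several pages are read from the same snapshot rather than a single page.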

SQL Server: using UNION ALL across multiple tables and then implementing paging
