分页和分组

时间:2012-04-15 21:06:17

标签: sql sql-server sql-server-2005

我的数据库存储在sql server 2005 db。

此查询执行时间不到一秒:

SELECT * FROM ( 

SELECT  ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,   
SUM(tblProducts.Price) as price
FROM tblOrders 
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId 
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid

) as x
where RowNum >=  21001 and RowNum <  21011

这个查询需要10秒才能执行:

SELECT * FROM ( 

SELECT  ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,   
SUM(tblProducts.Price) as price, 
OrderDate
FROM tblOrders 
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId 
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid, tblOrders.OrderDate

) as x
where RowNum >=  21001 and RowNum <  21011

为什么会有这样的差异?

所有表都有一个名为id的列,用于保存主键。不知道为什么orderid和ProductId也存在,因为我没有设计数据库。

/百里

更新

OrderDate是日期时间

第二次更新

请记住,这三个表都有一个id列作为主键。但是,在表之间引用时会使用orderid,productid等。我不确定为什么会这样实现,但我猜错了。

tblOrders:
Id; int; no null; PK
OrderId; int; allow null
OrderDate; datetime; allow null

tblOrderDetails:
Id; int; no null; PK
OrderId; int; allow null
ProductId; int; allow null

tblProducts:
Id; int; PK; no null
ProductId; allow null
Price; money; allow null

这是否适合作为查询执行计划? -

enter image description here

第三次更新

这只需要一秒钟来执行 -

SELECT  ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,   
SUM(tblProducts.Price) as price, 
OrderDate
FROM tblOrders 
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId 
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid, OrderDate

这只有2秒钟 -

SELECT * FROM (
  SELECT  ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
  SUM(tblProducts.Price) as price,
  MAX(tblOrders.OrderDate) as OrderDate  -- do this instead of grouping
FROM tblOrders
  LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
  LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid  ) as x

但这需要10秒钟 -

SELECT * FROM (
  SELECT  ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
  SUM(tblProducts.Price) as price,
  MAX(tblOrders.OrderDate) as OrderDate  -- do this instead of grouping
FROM tblOrders
  LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
  LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid  ) as x
where RowNum >=  21001 and RowNum <  21011

where子句增加8秒。为什么呢?

4 个答案:

答案 0 :(得分:2)

我敢打赌,在输出列表和分组子句中包含“tblOrders.OrderDate”的甜甜圈会导致你的减速。我建议你SET STATISTICS IO ON并运行两个查询,看看你是如何得到不同的扫描&amp;寻求每张桌子。

对于考虑OrderDate列的第二个查询,SQL引擎很可能有一个截然不同的计划,导致更多的CPU处理或(更可能)更多的磁盘IO。

答案 1 :(得分:0)

如果没有执行计划,这是无法回答的,但我可以猜到:

  • 附加列可能会阻止使用索引
  • 慢查询的基数非常高
  • OrderDate的统计信息以某种方式过时(exec sp_updatestats)

更新:您发布的执行计划确实非常可怕。

创建索引:

create unique nonclustered index x0 on tblOrder(orderid) include (OrderDate)
create unique nonclustered index x1 on tblProduct (productid) include (Price)
create nonclustered index x2 on tblOrderDetails(orderid, ProductId)

答案 2 :(得分:0)

什么是OrderDate?约会时间?虽然这些查询看起来非常相似,但我怀疑OrderDate包含时间信息,因此排序和分组要贵得多(并且在第二个查询的子查询中会导致更多行)。

考虑以下变化:

SELECT RowNum, price, DD = DATEADD(DAY, DD, '19000101') FROM (     
SELECT  ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,   
SUM(tblProducts.Price) as price, 
DATEDIFF(DAY, '19000101', tblOrders.OrderDate) as DD
FROM tblOrders 
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId 
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid, DATEDIFF(DAY, '19000101', tblOrders.OrderDate)

) as x
where RowNum >=  21001 and RowNum <  21011
ORDER BY RowNum;

在SQL Server 2008或更高版本中,您可以将其简化为CONVERT(DATE, OrderDate) ...

答案 3 :(得分:0)

如果没有实际的表结构和执行计划,我无法准确回答,但如果orderid在tblOrders中是唯一的,那么最好从group by语句中删除OrderDate,在select list中将其添加为min(tblOrders.OrderDate) as OrderDate。它应该给出相同的结果(如果tblOrders.orderid是唯一键)但工作得更好。