我的数据库存储在sql server 2005 db。
中此查询执行时间不到一秒:
SELECT * FROM (
SELECT ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
SUM(tblProducts.Price) as price
FROM tblOrders
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid
) as x
where RowNum >= 21001 and RowNum < 21011
这个查询需要10秒才能执行:
SELECT * FROM (
SELECT ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
SUM(tblProducts.Price) as price,
OrderDate
FROM tblOrders
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid, tblOrders.OrderDate
) as x
where RowNum >= 21001 and RowNum < 21011
为什么会有这样的差异?
所有表都有一个名为id的列,用于保存主键。不知道为什么orderid和ProductId也存在,因为我没有设计数据库。
/百里
更新
OrderDate是日期时间
第二次更新
请记住,这三个表都有一个id列作为主键。但是,在表之间引用时会使用orderid,productid等。我不确定为什么会这样实现,但我猜错了。
tblOrders:
Id; int; no null; PK
OrderId; int; allow null
OrderDate; datetime; allow null
tblOrderDetails:
Id; int; no null; PK
OrderId; int; allow null
ProductId; int; allow null
tblProducts:
Id; int; PK; no null
ProductId; allow null
Price; money; allow null
这是否适合作为查询执行计划? -
第三次更新
这只需要一秒钟来执行 -
SELECT ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
SUM(tblProducts.Price) as price,
OrderDate
FROM tblOrders
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid, OrderDate
这只有2秒钟 -
SELECT * FROM (
SELECT ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
SUM(tblProducts.Price) as price,
MAX(tblOrders.OrderDate) as OrderDate -- do this instead of grouping
FROM tblOrders
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid ) as x
但这需要10秒钟 -
SELECT * FROM (
SELECT ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
SUM(tblProducts.Price) as price,
MAX(tblOrders.OrderDate) as OrderDate -- do this instead of grouping
FROM tblOrders
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid ) as x
where RowNum >= 21001 and RowNum < 21011
where子句增加8秒。为什么呢?
答案 0 :(得分:2)
我敢打赌,在输出列表和分组子句中包含“tblOrders.OrderDate”的甜甜圈会导致你的减速。我建议你SET STATISTICS IO ON
并运行两个查询,看看你是如何得到不同的扫描&amp;寻求每张桌子。
对于考虑OrderDate列的第二个查询,SQL引擎很可能有一个截然不同的计划,导致更多的CPU处理或(更可能)更多的磁盘IO。
答案 1 :(得分:0)
如果没有执行计划,这是无法回答的,但我可以猜到:
更新:您发布的执行计划确实非常可怕。
创建索引:
create unique nonclustered index x0 on tblOrder(orderid) include (OrderDate)
create unique nonclustered index x1 on tblProduct (productid) include (Price)
create nonclustered index x2 on tblOrderDetails(orderid, ProductId)
答案 2 :(得分:0)
什么是OrderDate?约会时间?虽然这些查询看起来非常相似,但我怀疑OrderDate包含时间信息,因此排序和分组要贵得多(并且在第二个查询的子查询中会导致更多行)。
考虑以下变化:
SELECT RowNum, price, DD = DATEADD(DAY, DD, '19000101') FROM (
SELECT ROW_NUMBER() OVER ( ORDER BY tblOrders.orderid ) AS RowNum,
SUM(tblProducts.Price) as price,
DATEDIFF(DAY, '19000101', tblOrders.OrderDate) as DD
FROM tblOrders
LEFT OUTER JOIN tblOrderDetails ON tblOrders.orderid = tblOrderDetails.OrderId
LEFT OUTER JOIN tblProducts ON tblOrderDetails.ProductId = tblProducts.ProductId
GROUP BY tblOrders.orderid, DATEDIFF(DAY, '19000101', tblOrders.OrderDate)
) as x
where RowNum >= 21001 and RowNum < 21011
ORDER BY RowNum;
在SQL Server 2008或更高版本中,您可以将其简化为CONVERT(DATE, OrderDate)
...
答案 3 :(得分:0)
如果没有实际的表结构和执行计划,我无法准确回答,但如果orderid在tblOrders中是唯一的,那么最好从group by语句中删除OrderDate,在select list中将其添加为min(tblOrders.OrderDate) as OrderDate
。它应该给出相同的结果(如果tblOrders.orderid是唯一键)但工作得更好。