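In case it makes a difference, I'm using Apache Derby 10.8.
I have a very simple database with a table full of items and a table full of bids on those items. I want to select every item, joined with the highest bid on that item. Here is my first attempt, which performs terribly: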
select
item.id as item_id,
item.name as item_name,
item.retail_value as item_retail_value,
item.vendor as item_vendor,
bid.bid_amount as bid_amount,
bid.bidder_name as bid_bidder_name,
bid.bidder_phone as bid_bidder_phone,
bid.operator_name as bid_operator_name
from item
left outer join bid on bid.item_id = item.id and
bid.bid_amount = (select max(bid.bid_amount) from bid where bid.item_id = item.id and bid.status = 'OK')
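I created a set of test data with 282 items and 200 bids per item (56,400 bids in total). The query above takes about 30-40 seconds to run. If I select every item and then loop manually, selecting the high bid for each item, it takes less than a second.
I have tried indexing the bid.bid_amount and bid.status columns, but it made no noticeable difference. SQL is not my strongest area, so if anyone is willing to explain why this query is so slow, I'd really appreciate it.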
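Answer 0 (score: 8)
The query is slow because you are using what is called a correlated subquery: the max subquery is run again for every row of the outer query.
Try something like this: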
select
item.id as item_id,
item.name as item_name,
item.retail_value as item_retail_value,
item.vendor as item_vendor,
bid.bid_amount as bid_amount,
bid.bidder_name as bid_bidder_name,
bid.bidder_phone as bid_bidder_phone,
bid.operator_name as bid_operator_name
from
item
left outer join (
select
item_id,
MAX(bid_amount) maxamount
from
bid
where
status = 'OK'
group by
item_id
) b1 on
item.id = b1.item_id
left outer join bid on
bid.item_id = item.id
and bid.bid_amount = b1.maxamount
This subquery runs only once, so it should be much faster.
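One caveat worth sketching: the final join matches on item and amount only, so a bid whose status is not 'OK' but whose amount happens to equal the per-item maximum would also be returned. The variant below repeats the status filter on that join. It uses only the table and column names from the question, with the column list shortened for brevity, and it has not been timed against the asker's data. If two 'OK' bids tie for the maximum, both versions return one row per tied bid.

```sql
-- Sketch: same shape as the query above, with the status filter repeated
-- on the final join so that only 'OK' bids can match the per-item maximum.
select
    item.id as item_id,
    item.name as item_name,
    bid.bid_amount as bid_amount,
    bid.bidder_name as bid_bidder_name
from
    item
    left outer join (
        -- one row per item: its highest 'OK' bid amount
        select item_id, max(bid_amount) as maxamount
        from bid
        where status = 'OK'
        group by item_id
    ) b1 on item.id = b1.item_id
    left outer join bid on
        bid.item_id = item.id
        and bid.bid_amount = b1.maxamount
        and bid.status = 'OK'
```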
Answer 1 (score: 2)
You have created a synchronized (or correlated) subquery. The subquery is executed once for every row of the outer table (item).
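Answer 2 (score: 1)
The problem is that your nested subquery is run at every step of the JOIN operation. No wonder the query performs poorly; the CPU and disk are probably working hard! Assuming you are trying to get the highest 'OK' bid for each item in the item table, you may want to try this query: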
SELECT I.id AS item_id,
I.name AS item_name,
I.retail_value AS item_retail_value,
I.vendor AS item_vendor,
B.bid_amount AS bid_amount,
B.bidder_name AS bid_bidder_name,
B.bidder_phone AS bid_bidder_phone,
B.operator_name AS bid_operator_name
FROM item AS I
LEFT OUTER JOIN (SELECT item_id, MAX(bid_amount) AS bid_amount
FROM bid
WHERE STATUS = 'OK'
GROUP BY item_id) AS M ON M.item_id = I.id
LEFT OUTER JOIN bid AS B ON B.item_id = M.item_id AND B.bid_amount = M.bid_amount;
Answer 3 (score: 0)
You can also improve the query's performance by adding an index on bid.item_id, since the subquery selects records by item_id.
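As a rough sketch, the index could be created as shown below. The index names are hypothetical; the table and column names come from the question. The second, composite index is an assumption about what might help the status filter and the max lookup, and whether Derby's optimizer actually uses it would need to be checked against the query plan. If bid.item_id is already declared as a foreign key, Derby backs that constraint with an index, so the first statement may be redundant.

```sql
-- Hypothetical index name; speeds up lookups of bids by item.
create index bid_item_id_idx on bid (item_id);

-- Optional alternative (an assumption, not measured): a composite index
-- covering the item lookup, the status filter, and the bid amount.
create index bid_item_status_amount_idx on bid (item_id, status, bid_amount);
```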