How to make a subquery fast

Time: 2011-02-09 13:38:10

Tags: subquery aggregate-functions

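For an author overview, we are looking for a query that shows all authors together with their best book. The problem with this query is its lack of speed: there are only about 1,500 authors, yet the query that generates the overview currently takes 20 seconds.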

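The main problem seems to be computing the average rating of all books per person. With the following query, which picks one book per author without looking at the ratings, it is still fairly fast: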

select
    person.id as pers_id,
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate) as year,
    thriller.id as thrill_id,
    count(user_rating.id) as nr,
    AVG(user_rating.rating) as avgrating
from 
    thriller 
inner join 
    thriller_form 
    on thriller_form.thriller_id = thriller.id
inner join 
    thriller_person 
    on thriller_person.thriller_id = thriller.id 
    and thriller_person.person_type_id = 1 
inner join 
    person 
    on person.id = thriller_person.person_id
left outer join
    user_rating
    on user_rating.thriller_id = thriller.id 
    and user_rating.rating_type_id = 1
where thriller.id in
    (select top 1 B.id from thriller as B
    inner join thriller_person as C on B.id=C.thriller_id
    and person.id=C.person_id)
group by
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate),
    thriller.id,
    person.id
order by
    person.lastname

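However, if we make the subquery more complex by selecting the book with the best average rating, generating the result set takes a full 20 seconds. The query then looks like this: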

select
    person.id as pers_id,
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate) as year,
    thriller.id as thrill_id,
    count(user_rating.id) as nr,
    AVG(user_rating.rating) as avgrating
from 
    thriller 
inner join 
    thriller_form 
    on thriller_form.thriller_id = thriller.id
inner join 
    thriller_person 
    on thriller_person.thriller_id = thriller.id 
    and thriller_person.person_type_id = 1 
inner join 
    person 
    on person.id = thriller_person.person_id
left outer join
    user_rating
    on user_rating.thriller_id = thriller.id 
    and user_rating.rating_type_id = 1
where thriller.id in
    (select top 1 B.id from thriller as B
    inner join thriller_person as C on B.id=C.thriller_id
    and person.id=C.person_id
    inner join user_rating as D on B.id=D.thriller_id
    group by B.id
    order by AVG(D.rating) desc) -- desc so that top 1 returns the highest-rated book
group by
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate),
    thriller.id,
    person.id
order by
    person.lastname

Does anyone have a good suggestion for speeding up this query?

1 Answer:

Answer 0: (score: 2)

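Calculating the average requires a table scan, since you have to sum the values and then divide by the number of (relevant) rows. Because the subquery is correlated, that in turn means you are doing a lot of rescanning, and that is slow. Can you calculate the averages once and store them? That would let your query use those precomputed values. (Yes, it denormalizes the data, but denormalizing for performance is often necessary; there is a trade-off between performance and minimal data.)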

It might be appropriate to use a temporary table as the store for the averages.
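
As a rough sketch of that idea (not tested against the real schema), the averages could be computed once into a temporary table and then reused to pick each author's best book. The temp table name #thriller_avg and the use of row_number() are assumptions made for illustration; row_number() needs SQL Server 2005 or later. The sketch also assumes that only ratings with rating_type_id = 1 should count, and it turns the thriller_form join into an exists check so that it filters books without multiplying the rating counts.

```sql
-- Step 1: compute the rating count and average per book exactly once.
-- #thriller_avg is a hypothetical temp table name.
select
    user_rating.thriller_id,
    count(user_rating.id) as nr,
    avg(user_rating.rating) as avgrating
into #thriller_avg
from user_rating
where user_rating.rating_type_id = 1
group by user_rating.thriller_id;

-- Step 2: rank each author's books by the stored average
-- and keep only the highest-rated one (rn = 1).
select
    ranked.pers_id,
    ranked.firstname,
    ranked.suffix,
    ranked.lastname,
    ranked.title,
    ranked.[year],
    ranked.thrill_id,
    ranked.nr,
    ranked.avgrating
from (
    select
        person.id as pers_id,
        person.firstname,
        person.suffix,
        person.lastname,
        thriller.title,
        year(thriller.orig_pubdate) as [year],
        thriller.id as thrill_id,
        a.nr,
        a.avgrating,
        row_number() over (
            partition by person.id
            order by a.avgrating desc, thriller.id  -- thriller.id breaks ties
        ) as rn
    from thriller
    inner join thriller_person
        on thriller_person.thriller_id = thriller.id
        and thriller_person.person_type_id = 1
    inner join person
        on person.id = thriller_person.person_id
    inner join #thriller_avg as a
        on a.thriller_id = thriller.id
    -- keep only books that have at least one thriller_form row
    where exists (
        select 1
        from thriller_form
        where thriller_form.thriller_id = thriller.id
    )
) as ranked
where ranked.rn = 1
order by ranked.lastname;

-- Step 3: clean up the temporary table.
drop table #thriller_avg;
```

Like the second query in the question, this leaves out authors whose books have no ratings at all, because of the inner join on #thriller_avg. Whether it is actually faster depends on the data and the indexes, so it is worth comparing the execution plans before adopting it.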