Question

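For an author overview, we are looking for a query that lists all authors together with their best book. The problem with this query is that it is slow: there are only about 1500 authors, yet the query that generates the overview currently takes 20 seconds.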

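The main problem seems to be computing the average rating of all books per person. With the following query, which simply picks an arbitrary book for each author, it is still fairly fast: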

select
    person.id as pers_id,
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate) as year,
    thriller.id as thrill_id,
    count(user_rating.id) as nr,
    AVG(user_rating.rating) as avgrating
from 
    thriller 
inner join 
    thriller_form 
    on thriller_form.thriller_id = thriller.id
inner join 
    thriller_person 
    on thriller_person.thriller_id = thriller.id 
    and thriller_person.person_type_id = 1 
inner join 
    person 
    on person.id = thriller_person.person_id
left outer join
    user_rating
    on user_rating.thriller_id = thriller.id 
    and user_rating.rating_type_id = 1
where thriller.id in
    (select top 1 B.id from thriller as B
    inner join thriller_person as C on B.id=C.thriller_id
    and person.id=C.person_id)
group by
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate),
    thriller.id,
    person.id
order by
    person.lastname

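However, if we make the subquery more complex by selecting the book based on its average rating, generating the result set takes a full 20 seconds. The query then looks like this: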

select
    person.id as pers_id,
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate) as year,
    thriller.id as thrill_id,
    count(user_rating.id) as nr,
    AVG(user_rating.rating) as avgrating
from 
    thriller 
inner join 
    thriller_form 
    on thriller_form.thriller_id = thriller.id
inner join 
    thriller_person 
    on thriller_person.thriller_id = thriller.id 
    and thriller_person.person_type_id = 1 
inner join 
    person 
    on person.id = thriller_person.person_id
left outer join
    user_rating
    on user_rating.thriller_id = thriller.id 
    and user_rating.rating_type_id = 1
where thriller.id in
    (select top 1 B.id from thriller as B
    inner join thriller_person as C on B.id=C.thriller_id
    and person.id=C.person_id
    inner join user_rating as D on B.id=D.thriller_id
    group by B.id
    order by AVG(D.rating))
group by
    person.firstname,
    person.suffix,
    person.lastname,
    thriller.title,
    year(thriller.orig_pubdate),
    thriller.id,
    person.id
order by
    person.lastname

Does anyone have a good suggestion for speeding up this query?

Answer 1

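Computing an average requires a table scan, because you have to sum the values and then divide by the number of (relevant) rows. Since the subquery is correlated with the outer query, that in turn means you are doing a lot of rescanning, and that is slow. Can you compute the averages once and store them? Your query could then use those precomputed values. (Yes, this denormalizes the data, but denormalizing for performance is often necessary; there is a trade-off between performance and minimal data.)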

Using a temporary table to store the averages might be appropriate.
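
As a rough illustration of that idea, here is a minimal sketch, not a tested solution. It assumes SQL Server (the question's `top 1` suggests T-SQL) in a version that supports `row_number()`. The temporary table `#thriller_avg`, the CTE name `ranked`, and the aliases `pub_year` and `rn` are made up for this example. It also assumes that "best" means the highest average, hence `desc`, that only ratings with `rating_type_id = 1` should count, and it leaves out the join to `thriller_form`, which may need to be added back if it acts as a filter in your schema.

```sql
-- Step 1: compute the average rating per book once, in a single pass
-- over user_rating, and keep it in a temporary table (hypothetical name).
select
    user_rating.thriller_id,
    count(user_rating.id) as nr,
    avg(user_rating.rating) as avgrating
into #thriller_avg
from user_rating
where user_rating.rating_type_id = 1
group by user_rating.thriller_id;

-- Step 2: rank each author's books by the precomputed average and keep
-- only the top one, instead of re-aggregating ratings for every author.
with ranked as (
    select
        person.id as pers_id,
        person.firstname,
        person.suffix,
        person.lastname,
        thriller.title,
        year(thriller.orig_pubdate) as pub_year,
        thriller.id as thrill_id,
        a.nr,
        a.avgrating,
        row_number() over (
            partition by person.id
            order by a.avgrating desc   -- assumption: highest average is "best"
        ) as rn
    from thriller
    inner join thriller_person
        on thriller_person.thriller_id = thriller.id
        and thriller_person.person_type_id = 1
    inner join person
        on person.id = thriller_person.person_id
    inner join #thriller_avg as a
        on a.thriller_id = thriller.id
)
select
    pers_id, firstname, suffix, lastname,
    title, pub_year, thrill_id, nr, avgrating
from ranked
where rn = 1
order by lastname;

-- Step 3: clean up the temporary table.
drop table #thriller_avg;
```

Because of the inner join to `#thriller_avg`, authors without any rated book drop out of the result, which matches the behaviour of the slow query in the question. Switch it to a left outer join if those authors should still be listed. If the overview is requested often, the same averages could live in a permanent table that is refreshed when ratings change, which is the denormalization trade-off mentioned above.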
