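We have a MySQL 5.5 RDS instance with a 2 TB SSD and 30 GB of RAM. Recently we noticed that certain index-backed queries suddenly take a very long time (30+ seconds) and eventually time out.
Query: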
select
count(distinct case when saletype = ? then sale.id end) as total_sales,
sum(final_tax) as total_tax_amount,
sum(final_total_ex_tax) as total_ex_tax_amount,
sum(final_tax + final_total_ex_tax) as total_amount,
max(final_tax + final_total_ex_tax) as best_sale
from platform_staff
join sale on sale.userid = platform_staff.id
join sale_lines on sale.numeric_id = sale_lines.sale_id
where (((platform_staff.merchant_id = ?)
and (saledate between ? and ?))
and (sale.site_id in ?))
EXPLAIN:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|-------|------|---------------|-----|---------|-----|------|-------|
| 1 | SIMPLE | sale | ref | numeric_id_UNIQUE,fk_sale_user1,index_sale_site_id,index_sale_date | index_sale_site_id | 5 | const | 172426 | Using where |
| 1 | SIMPLE | platform_staff | eq_ref | PRIMARY,id_unique,fk_staff_merchant1 | PRIMARY | 108 | kountadata001.sale.UserID | 1 | Using where |
| 1 | SIMPLE | sale_lines | ref | index_sale_line_sale_id | index_sale_line_sale_id | 4 | kountadata001.sale.numeric_id | 1 | Using where |
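Interestingly, if I break the query into pieces and run the smaller queries several times, the original query eventually completes without timing out. If I then wait a few minutes, I have to go through the whole process again.
This suggests to me that pages are being evicted from memory shortly after they are used, but the buffer pool statistics only tell me so much: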
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 24069734400; in additional pool allocated 0
Dictionary memory allocated 2156919
Buffer pool size 1435456
Free buffers 0
Database pages
Old database pages 504601
Modified db pages 126657
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 11708661104, not young 0
788.62 youngs/s, 0.00 non-youngs/s
Pages read 2360851192, created 69081262, written 4068758211
285.58 reads/s, 3.63 creates/s, 178.86 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 2 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 1367017, unzip_LRU len: 0
I/O sum[23143]:cur[9], unzip sum[0]:cur[0]
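To put the figures above into more familiar units, here is a rough back-of-the-envelope conversion. It assumes the default InnoDB page size of 16 KiB, which the status output does not state, so treat the results as estimates only. The helper names are made up for this sketch.

```python
# Rough conversion of the InnoDB status figures above into sizes.
# ASSUMPTION: 16 KiB pages (the InnoDB default); the output does not state it.
PAGE_SIZE_BYTES = 16 * 1024


def pages_to_gib(pages: int) -> float:
    """Convert a page count into GiB under the assumed page size."""
    return pages * PAGE_SIZE_BYTES / 1024**3


def pages_per_sec_to_mib(rate: float) -> float:
    """Convert a pages-per-second rate into MiB/s under the assumed page size."""
    return rate * PAGE_SIZE_BYTES / 1024**2


# Values copied from the status output above.
buffer_pool_pages = 1435456   # "Buffer pool size"
lru_len = 1367017             # "LRU len"
old_pages = 504601            # "Old database pages"
reads_per_sec = 285.58        # "reads/s"

buffer_pool_gib = pages_to_gib(buffer_pool_pages)
read_mib_per_sec = pages_per_sec_to_mib(reads_per_sec)
old_fraction = old_pages / lru_len

print(f"buffer pool size : {buffer_pool_gib:.2f} GiB")
print(f"physical reads   : {read_mib_per_sec:.2f} MiB/s")
print(f"old sublist share: {old_fraction:.1%} of the LRU list")
```

Under that assumption the buffer pool is a little under 22 GiB with no free buffers, and the old sublist share is consistent with the default innodb_old_blocks_pct of 37.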
Is it possible to tell InnoDB to keep the pages of a specific table in memory, despite read/write activity on other tables?