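I can add more detail if needed, but basically I am running into problems querying a large table (over 100 million rows). My queries take minutes to complete. Most of the data is historical and never changes (for example, last year's sales data). I use this data in other reports, where I have been able to "roll up" the data every night into a new table grouped by month, year, and so on. The report I am building now, however, has many dynamic elements, such as a custom time/date picker, which make that kind of rollup difficult.
I guess my question is: does anyone have much experience with large tables and dynamic queries?
I have also done what research I can and made sure my database server is well provisioned. It currently has 16 GB of RAM and a 12 GB InnoDB buffer pool. (I am not an expert here, so let me know if there is anything else I should look at.)
Thanks for any help, and again, let me know if you need specifics about my use case.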
SELECT mainaccounts.account_id AS 'ACCOUNTID',
(
SELECT name
FROM activitysettings
WHERE org_id = '5a1da86ed6ea7c6000e45e82'
AND id = '5a1da86ed6ea7c6000e45e8e' ) AS 'ACTIVITYNAME',
(
SELECT Count(DISTINCT a.id)
FROM activity a
WHERE a.org_id = '5a1da86ed6ea7c6000e45e82'
AND (
a.started_at BETWEEN '2018-01-01' AND '2018-02-01')
AND a.status = true
AND a.account_id = mainaccounts.account_id
GROUP BY a.account_id ) AS 'ACTIVITYTHIS',
(
SELECT Count(DISTINCT b.id)
FROM activity b
WHERE b.org_id = '5a1da86ed6ea7c6000e45e82'
AND (
b.started_at BETWEEN '2017-01-01' AND '2017-02-01')
AND b.status = true
AND b.account_id = mainaccounts.account_id
AND b.activity_id = '5a1da86ed6ea7c6000e45e8e'
GROUP BY b.account_id ) AS 'ACTIVITYLAST',
ifnull(
(
SELECT Sum(s1.volumece)
FROM sales s1
WHERE s1.org_id = '5a1da86ed6ea7c6000e45e82'
AND (
s1.invoice_date BETWEEN '2018-01-01' AND '2018-02-01')
AND s1.status = TRUE
AND s1.account_id = mainaccounts.account_id
GROUP BY s1.account_id ),
0) AS 'SALESTHIS', ifnull(
(
SELECT sum(s2.volumece)
FROM sales s2
WHERE s2.org_id = '5a1da86ed6ea7c6000e45e82'
AND (
s2.invoice_date BETWEEN '2017-01-01' AND '2017-02-01')
AND s2.status = TRUE
AND s2.account_id = mainaccounts.account_id
GROUP BY s2.account_id ),
0) AS 'SALESLAST', @podthis := ifnull(
(
SELECT sum(s1.units)
FROM sales s1
WHERE s1.org_id = '5a1da86ed6ea7c6000e45e82'
AND (
s1.invoice_date BETWEEN '2018-01-01' AND '2018-02-01')
AND s1.status = TRUE
AND s1.account_id = mainaccounts.account_id
GROUP BY s1.account_id ),
0) AS 'UNITSTHIS', @podlast :=ifnull(
(
SELECT sum(s2.units)
FROM sales s2
WHERE s2.org_id = '5a1da86ed6ea7c6000e45e82'
AND (
s2.invoice_date BETWEEN '2017-01-01' AND '2017-02-01')
AND s2.status = TRUE
AND s2.account_id = mainaccounts.account_id
GROUP BY s2.account_id ),0) AS 'UNITSLAST',
CASE
WHEN (
@podthis IS NULL
OR @podthis <= 0) THEN 0
ELSE 1
end AS 'ISPODTHIS',
CASE
WHEN (
@podlast IS NULL
OR @podlast <= 0) THEN 0
ELSE 1
END AS 'ISPODLAST'
FROM activity mainaccounts
WHERE mainaccounts.org_id = '5a1da86ed6ea7c6000e45e82'
AND mainaccounts.started_at BETWEEN '2018-12-01' AND '2018-12-31'
AND mainaccounts.status = TRUE
AND mainaccounts.activity_id = '5a1da86ed6ea7c6000e45e8e'
GROUP BY account_id
I have a lot of indexes, so please ask if there is a specific one you think is needed or would help.
Answer 0 (score: 1)
Summarize to the day. Any date range can then be had by summarizing the summary table.
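For other "dynamic" things, you need to have already built summary tables that include the dynamic columns that are likely to be used, with "enough" indexes on those summary tables. Then add some smarts to the interface so that it picks the appropriate summary table.
In my experience (doing what you describe on multiple projects), it was always reasonable to decide which columns were needed in the summary tables, and even to tailor the UI page to steer users toward those choices. Sometimes a new request would come in; I would then knock out new code to summarize the raw data into a new summary table (or augment an existing one), knock out the UI, and be done.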
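The day-level summary idea above can be sketched roughly as follows. This is only an illustration under assumptions: the table name sales_daily, its column types, and the nightly schedule are hypothetical, while the source columns (org_id, account_id, invoice_date, status, volumece, units) are taken from the query in the question.

```sql
-- Hypothetical daily summary table (name and column types are assumptions).
CREATE TABLE sales_daily (
    org_id      CHAR(24)      NOT NULL,  -- type assumed from the 24-character ids
    account_id  CHAR(24)      NOT NULL,  -- type assumed
    invoice_day DATE          NOT NULL,
    volumece    DECIMAL(18,4) NOT NULL,  -- precision assumed
    units       DECIMAL(18,4) NOT NULL,  -- precision assumed
    PRIMARY KEY (org_id, account_id, invoice_day)
) ENGINE=InnoDB;

-- Nightly job: roll up yesterday's rows into one row per org, account, and day.
-- Re-running it overwrites the same rows instead of duplicating them.
INSERT INTO sales_daily (org_id, account_id, invoice_day, volumece, units)
SELECT org_id,
       account_id,
       DATE(invoice_date),
       COALESCE(SUM(volumece), 0),
       COALESCE(SUM(units), 0)
FROM sales
WHERE status = TRUE
  AND invoice_date >= CURDATE() - INTERVAL 1 DAY
  AND invoice_date <  CURDATE()
GROUP BY org_id, account_id, DATE(invoice_date)
ON DUPLICATE KEY UPDATE
    volumece = VALUES(volumece),
    units    = VALUES(units);

-- Report query: any date range picked in the UI is a SUM over daily rows.
SELECT account_id,
       SUM(volumece) AS sales_this,
       SUM(units)    AS units_this
FROM sales_daily
WHERE org_id = '5a1da86ed6ea7c6000e45e82'
  AND invoice_day >= '2018-01-01'
  AND invoice_day <  '2018-02-01'
GROUP BY account_id;
```

Sums and plain counts roll up cleanly from daily rows. A COUNT(DISTINCT ...) only rolls up this way if each counted value falls on exactly one day, so that part of the report may need separate handling.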
Side issues...
What is index2, and what data types are involved? I am concerned about the func in the EXPLAIN.
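The range
started_at BETWEEN '2017-01-01' AND '2017-02-01'
If the target is a DATE, you have 32 days. If it is a DATETIME, you have 31 days plus one second (the one extra midnight). I recommend the following pattern; it works for all date types and avoids leap-year (and similar) hassles: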
started_at >= '2017-01-01'
AND started_at < '2017-01-01' + INTERVAL 1 MONTH
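As a small illustration of the point above, the first statement shows why the BETWEEN range covers 32 calendar days on a DATE column, and the second applies the half-open pattern with the start date supplied by a date picker. The user variable @range_start is a hypothetical stand-in for whatever value the UI passes in.

```sql
-- BETWEEN is inclusive at both ends, so a DATE column matches
-- 2017-01-01 through 2017-02-01.
SELECT DATEDIFF('2017-02-01', '2017-01-01') + 1 AS days_matched;  -- 32

-- Half-open range: exactly one month, whatever the column's date type.
SET @range_start = '2017-01-01';  -- hypothetical value from the date picker

SELECT COUNT(*)
FROM activity
WHERE org_id = '5a1da86ed6ea7c6000e45e82'
  AND started_at >= @range_start
  AND started_at <  @range_start + INTERVAL 1 MONTH;
```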
mainaccounts needs
INDEX(org_id, status, activity_id, -- in any order
started_at) -- after the others
That is, put the columns tested with = first, then one "range" column.
activity needs
INDEX(org_id, status, account_id, activity_id, -- in any order, then
started_at)
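A sketch of how the two suggested indexes might be created. The index names are hypothetical, and since mainaccounts is an alias for the activity table in the question's query, both indexes go on activity. Running EXPLAIN afterwards is one way to check whether the optimizer picks them up.

```sql
-- Index names are assumptions; the column lists follow the suggestions above.
ALTER TABLE activity
    ADD INDEX idx_org_status_act_started
        (org_id, status, activity_id, started_at),
    ADD INDEX idx_org_status_acct_act_started
        (org_id, status, account_id, activity_id, started_at);

-- Check the plan for the outer part of the report query.
EXPLAIN
SELECT account_id
FROM activity
WHERE org_id = '5a1da86ed6ea7c6000e45e82'
  AND status = TRUE
  AND activity_id = '5a1da86ed6ea7c6000e45e8e'
  AND started_at >= '2018-12-01'
  AND started_at <  '2018-12-01' + INTERVAL 1 MONTH
GROUP BY account_id;
```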