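I'm looking for help optimizing this MySQL query. It takes a very long time to run because the two tables in main_activity are both huge (more than 10 million rows each!). main_db.members and main_db.customers have roughly 400K and 600K rows respectively.
Edit:
Following the suggestion to use a temporary table, I should add that I'm running the query against a read-only database, so a temporary table may be a problem. What optimizations can I make without using a temporary table?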
select distinct
    a.members_id,
    a.customer_id,
    a.subscription,
    a.buy_date,
    from_unixtime((max(m2.sales_date) / 1000), '%m/%d/%Y') as sales_date,
    a.return_date,
    a.signup_date,
    from_unixtime((max(st.visit_date) / 1000), '%m/%d/%Y') as visit_date
from (select distinct
        m1.members_id,
        m1.customer_id,
        m1.subscription,
        from_unixtime((m1.buy_date / 1000), '%m/%d/%Y') as buy_date,
        from_unixtime((m1.return_date / 1000), '%m/%d/%Y') as return_date,
        from_unixtime((c.signup_date / 1000), '%m/%d/%Y') as signup_date
    from main_db.members m1
    join main_db.customer c on c.global_members_id = m1.members_id
) as a
left join main_db.members m2 on m2.customer_id = a.customer_id
left join main_activity.onlinevisit s on s.customer_id = a.customer_id
left join main_activity.storevisit st on st.visit_id = s.visit_id
Answer 0 (score: 0)
The idea is to create a temporary table with a good key. We can start with this:
-- Step 1: materialize the derived table once, keyed on customer_id.
create temporary table a (key(customer_id)) select distinct
    m1.members_id,
    m1.customer_id,
    m1.subscription,
    from_unixtime((m1.buy_date / 1000), '%m/%d/%Y') as buy_date,
    from_unixtime((m1.return_date / 1000), '%m/%d/%Y') as return_date,
    from_unixtime((c.signup_date / 1000), '%m/%d/%Y') as signup_date
from main_db.members m1
join main_db.customer c on c.global_members_id = m1.members_id;

-- Step 2: run the outer query against the temporary table.
select distinct
    a.members_id,
    a.customer_id,
    a.subscription,
    a.buy_date,
    from_unixtime((max(m2.sales_date) / 1000), '%m/%d/%Y') as sales_date,
    a.return_date,
    a.signup_date,
    from_unixtime((max(st.visit_date) / 1000), '%m/%d/%Y') as visit_date
from a
left join main_db.members m2 on m2.customer_id = a.customer_id
left join main_activity.onlinevisit s on s.customer_id = a.customer_id
left join main_activity.storevisit st on st.visit_id = s.visit_id;
You also need to make sure the other tables have good keys as well.
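One way to check whether a join can use a key, without changing anything on the server, is to run EXPLAIN on one join at a time. The following is only a sketch: the customer id 12345 is a made-up value, and it assumes customer_id is a numeric column.

```sql
-- EXPLAIN only reports the execution plan; it does not modify any data,
-- so it should also be usable on a read-only server.
EXPLAIN
SELECT s.customer_id, st.visit_date
FROM main_activity.onlinevisit s
LEFT JOIN main_activity.storevisit st ON st.visit_id = s.visit_id
WHERE s.customer_id = 12345;  -- hypothetical customer id, for illustration only
```

In the output, a row with type = ALL and key = NULL means that table is being scanned in full, which on a table of more than 10 million rows usually points at a missing index.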
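Answer 1 (score: 0)
Please provide the SHOW CREATE TABLE output for the tables involved.
I would expect there to be indexes on
m2.customer_id
s.customer_id
st.visit_id
If any of them is missing, that could be a major performance problem.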
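For reference, the table definitions and their indexes can be inspected with statements like the ones below. This is a sketch using the table names from the question; the index name in the final comment is hypothetical.

```sql
-- Show the full definition of each table, including its keys.
SHOW CREATE TABLE main_db.members;
SHOW CREATE TABLE main_activity.onlinevisit;
SHOW CREATE TABLE main_activity.storevisit;

-- Alternatively, list only the indexes of a single table.
SHOW INDEX FROM main_activity.storevisit;

-- If an index turns out to be missing, adding it requires write access,
-- so on a read-only replica it would have to be done on the primary, e.g.:
--   ALTER TABLE main_db.members ADD INDEX idx_members_customer_id (customer_id);
-- (idx_members_customer_id is a made-up name.)
```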
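The use of DISTINCT implies that the JOINs are multiplying the number of rows and that you then need to deflate the result again. Does each query work correctly without DISTINCT? Eliminating it would save a pass over the data.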
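A rough way to see how much the inner join inflates the row count is to compare the count before and after DISTINCT. This sketch reuses the column list of the derived table from the question; the alias d is arbitrary.

```sql
-- Rows produced by the inner join before DISTINCT is applied.
SELECT COUNT(*) AS joined_rows
FROM main_db.members m1
JOIN main_db.customer c ON c.global_members_id = m1.members_id;

-- Rows left after DISTINCT over the same expressions as the derived table.
SELECT COUNT(*) AS distinct_rows
FROM (
    SELECT DISTINCT
        m1.members_id,
        m1.customer_id,
        m1.subscription,
        from_unixtime((m1.buy_date / 1000), '%m/%d/%Y') AS buy_date,
        from_unixtime((m1.return_date / 1000), '%m/%d/%Y') AS return_date,
        from_unixtime((c.signup_date / 1000), '%m/%d/%Y') AS signup_date
    FROM main_db.members m1
    JOIN main_db.customer c ON c.global_members_id = m1.members_id
) AS d;
```

If the two counts are equal, the DISTINCT in the derived table is not removing anything and can be dropped.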
Another way to avoid the inflate-deflate overhead is to replace
max(m2.sales_date)
with
( SELECT max(m2.sales_date)
FROM main_db.members m2
WHERE m2.customer_id = a.customer_id )
(and similarly for max(st.visit_date)).
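Putting that suggestion together, the whole query might look like the sketch below. It assumes the intent is one output row per row of the derived table a, and it still depends on the indexes mentioned above for the subqueries to be fast. It creates no temporary table explicitly, so it may also suit a read-only database.

```sql
SELECT
    a.members_id,
    a.customer_id,
    a.subscription,
    a.buy_date,
    -- Latest sale for this customer, looked up per row instead of via a join.
    from_unixtime((
        SELECT max(m2.sales_date)
        FROM main_db.members m2
        WHERE m2.customer_id = a.customer_id
    ) / 1000, '%m/%d/%Y') AS sales_date,
    a.return_date,
    a.signup_date,
    -- Latest store visit for this customer; MAX ignores NULLs, so an inner
    -- join inside the subquery yields the same maximum as the LEFT JOINs did.
    from_unixtime((
        SELECT max(st.visit_date)
        FROM main_activity.onlinevisit s
        JOIN main_activity.storevisit st ON st.visit_id = s.visit_id
        WHERE s.customer_id = a.customer_id
    ) / 1000, '%m/%d/%Y') AS visit_date
FROM (
    SELECT DISTINCT
        m1.members_id,
        m1.customer_id,
        m1.subscription,
        from_unixtime((m1.buy_date / 1000), '%m/%d/%Y') AS buy_date,
        from_unixtime((m1.return_date / 1000), '%m/%d/%Y') AS return_date,
        from_unixtime((c.signup_date / 1000), '%m/%d/%Y') AS signup_date
    FROM main_db.members m1
    JOIN main_db.customer c ON c.global_members_id = m1.members_id
) AS a;
```

Because each MAX() now lives inside a scalar subquery, the outer query no longer mixes aggregates with non-aggregated columns, so it needs neither a GROUP BY nor the outer DISTINCT. A customer with no sales or visits simply gets NULL in the corresponding column.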