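I have a query generated by the SQLAlchemy ORM. It is supposed to retrieve the stream_items for a particular course, together with all of their parts (resources, content text blocks, and so on) and the users who posted them. However, the query is very slow: it takes several minutes on our production database, which has about 20,000 users. The course has about 25 stream_items, each with a few content text blocks. Note that apart from users, there are very few records of any kind in the database, because we imported a large number of users but have very little content.
Edit: note that every object id is a foreign key into the franklin_object table.
I have looked at the query and identified a few troubling bits (going by the EXPLAIN output).
However, I don't really know how to deal with these issues, especially the last two.
Here is the query: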
SELECT stream_item.id AS stream_item_id,
franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
stream_item.parent_id AS stream_item_parent_id,
stream_item.shown_at AS stream_item_shown_at,
stream_item.author_id AS stream_item_author_id,
stream_item.stream_sort_at AS stream_item_stream_sort_at,
stream_item.PUBLIC AS stream_item_public,
stream_item.created_at AS stream_item_created_at,
stream_item.updated_at AS stream_item_updated_at,
anon_1.content_text_block_text AS anon_1_content_text_block_text,
anon_2.resource_id AS anon_2_resource_id,
anon_2.franklin_object_id AS anon_2_franklin_object_id,
anon_2.franklin_object_type AS anon_2_franklin_object_type,
anon_2.franklin_object_uuid AS anon_2_franklin_object_uuid,
anon_2.resource_top_parent_resource AS anon_2_resource_top_parent_resource,
anon_2.resource_top_parent_id AS anon_2_resource_top_parent_id,
anon_2.resource_title AS anon_2_resource_title,
anon_2.resource_url AS anon_2_resource_url,
anon_2.resource_image AS anon_2_resource_image,
anon_2.resource_created_at AS anon_2_resource_created_at,
anon_2.resource_updated_at AS anon_2_resource_updated_at,
franklin_object_1.id AS franklin_object_1_id,
franklin_object_1.type AS franklin_object_1_type,
franklin_object_1.uuid AS franklin_object_1_uuid,
anon_1.content_text_block_id AS anon_1_content_text_block_id,
anon_1.franklin_object_id AS anon_1_franklin_object_id,
anon_1.franklin_object_type AS anon_1_franklin_object_type,
anon_1.franklin_object_uuid AS anon_1_franklin_object_uuid,
anon_1.content_text_block_position AS anon_1_content_text_block_position,
anon_1.content_text_block_franklin_object_id AS anon_1_content_text_block_franklin_object_id,
anon_1.content_text_block_created_at AS anon_1_content_text_block_created_at,
anon_1.content_text_block_updated_at AS anon_1_content_text_block_updated_at,
anon_3.user_password AS anon_3_user_password,
anon_3.user_auth_token AS anon_3_user_auth_token,
anon_3.user_id AS anon_3_user_id,
anon_3.franklin_object_id AS anon_3_franklin_object_id,
anon_3.franklin_object_type AS anon_3_franklin_object_type,
anon_3.franklin_object_uuid AS anon_3_franklin_object_uuid,
anon_3.user_email AS anon_3_user_email,
anon_3.user_auth_token_expiration AS anon_3_user_auth_token_expiration,
anon_3.user_active AS anon_3_user_active,
anon_3.user_activation_token AS anon_3_user_activation_token,
anon_3.user_first_name AS anon_3_user_first_name,
anon_3.user_last_name AS anon_3_user_last_name,
anon_3.user_image AS anon_3_user_image,
anon_3.user_bio AS anon_3_user_bio,
anon_3.user_aspirations AS anon_3_user_aspirations,
anon_3.user_website AS anon_3_user_website,
anon_3.user_resume AS anon_3_user_resume,
anon_3.user_resume_name AS anon_3_user_resume_name,
anon_3.user_primary_role AS anon_3_user_primary_role,
anon_3.user_institution_id AS anon_3_user_institution_id,
anon_3.user_birth_date AS anon_3_user_birth_date,
anon_3.user_gender AS anon_3_user_gender,
anon_3.user_graduation_year AS anon_3_user_graduation_year,
anon_3.user_complete AS anon_3_user_complete,
anon_3.user_masthead_y_position AS anon_3_user_masthead_y_position,
anon_3.user_masthead AS anon_3_user_masthead,
anon_3.user_fb_access_token AS anon_3_user_fb_access_token,
anon_3.user_fb_user_id AS anon_3_user_fb_user_id,
anon_3.user_location AS anon_3_user_location,
anon_3.user_created_at AS anon_3_user_created_at,
anon_3.user_updated_at AS anon_3_user_updated_at,
anon_4.content_text_block_text AS anon_4_content_text_block_text,
anon_4.content_text_block_id AS anon_4_content_text_block_id,
anon_4.franklin_object_id AS anon_4_franklin_object_id,
anon_4.franklin_object_type AS anon_4_franklin_object_type,
anon_4.franklin_object_uuid AS anon_4_franklin_object_uuid,
anon_4.content_text_block_position AS anon_4_content_text_block_position,
anon_4.content_text_block_franklin_object_id AS anon_4_content_text_block_franklin_object_id,
anon_4.content_text_block_created_at AS anon_4_content_text_block_created_at,
anon_4.content_text_block_updated_at AS anon_4_content_text_block_updated_at,
anon_5.user_password AS anon_5_user_password,
anon_5.user_auth_token AS anon_5_user_auth_token,
anon_5.user_id AS anon_5_user_id,
anon_5.franklin_object_id AS anon_5_franklin_object_id,
anon_5.franklin_object_type AS anon_5_franklin_object_type,
anon_5.franklin_object_uuid AS anon_5_franklin_object_uuid,
anon_5.user_email AS anon_5_user_email,
anon_5.user_auth_token_expiration AS anon_5_user_auth_token_expiration,
anon_5.user_active AS anon_5_user_active,
anon_5.user_activation_token AS anon_5_user_activation_token,
anon_5.user_first_name AS anon_5_user_first_name,
anon_5.user_last_name AS anon_5_user_last_name,
anon_5.user_image AS anon_5_user_image,
anon_5.user_bio AS anon_5_user_bio,
anon_5.user_aspirations AS anon_5_user_aspirations,
anon_5.user_website AS anon_5_user_website,
anon_5.user_resume AS anon_5_user_resume,
anon_5.user_resume_name AS anon_5_user_resume_name,
anon_5.user_primary_role AS anon_5_user_primary_role,
anon_5.user_institution_id AS anon_5_user_institution_id,
anon_5.user_birth_date AS anon_5_user_birth_date,
anon_5.user_gender AS anon_5_user_gender,
anon_5.user_graduation_year AS anon_5_user_graduation_year,
anon_5.user_complete AS anon_5_user_complete,
anon_5.user_masthead_y_position AS anon_5_user_masthead_y_position,
anon_5.user_masthead AS anon_5_user_masthead,
anon_5.user_fb_access_token AS anon_5_user_fb_access_token,
anon_5.user_fb_user_id AS anon_5_user_fb_user_id,
anon_5.user_location AS anon_5_user_location,
anon_5.user_created_at AS anon_5_user_created_at,
anon_5.user_updated_at AS anon_5_user_updated_at,
anon_6.stream_item_id AS anon_6_stream_item_id,
anon_6.franklin_object_id AS anon_6_franklin_object_id,
anon_6.franklin_object_type AS anon_6_franklin_object_type,
anon_6.franklin_object_uuid AS anon_6_franklin_object_uuid,
anon_6.stream_item_parent_id AS anon_6_stream_item_parent_id,
anon_6.stream_item_shown_at AS anon_6_stream_item_shown_at,
anon_6.stream_item_author_id AS anon_6_stream_item_author_id,
anon_6.stream_item_stream_sort_at AS anon_6_stream_item_stream_sort_at,
anon_6.stream_item_public AS anon_6_stream_item_public,
anon_6.stream_item_created_at AS anon_6_stream_item_created_at,
anon_6.stream_item_updated_at AS anon_6_stream_item_updated_at
FROM franklin_object
INNER JOIN stream_item
ON franklin_object.id = stream_item.id
INNER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
content_text_block.id AS content_text_block_id,
content_text_block.text AS content_text_block_text,
content_text_block.position AS content_text_block_position,
content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
content_text_block.created_at AS content_text_block_created_at,
content_text_block.updated_at AS content_text_block_updated_at
FROM franklin_object
INNER JOIN content_text_block
ON franklin_object.id = content_text_block.id) AS anon_1
ON stream_item.id = anon_1.content_text_block_franklin_object_id
LEFT OUTER JOIN contents_resources AS contents_resources_1
ON anon_1.content_text_block_id = contents_resources_1.content_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
resource.id AS resource_id,
resource.top_parent_resource AS resource_top_parent_resource,
resource.top_parent_id AS resource_top_parent_id,
resource.title AS resource_title,
resource.url AS resource_url,
resource.image AS resource_image,
resource.created_at AS resource_created_at,
resource.updated_at AS resource_updated_at
FROM franklin_object
INNER JOIN resource
ON franklin_object.id = resource.id) AS anon_2
ON anon_2.resource_id = contents_resources_1.resource_id
LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_1
ON anon_1.content_text_block_id = contents_franklin_objects_1.content_id
LEFT OUTER JOIN franklin_object AS franklin_object_1
ON franklin_object_1.id = contents_franklin_objects_1.franklin_object_id
LEFT OUTER JOIN likers AS likers_1
ON stream_item.id = likers_1.post_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
USER.id AS user_id,
USER.email AS user_email,
USER.password AS user_password,
USER.auth_token AS user_auth_token,
USER.auth_token_expiration AS user_auth_token_expiration,
USER.active AS user_active,
USER.activation_token AS user_activation_token,
USER.first_name AS user_first_name,
USER.last_name AS user_last_name,
USER.image AS user_image,
USER.bio AS user_bio,
USER.aspirations AS user_aspirations,
USER.website AS user_website,
USER.resume AS user_resume,
USER.resume_name AS user_resume_name,
USER.primary_role AS user_primary_role,
USER.institution_id AS user_institution_id,
USER.birth_date AS user_birth_date,
USER.gender AS user_gender,
USER.graduation_year AS user_graduation_year,
USER.complete AS user_complete,
USER.masthead_y_position AS user_masthead_y_position,
USER.masthead AS user_masthead,
USER.fb_access_token AS user_fb_access_token,
USER.fb_user_id AS user_fb_user_id,
USER.location AS user_location,
USER.created_at AS user_created_at,
USER.updated_at AS user_updated_at
FROM franklin_object
INNER JOIN USER
ON franklin_object.id = USER.id) AS anon_3
ON anon_3.user_id = likers_1.user_id
LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_2
ON franklin_object.id = contents_franklin_objects_2.franklin_object_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
content_text_block.id AS content_text_block_id,
content_text_block.text AS content_text_block_text,
content_text_block.position AS content_text_block_position,
content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
content_text_block.created_at AS content_text_block_created_at,
content_text_block.updated_at AS content_text_block_updated_at
FROM franklin_object
INNER JOIN content_text_block
ON franklin_object.id = content_text_block.id) AS anon_4
ON anon_4.content_text_block_id = contents_franklin_objects_2.content_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
stream_item.id AS stream_item_id,
stream_item.parent_id AS stream_item_parent_id,
stream_item.shown_at AS stream_item_shown_at,
stream_item.author_id AS stream_item_author_id,
stream_item.stream_sort_at AS stream_item_stream_sort_at,
stream_item.PUBLIC AS stream_item_public,
stream_item.created_at AS stream_item_created_at,
stream_item.updated_at AS stream_item_updated_at
FROM franklin_object
INNER JOIN stream_item
ON franklin_object.id = stream_item.id) AS anon_6
ON anon_6.stream_item_parent_id = franklin_object.id
LEFT OUTER JOIN likers AS likers_2
ON anon_6.stream_item_id = likers_2.post_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
USER.id AS user_id,
USER.email AS user_email,
USER.password AS user_password,
USER.auth_token AS user_auth_token,
USER.auth_token_expiration AS user_auth_token_expiration,
USER.active AS user_active,
USER.activation_token AS user_activation_token,
USER.first_name AS user_first_name,
USER.last_name AS user_last_name,
USER.image AS user_image,
USER.bio AS user_bio,
USER.aspirations AS user_aspirations,
USER.website AS user_website,
USER.resume AS user_resume,
USER.resume_name AS user_resume_name,
USER.primary_role AS user_primary_role,
USER.institution_id AS user_institution_id,
USER.birth_date AS user_birth_date,
USER.gender AS user_gender,
USER.graduation_year AS user_graduation_year,
USER.complete AS user_complete,
USER.masthead_y_position AS user_masthead_y_position,
USER.masthead AS user_masthead,
USER.fb_access_token AS user_fb_access_token,
USER.fb_user_id AS user_fb_user_id,
USER.location AS user_location,
USER.created_at AS user_created_at,
USER.updated_at AS user_updated_at
FROM franklin_object
INNER JOIN USER
ON franklin_object.id = USER.id) AS anon_5
ON anon_5.user_id = likers_2.user_id
WHERE stream_item.parent_id = 11
ORDER BY stream_item.stream_sort_at DESC,
anon_1.content_text_block_position,
anon_6.stream_item_stream_sort_at DESC
EXPLAIN output:
ID SELECT_TYPE TABLE TYPE POSSIBLE_KEYS KEY KEY_LEN REF ROWS EXTRA
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 599 Using temporary; Using filesort
1 PRIMARY stream_item eq_ref PRIMARY,parent_id PRIMARY 4 anon_1.content_text_block_franklin_object_id 1 Using where
1 PRIMARY contents_resources_1 ref content_id content_id 5 anon_1.content_text_block_id 2
1 PRIMARY <derived3> ALL NULL NULL NULL NULL 7
1 PRIMARY contents_franklin_objects_1 ref content_id content_id 5 anon_1.content_text_block_id 1
1 PRIMARY franklin_object eq_ref PRIMARY PRIMARY 4 franklin.stream_item.id 1 Using where
1 PRIMARY franklin_object_1 eq_ref PRIMARY PRIMARY 4 franklin.contents_franklin_objects_1.franklin_object_id 1
1 PRIMARY likers_1 ref post_id post_id 5 franklin.stream_item.id 1
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 136
1 PRIMARY contents_franklin_objects_2 ref franklin_object_id franklin_object_id 5 franklin.stream_item.id 1
1 PRIMARY <derived5> ALL NULL NULL NULL NULL 599
1 PRIMARY <derived6> ALL NULL NULL NULL NULL 608
1 PRIMARY likers_2 ref post_id post_id 5 anon_6.stream_item_id 1
1 PRIMARY <derived7> ALL NULL NULL NULL NULL 136
7 DERIVED user ALL PRIMARY NULL NULL NULL 133
7 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.user.id 1
6 DERIVED stream_item ALL PRIMARY NULL NULL NULL 709
6 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.stream_item.id 1
5 DERIVED content_text_block ALL PRIMARY NULL NULL NULL 666
5 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.content_text_block.id 1
4 DERIVED user ALL PRIMARY NULL NULL NULL 133
4 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.user.id 1
3 DERIVED resource ALL PRIMARY NULL NULL NULL 7
3 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.resource.id 1
2 DERIVED content_text_block ALL PRIMARY NULL NULL NULL 666
2 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.content_text_block.id 1
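How can I cut all of this down to something faster? What other ways are there to speed it up?
Is the way franklin_objects is set up an anti-pattern? The way it works is that the franklin_object table has two columns: id and type. Each type then has its own table, whose primary key is a foreign key to franklin_object.
The code that generates the SQL looks something like this: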
stream_item_query = StreamItem.query.options(
    db.joinedload('stream_items'),
    db.joinedload('contents_included_in'),
    db.joinedload('contents.resources'),
    db.joinedload('contents.objects'),
    db.subqueryload('likers'),
)
stream_items = (
    stream_item_query
    .filter(StreamItem.parent_id == community_id)
    .order_by(db.desc(StreamItem.stream_sort_at))
    .all()
)
Answer 0 (score: 7)
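My advice is to rethink your whole approach.
SQLAlchemy is a very good tool and I'm not knocking it (or your choice of MySQL), but as with most ORM tools you need to consider the cost of using it. One example is this franklin_object table business. Is it an anti-pattern? Yes and no. From a pure OO perspective it makes sense: you can determine which table to query by looking up an id in this table. From a relational query perspective it serves little purpose. I could remove every instance of franklin_object from your query and lose nothing but the columns that come from franklin_object. If that is a viable option, I would do it immediately.
Let's examine this link to franklin_object further. Look at the subqueries; they all have the same form: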
(SELECT franklin_object.id AS franklin_object_id,
        franklin_object.type AS franklin_object_type,
        franklin_object.uuid AS franklin_object_uuid,
        linked_table.id AS linked_table_id,
        linked_table.col2 AS col2 -- and more
 FROM franklin_object
 INNER JOIN linked_table
 ON franklin_object.id = linked_table.id) AS anon_n
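The database has very little information to go on when optimizing this part of the query, regardless of statistics. Maybe the query would do better if franklin_object were restricted by specifying type in the where clause. Maybe.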
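To make the point about these subqueries concrete, here is a minimal sketch using Python's built-in sqlite3 module and a cut-down, hypothetical version of the schema (a handful of columns, made-up rows). It is not the original MySQL setup and says nothing about MySQL timings. It only illustrates that, as long as every user row has a matching franklin_object row, joining the base tables directly returns the same rows as joining the anon_n-style derived table, so the derived table (scanned with type ALL and no key in the EXPLAIN output above) is not needed to get the result.

```python
import sqlite3

# Hypothetical, heavily reduced schema; table and column names follow the
# question, but the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE franklin_object (id INTEGER PRIMARY KEY, type TEXT, uuid TEXT);
    CREATE TABLE user (id INTEGER PRIMARY KEY REFERENCES franklin_object(id),
                       email TEXT);
    CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
    INSERT INTO franklin_object VALUES (1, 'user', 'u-1'), (2, 'user', 'u-2'),
                                       (11, 'stream_item', 's-11');
    INSERT INTO user VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO likers VALUES (11, 2);
""")

# Shape produced by the ORM: likers joined to a derived table.
derived_sql = """
    SELECT anon_3.user_id, anon_3.user_email, anon_3.franklin_object_uuid
    FROM likers AS likers_1
    LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
                            franklin_object.uuid AS franklin_object_uuid,
                            user.id AS user_id,
                            user.email AS user_email
                     FROM franklin_object
                     INNER JOIN user ON franklin_object.id = user.id) AS anon_3
        ON anon_3.user_id = likers_1.user_id
    WHERE likers_1.post_id = ?
"""

# Flattened shape: the same tables joined directly, no derived table.
# Equivalent only because each user row has a franklin_object row.
flat_sql = """
    SELECT user.id, user.email, franklin_object.uuid
    FROM likers AS likers_1
    LEFT OUTER JOIN user ON user.id = likers_1.user_id
    LEFT OUTER JOIN franklin_object ON franklin_object.id = user.id
    WHERE likers_1.post_id = ?
"""

derived_rows = conn.execute(derived_sql, (11,)).fetchall()
flat_rows = conn.execute(flat_sql, (11,)).fetchall()
print(derived_rows == flat_rows)
```

Whether SQLAlchemy can be persuaded to emit the flattened form depends on how the inheritance mapping is configured, which the question does not show.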
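This is especially problematic for the USER table, because that table has a lot of records (so you say). Since you are selecting most of its columns and the optimizer cannot accurately estimate how many rows will be retrieved, a full table scan makes sense. In your case, two of them.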
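Another aspect is the sheer number of joins involved. Even if we take out every franklin_object reference, there are still 11 joins. That wouldn't be too bad if your data model were more relational, but it isn't. The generated query gives the database little help in working out the best way to execute it, so it doesn't do a good job. Maybe you could mitigate this with hints and the like, but I'd wager that would become a burden in the long run.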
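You are using an ORM tool, so really use it. You gain nothing by doing such a big query all at once. Break it apart for the sake of performance. Use lazy retrieval to avoid large, complex queries. I'd say try doing everything lazily, just to see how it goes. Performance will probably be OK, I'd say better. Not great, maybe not even acceptable, but better than being able to go for a coffee while the database churns.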
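As a rough illustration of what "lazy" means here, the sketch below uses Python's built-in sqlite3 module with a hypothetical, reduced schema and invented rows. The `LazyStreamItem` class and `stream_items_for` helper are made up for this example and are not part of SQLAlchemy. The first query fetches only the stream items, and each item's text blocks are loaded by a small second query the first time they are accessed.

```python
import sqlite3

# Hypothetical reduced schema with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER,
                              stream_sort_at TEXT);
    CREATE TABLE content_text_block (id INTEGER PRIMARY KEY,
                                     franklin_object_id INTEGER,
                                     position INTEGER, text TEXT);
    INSERT INTO stream_item VALUES (20, 11, '2013-01-02'),
                                   (21, 11, '2013-01-03'),
                                   (30, 12, '2013-01-04');
    INSERT INTO content_text_block VALUES (100, 20, 0, 'hello'),
                                          (101, 20, 1, 'world'),
                                          (102, 21, 0, 'second post');
""")


class LazyStreamItem:
    """Illustrative stand-in for an ORM object with a lazily loaded relation."""

    def __init__(self, conn, item_id):
        self.conn = conn
        self.id = item_id
        self._contents = None  # nothing loaded yet

    @property
    def contents(self):
        # Run the small follow-up query only on first access, then reuse it.
        if self._contents is None:
            rows = self.conn.execute(
                "SELECT text FROM content_text_block "
                "WHERE franklin_object_id = ? ORDER BY position",
                (self.id,),
            )
            self._contents = [text for (text,) in rows]
        return self._contents


def stream_items_for(conn, parent_id):
    # One narrow query: just the items of the community, newest first.
    rows = conn.execute(
        "SELECT id FROM stream_item WHERE parent_id = ? "
        "ORDER BY stream_sort_at DESC",
        (parent_id,),
    )
    return [LazyStreamItem(conn, item_id) for (item_id,) in rows]


items = stream_items_for(conn, 11)
for item in items:
    print(item.id, item.contents)
```

With SQLAlchemy itself, the analogous experiment would presumably be to drop the joinedload options so that relationships load on first access, assuming the mappings have not been configured to load eagerly. This costs one extra query per item, which is why it is a starting point rather than the end state.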
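Then start piecing things together into leaner chunks. Bind together objects that logically belong together, such as resource and contents_resources. As another example, the join between stream_item, likers and user is duplicated. Make one query of it and let SQLAlchemy do its thing.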
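One way to read "make one query of it" is sketched below with Python's built-in sqlite3 module, again on a hypothetical reduced schema with invented rows. The `likers_by_post` helper is made up for this example. Instead of joining likers and user once for the top-level items and a second time for their replies, a single focused query fetches the likers for every post id of interest, and the rows are grouped in Python.

```python
import sqlite3
from collections import defaultdict

# Hypothetical reduced schema with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE user (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
    INSERT INTO stream_item VALUES (20, 11), (21, 11), (40, 20);
    INSERT INTO user VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO likers VALUES (20, 1), (20, 2), (40, 2);
""")


def child_ids(conn, parent_ids):
    """Ids of the stream items whose parent is one of parent_ids."""
    if not parent_ids:
        return []
    placeholders = ",".join("?" for _ in parent_ids)
    rows = conn.execute(
        f"SELECT id FROM stream_item WHERE parent_id IN ({placeholders}) "
        "ORDER BY id",
        list(parent_ids),
    )
    return [item_id for (item_id,) in rows]


def likers_by_post(conn, post_ids):
    """One likers/user join for all posts, grouped by post id."""
    if not post_ids:
        return {}
    placeholders = ",".join("?" for _ in post_ids)
    rows = conn.execute(
        "SELECT likers.post_id, user.first_name "
        "FROM likers INNER JOIN user ON user.id = likers.user_id "
        f"WHERE likers.post_id IN ({placeholders}) "
        "ORDER BY likers.post_id, user.id",
        list(post_ids),
    )
    grouped = defaultdict(list)
    for post_id, first_name in rows:
        grouped[post_id].append(first_name)
    return dict(grouped)


top_ids = child_ids(conn, [11])      # items posted in community 11
reply_ids = child_ids(conn, top_ids)  # replies to those items
likes = likers_by_post(conn, top_ids + reply_ids)
print(likes)
```

The whole stream then takes three small queries, each of which can use an index on a single column, rather than one query with the likers/user join repeated at both levels.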
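As a last resort, you could implement some kind of caching mechanism, or perhaps denormalize the tables somewhere. On a slowly changing, read-heavy system you could feed these tables into another structure where the queries are straightforward and fast. That is, do the processing up front and store the result in a single table.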
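A minimal sketch of that last idea, using Python's built-in sqlite3 module: the `stream_feed` table, its columns, and the `rebuild_stream_feed` helper are all hypothetical names invented for this example, and the rows are made up. The joins are paid once when the table is rebuilt, and the read path becomes a single-table lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER,
                              author_id INTEGER, stream_sort_at TEXT);
    CREATE TABLE user (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE content_text_block (id INTEGER PRIMARY KEY,
                                     franklin_object_id INTEGER,
                                     position INTEGER, text TEXT);
    -- Hypothetical read-optimized table: one row per text block, with the
    -- author name already copied in.
    CREATE TABLE stream_feed (community_id INTEGER, stream_item_id INTEGER,
                              stream_sort_at TEXT, author_name TEXT,
                              position INTEGER, text TEXT);
    CREATE INDEX stream_feed_lookup
        ON stream_feed (community_id, stream_sort_at);
    INSERT INTO stream_item VALUES (20, 11, 1, '2013-01-02'),
                                   (21, 11, 2, '2013-01-03');
    INSERT INTO user VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO content_text_block VALUES (100, 20, 0, 'hello'),
                                          (101, 20, 1, 'world'),
                                          (102, 21, 0, 'second post');
""")


def rebuild_stream_feed(conn):
    """Recompute the denormalized table from the normalized ones."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM stream_feed")
        conn.execute("""
            INSERT INTO stream_feed
            SELECT si.parent_id, si.id, si.stream_sort_at,
                   u.first_name, ctb.position, ctb.text
            FROM stream_item AS si
            INNER JOIN user AS u ON u.id = si.author_id
            INNER JOIN content_text_block AS ctb
                ON ctb.franklin_object_id = si.id
        """)


rebuild_stream_feed(conn)

# The read path no longer joins anything.
feed = conn.execute(
    "SELECT stream_item_id, author_name, text FROM stream_feed "
    "WHERE community_id = ? ORDER BY stream_sort_at DESC, position",
    (11,),
).fetchall()
print(feed)
```

The trade-off is staleness: the table has to be rebuilt (or updated incrementally) whenever the underlying data changes, which is why this fits slowly changing, read-heavy data.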
Good luck.