How can I optimize this query generated by SQLAlchemy?

Asked: 2012-07-16 18:03:58

Tags: mysql sql orm sqlalchemy

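I have a query generated by the SQLAlchemy ORM. It is supposed to retrieve the stream_items for a particular course along with all of their parts (resources, content text blocks, and so on) and the users who posted them. However, the query is very slow: it takes several minutes on our production database, which has about 20,000 users, with about 25 stream_items in the course and a few content text blocks per stream_item. Note that apart from users there are very few records of any other kind in the database, because we imported a large number of users but have very little content.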

Edit: Note that every object id is a foreign key into the franklin_object table.

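I've tried looking at the query and have identified a few troubling bits (from the EXPLAIN output):

  1. One of the lookups is 'Using temporary; Using filesort'.
  2. The user table is hit twice, without an index.
  3. The content text block table is hit twice, without an index.

However, I don't really know how to address these, especially the last two.

Here is the query: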

    SELECT stream_item.id                               AS stream_item_id,
           franklin_object.id                           AS franklin_object_id,
           franklin_object.type                         AS franklin_object_type,
           franklin_object.uuid                         AS franklin_object_uuid,
           stream_item.parent_id                        AS stream_item_parent_id,
           stream_item.shown_at                         AS stream_item_shown_at,
           stream_item.author_id                        AS stream_item_author_id,
           stream_item.stream_sort_at                   AS stream_item_stream_sort_at,
           stream_item.PUBLIC                           AS stream_item_public,
           stream_item.created_at                       AS stream_item_created_at,
           stream_item.updated_at                       AS stream_item_updated_at,
           anon_1.content_text_block_text               AS anon_1_content_text_block_text,
           anon_2.resource_id                           AS anon_2_resource_id,
           anon_2.franklin_object_id                    AS anon_2_franklin_object_id,
           anon_2.franklin_object_type                  AS anon_2_franklin_object_type,
           anon_2.franklin_object_uuid                  AS anon_2_franklin_object_uuid,
           anon_2.resource_top_parent_resource          AS anon_2_resource_top_parent_resource,
           anon_2.resource_top_parent_id                AS anon_2_resource_top_parent_id,
           anon_2.resource_title                        AS anon_2_resource_title,
           anon_2.resource_url                          AS anon_2_resource_url,
           anon_2.resource_image                        AS anon_2_resource_image,
           anon_2.resource_created_at                   AS anon_2_resource_created_at,
           anon_2.resource_updated_at                   AS anon_2_resource_updated_at,
           franklin_object_1.id                         AS franklin_object_1_id,
           franklin_object_1.type                       AS franklin_object_1_type,
           franklin_object_1.uuid                       AS franklin_object_1_uuid,
           anon_1.content_text_block_id                 AS anon_1_content_text_block_id,
           anon_1.franklin_object_id                    AS anon_1_franklin_object_id,
           anon_1.franklin_object_type                  AS anon_1_franklin_object_type,
           anon_1.franklin_object_uuid                  AS anon_1_franklin_object_uuid,
           anon_1.content_text_block_position           AS anon_1_content_text_block_position,
           anon_1.content_text_block_franklin_object_id AS anon_1_content_text_block_franklin_object_id,
           anon_1.content_text_block_created_at         AS anon_1_content_text_block_created_at,
           anon_1.content_text_block_updated_at         AS anon_1_content_text_block_updated_at,
           anon_3.user_password                         AS anon_3_user_password,
           anon_3.user_auth_token                       AS anon_3_user_auth_token,
           anon_3.user_id                               AS anon_3_user_id,
           anon_3.franklin_object_id                    AS anon_3_franklin_object_id,
           anon_3.franklin_object_type                  AS anon_3_franklin_object_type,
           anon_3.franklin_object_uuid                  AS anon_3_franklin_object_uuid,
           anon_3.user_email                            AS anon_3_user_email,
           anon_3.user_auth_token_expiration            AS anon_3_user_auth_token_expiration,
           anon_3.user_active                           AS anon_3_user_active,
           anon_3.user_activation_token                 AS anon_3_user_activation_token,
           anon_3.user_first_name                       AS anon_3_user_first_name,
           anon_3.user_last_name                        AS anon_3_user_last_name,
           anon_3.user_image                            AS anon_3_user_image,
           anon_3.user_bio                              AS anon_3_user_bio,
           anon_3.user_aspirations                      AS anon_3_user_aspirations,
           anon_3.user_website                          AS anon_3_user_website,
           anon_3.user_resume                           AS anon_3_user_resume,
           anon_3.user_resume_name                      AS anon_3_user_resume_name,
           anon_3.user_primary_role                     AS anon_3_user_primary_role,
           anon_3.user_institution_id                   AS anon_3_user_institution_id,
           anon_3.user_birth_date                       AS anon_3_user_birth_date,
           anon_3.user_gender                           AS anon_3_user_gender,
           anon_3.user_graduation_year                  AS anon_3_user_graduation_year,
           anon_3.user_complete                         AS anon_3_user_complete,
           anon_3.user_masthead_y_position              AS anon_3_user_masthead_y_position,
           anon_3.user_masthead                         AS anon_3_user_masthead,
           anon_3.user_fb_access_token                  AS anon_3_user_fb_access_token,
           anon_3.user_fb_user_id                       AS anon_3_user_fb_user_id,
           anon_3.user_location                         AS anon_3_user_location,
           anon_3.user_created_at                       AS anon_3_user_created_at,
           anon_3.user_updated_at                       AS anon_3_user_updated_at,
           anon_4.content_text_block_text               AS anon_4_content_text_block_text,
           anon_4.content_text_block_id                 AS anon_4_content_text_block_id,
           anon_4.franklin_object_id                    AS anon_4_franklin_object_id,
           anon_4.franklin_object_type                  AS anon_4_franklin_object_type,
           anon_4.franklin_object_uuid                  AS anon_4_franklin_object_uuid,
           anon_4.content_text_block_position           AS anon_4_content_text_block_position,
           anon_4.content_text_block_franklin_object_id AS anon_4_content_text_block_franklin_object_id,
           anon_4.content_text_block_created_at         AS anon_4_content_text_block_created_at,
           anon_4.content_text_block_updated_at         AS anon_4_content_text_block_updated_at,
           anon_5.user_password                         AS anon_5_user_password,
           anon_5.user_auth_token                       AS anon_5_user_auth_token,
           anon_5.user_id                               AS anon_5_user_id,
           anon_5.franklin_object_id                    AS anon_5_franklin_object_id,
           anon_5.franklin_object_type                  AS anon_5_franklin_object_type,
           anon_5.franklin_object_uuid                  AS anon_5_franklin_object_uuid,
           anon_5.user_email                            AS anon_5_user_email,
           anon_5.user_auth_token_expiration            AS anon_5_user_auth_token_expiration,
           anon_5.user_active                           AS anon_5_user_active,
           anon_5.user_activation_token                 AS anon_5_user_activation_token,
           anon_5.user_first_name                       AS anon_5_user_first_name,
           anon_5.user_last_name                        AS anon_5_user_last_name,
           anon_5.user_image                            AS anon_5_user_image,
           anon_5.user_bio                              AS anon_5_user_bio,
           anon_5.user_aspirations                      AS anon_5_user_aspirations,
           anon_5.user_website                          AS anon_5_user_website,
           anon_5.user_resume                           AS anon_5_user_resume,
           anon_5.user_resume_name                      AS anon_5_user_resume_name,
           anon_5.user_primary_role                     AS anon_5_user_primary_role,
           anon_5.user_institution_id                   AS anon_5_user_institution_id,
           anon_5.user_birth_date                       AS anon_5_user_birth_date,
           anon_5.user_gender                           AS anon_5_user_gender,
           anon_5.user_graduation_year                  AS anon_5_user_graduation_year,
           anon_5.user_complete                         AS anon_5_user_complete,
           anon_5.user_masthead_y_position              AS anon_5_user_masthead_y_position,
           anon_5.user_masthead                         AS anon_5_user_masthead,
           anon_5.user_fb_access_token                  AS anon_5_user_fb_access_token,
           anon_5.user_fb_user_id                       AS anon_5_user_fb_user_id,
           anon_5.user_location                         AS anon_5_user_location,
           anon_5.user_created_at                       AS anon_5_user_created_at,
           anon_5.user_updated_at                       AS anon_5_user_updated_at,
           anon_6.stream_item_id                        AS anon_6_stream_item_id,
           anon_6.franklin_object_id                    AS anon_6_franklin_object_id,
           anon_6.franklin_object_type                  AS anon_6_franklin_object_type,
           anon_6.franklin_object_uuid                  AS anon_6_franklin_object_uuid,
           anon_6.stream_item_parent_id                 AS anon_6_stream_item_parent_id,
           anon_6.stream_item_shown_at                  AS anon_6_stream_item_shown_at,
           anon_6.stream_item_author_id                 AS anon_6_stream_item_author_id,
           anon_6.stream_item_stream_sort_at            AS anon_6_stream_item_stream_sort_at,
           anon_6.stream_item_public                    AS anon_6_stream_item_public,
           anon_6.stream_item_created_at                AS anon_6_stream_item_created_at,
           anon_6.stream_item_updated_at                AS anon_6_stream_item_updated_at
    FROM   franklin_object
           INNER JOIN stream_item
                   ON franklin_object.id = stream_item.id
           INNER JOIN (SELECT franklin_object.id                    AS franklin_object_id,
                              franklin_object.type                  AS franklin_object_type,
                              franklin_object.uuid                  AS franklin_object_uuid,
                              content_text_block.id                 AS content_text_block_id,
                              content_text_block.text               AS content_text_block_text,
                              content_text_block.position           AS content_text_block_position,
                              content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
                              content_text_block.created_at         AS content_text_block_created_at,
                              content_text_block.updated_at         AS content_text_block_updated_at
                       FROM   franklin_object
                              INNER JOIN content_text_block
                                      ON franklin_object.id = content_text_block.id) AS anon_1
                   ON stream_item.id = anon_1.content_text_block_franklin_object_id
           LEFT OUTER JOIN contents_resources AS contents_resources_1
                        ON anon_1.content_text_block_id = contents_resources_1.content_id
           LEFT OUTER JOIN (SELECT franklin_object.id           AS franklin_object_id,
                                   franklin_object.type         AS franklin_object_type,
                                   franklin_object.uuid         AS franklin_object_uuid,
                                   resource.id                  AS resource_id,
                                   resource.top_parent_resource AS resource_top_parent_resource,
                                   resource.top_parent_id       AS resource_top_parent_id,
                                   resource.title               AS resource_title,
                                   resource.url                 AS resource_url,
                                   resource.image               AS resource_image,
                                   resource.created_at          AS resource_created_at,
                                   resource.updated_at          AS resource_updated_at
                            FROM   franklin_object
                                   INNER JOIN resource
                                           ON franklin_object.id = resource.id) AS anon_2
                        ON anon_2.resource_id = contents_resources_1.resource_id
           LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_1
                        ON anon_1.content_text_block_id = contents_franklin_objects_1.content_id
           LEFT OUTER JOIN franklin_object AS franklin_object_1
                        ON franklin_object_1.id = contents_franklin_objects_1.franklin_object_id
           LEFT OUTER JOIN likers AS likers_1
                        ON stream_item.id = likers_1.post_id
           LEFT OUTER JOIN (SELECT franklin_object.id         AS franklin_object_id,
                                   franklin_object.type       AS franklin_object_type,
                                   franklin_object.uuid       AS franklin_object_uuid,
                                   USER.id                    AS user_id,
                                   USER.email                 AS user_email,
                                   USER.password              AS user_password,
                                   USER.auth_token            AS user_auth_token,
                                   USER.auth_token_expiration AS user_auth_token_expiration,
                                   USER.active                AS user_active,
                                   USER.activation_token      AS user_activation_token,
                                   USER.first_name            AS user_first_name,
                                   USER.last_name             AS user_last_name,
                                   USER.image                 AS user_image,
                                   USER.bio                   AS user_bio,
                                   USER.aspirations           AS user_aspirations,
                                   USER.website               AS user_website,
                                   USER.resume                AS user_resume,
                                   USER.resume_name           AS user_resume_name,
                                   USER.primary_role          AS user_primary_role,
                                   USER.institution_id        AS user_institution_id,
                                   USER.birth_date            AS user_birth_date,
                                   USER.gender                AS user_gender,
                                   USER.graduation_year       AS user_graduation_year,
                                   USER.complete              AS user_complete,
                                   USER.masthead_y_position   AS user_masthead_y_position,
                                   USER.masthead              AS user_masthead,
                                   USER.fb_access_token       AS user_fb_access_token,
                                   USER.fb_user_id            AS user_fb_user_id,
                                   USER.location              AS user_location,
                                   USER.created_at            AS user_created_at,
                                   USER.updated_at            AS user_updated_at
                            FROM   franklin_object
                                   INNER JOIN USER
                                           ON franklin_object.id = USER.id) AS anon_3
                        ON anon_3.user_id = likers_1.user_id
           LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_2
                        ON franklin_object.id = contents_franklin_objects_2.franklin_object_id
           LEFT OUTER JOIN (SELECT franklin_object.id                    AS franklin_object_id,
                                   franklin_object.type                  AS franklin_object_type,
                                   franklin_object.uuid                  AS franklin_object_uuid,
                                   content_text_block.id                 AS content_text_block_id,
                                   content_text_block.text               AS content_text_block_text,
                                   content_text_block.position           AS content_text_block_position,
                                   content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
                                   content_text_block.created_at         AS content_text_block_created_at,
                                   content_text_block.updated_at         AS content_text_block_updated_at
                            FROM   franklin_object
                                   INNER JOIN content_text_block
                                           ON franklin_object.id = content_text_block.id) AS anon_4
                        ON anon_4.content_text_block_id = contents_franklin_objects_2.content_id
           LEFT OUTER JOIN (SELECT franklin_object.id         AS franklin_object_id,
                                   franklin_object.type       AS franklin_object_type,
                                   franklin_object.uuid       AS franklin_object_uuid,
                                   stream_item.id             AS stream_item_id,
                                   stream_item.parent_id      AS stream_item_parent_id,
                                   stream_item.shown_at       AS stream_item_shown_at,
                                   stream_item.author_id      AS stream_item_author_id,
                                   stream_item.stream_sort_at AS stream_item_stream_sort_at,
                                   stream_item.PUBLIC         AS stream_item_public,
                                   stream_item.created_at     AS stream_item_created_at,
                                   stream_item.updated_at     AS stream_item_updated_at
                            FROM   franklin_object
                                   INNER JOIN stream_item
                                           ON franklin_object.id = stream_item.id) AS anon_6
                        ON anon_6.stream_item_parent_id = franklin_object.id
           LEFT OUTER JOIN likers AS likers_2
                        ON anon_6.stream_item_id = likers_2.post_id
           LEFT OUTER JOIN (SELECT franklin_object.id         AS franklin_object_id,
                                   franklin_object.type       AS franklin_object_type,
                                   franklin_object.uuid       AS franklin_object_uuid,
                                   USER.id                    AS user_id,
                                   USER.email                 AS user_email,
                                   USER.password              AS user_password,
                                   USER.auth_token            AS user_auth_token,
                                   USER.auth_token_expiration AS user_auth_token_expiration,
                                   USER.active                AS user_active,
                                   USER.activation_token      AS user_activation_token,
                                   USER.first_name            AS user_first_name,
                                   USER.last_name             AS user_last_name,
                                   USER.image                 AS user_image,
                                   USER.bio                   AS user_bio,
                                   USER.aspirations           AS user_aspirations,
                                   USER.website               AS user_website,
                                   USER.resume                AS user_resume,
                                   USER.resume_name           AS user_resume_name,
                                   USER.primary_role          AS user_primary_role,
                                   USER.institution_id        AS user_institution_id,
                                   USER.birth_date            AS user_birth_date,
                                   USER.gender                AS user_gender,
                                   USER.graduation_year       AS user_graduation_year,
                                   USER.complete              AS user_complete,
                                   USER.masthead_y_position   AS user_masthead_y_position,
                                   USER.masthead              AS user_masthead,
                                   USER.fb_access_token       AS user_fb_access_token,
                                   USER.fb_user_id            AS user_fb_user_id,
                                   USER.location              AS user_location,
                                   USER.created_at            AS user_created_at,
                                   USER.updated_at            AS user_updated_at
                            FROM   franklin_object
                                   INNER JOIN USER
                                           ON franklin_object.id = USER.id) AS anon_5
                        ON anon_5.user_id = likers_2.user_id
    WHERE  stream_item.parent_id = 11
    ORDER  BY stream_item.stream_sort_at DESC,
              anon_1.content_text_block_position,
              anon_6.stream_item_stream_sort_at DESC 
    

The EXPLAIN output:

    id  select_type  table                        type    possible_keys       key                 key_len  ref                                                      rows  Extra
    1   PRIMARY      <derived2>                   ALL     NULL                NULL                NULL     NULL                                                     599   Using temporary; Using filesort
    1   PRIMARY      stream_item                  eq_ref  PRIMARY,parent_id   PRIMARY             4        anon_1.content_text_block_franklin_object_id             1     Using where
    1   PRIMARY      contents_resources_1         ref     content_id          content_id          5        anon_1.content_text_block_id                             2
    1   PRIMARY      <derived3>                   ALL     NULL                NULL                NULL     NULL                                                     7
    1   PRIMARY      contents_franklin_objects_1  ref     content_id          content_id          5        anon_1.content_text_block_id                             1
    1   PRIMARY      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.stream_item.id                                  1     Using where
    1   PRIMARY      franklin_object_1            eq_ref  PRIMARY             PRIMARY             4        franklin.contents_franklin_objects_1.franklin_object_id  1
    1   PRIMARY      likers_1                     ref     post_id             post_id             5        franklin.stream_item.id                                  1
    1   PRIMARY      <derived4>                   ALL     NULL                NULL                NULL     NULL                                                     136
    1   PRIMARY      contents_franklin_objects_2  ref     franklin_object_id  franklin_object_id  5        franklin.stream_item.id                                  1
    1   PRIMARY      <derived5>                   ALL     NULL                NULL                NULL     NULL                                                     599
    1   PRIMARY      <derived6>                   ALL     NULL                NULL                NULL     NULL                                                     608
    1   PRIMARY      likers_2                     ref     post_id             post_id             5        anon_6.stream_item_id                                    1
    1   PRIMARY      <derived7>                   ALL     NULL                NULL                NULL     NULL                                                     136
    7   DERIVED      user                         ALL     PRIMARY             NULL                NULL     NULL                                                     133
    7   DERIVED      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.user.id                                         1
    6   DERIVED      stream_item                  ALL     PRIMARY             NULL                NULL     NULL                                                     709
    6   DERIVED      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.stream_item.id                                  1
    5   DERIVED      content_text_block           ALL     PRIMARY             NULL                NULL     NULL                                                     666
    5   DERIVED      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.content_text_block.id                           1
    4   DERIVED      user                         ALL     PRIMARY             NULL                NULL     NULL                                                     133
    4   DERIVED      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.user.id                                         1
    3   DERIVED      resource                     ALL     PRIMARY             NULL                NULL     NULL                                                     7
    3   DERIVED      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.resource.id                                     1
    2   DERIVED      content_text_block           ALL     PRIMARY             NULL                NULL     NULL                                                     666
    2   DERIVED      franklin_object              eq_ref  PRIMARY             PRIMARY             4        franklin.content_text_block.id                           1
    

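How can I cut all of these lookups down to something faster? Are there other ways to speed this up?

Is the way franklin_objects is set up an anti-pattern? The way it works is that the franklin_object table has two columns: id and type. Each type is then its own table, whose primary key is a foreign key to franklin_object.

The code that generates the SQL looks something like this: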

    stream_item_query = StreamItem.query.options(db.joinedload('stream_items'),db.joinedload('contents_included_in'),db.joinedload('contents.resources'),db.joinedload('contents.objects'),db.subqueryload('likers'))

    stream_items = stream_item_query.filter(StreamItem.parent_id == community_id).order_by(db.desc(StreamItem.stream_sort_at)).all()

1 answer:

Answer 0 (score: 7)

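Wow, this one hurt my brain a little. Trying to figure out what the query is doing, what all the tables are, and how they relate is tedious. If you had a similar experience, that is the first hint that you may be trying to do too much in this single query.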

My advice is to rethink your whole approach.

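SQLAlchemy is a very nice tool, and I'm not going to knock it (or your choice of MySQL), but as with most ORM tools, you need to consider the cost of using it. One example is this franklin_object table business. Is it an anti-pattern? Yes and no. From a pure OO perspective it makes sense: you can determine which table to query by looking up an id in this table. From a relational query perspective, it serves very little purpose. I could remove every instance of franklin_object from your query and lose nothing except the columns that come from franklin_object. If that is a viable option, I would do it immediately.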

Let's examine this link to franklin_object further. Looking at the subqueries, they all have the same form:

    (SELECT franklin_object.id           AS franklin_object_id,
            franklin_object.type         AS franklin_object_type,
            franklin_object.uuid         AS franklin_object_uuid,
            linked_table.id              AS linked_table_id,
            linked_table.col2            AS col2 -- and more
     FROM   franklin_object
            INNER JOIN linked_table
                    ON franklin_object.id = linked_table.id) AS anon_n

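The database doesn't have much information on how to optimize this part of the query, regardless of statistics. Perhaps the query would do better if you restricted franklin_object by specifying the type in the WHERE clause. Perhaps.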

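This is especially problematic for the USER table, since that table has a lot of records (so you say). Because you are querying most of its columns, and the optimizer cannot work out exactly how many rows will be retrieved, a full table scan makes sense. In your case, twice.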
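To make the point about the derived tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema is a cut-down, hypothetical version of franklin_object, user, and likers with only a few columns, and the sample rows are invented. It shows that joining user directly returns the same user columns as joining through the franklin_object subquery, as long as every user row has a matching franklin_object row (which the foreign key is assumed to guarantee). SQLite's planner is not MySQL's, so this illustrates that the results are equivalent, not how much faster the direct join would be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE franklin_object (id INTEGER PRIMARY KEY, type TEXT, uuid TEXT);
    CREATE TABLE "user" (id INTEGER PRIMARY KEY REFERENCES franklin_object(id),
                         email TEXT);
    CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
    INSERT INTO franklin_object VALUES (1, 'user', 'u-1'), (2, 'user', 'u-2'),
                                       (3, 'stream_item', 's-3');
    INSERT INTO "user" VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO likers VALUES (3, 1), (3, 2);
""")

# Shape of the generated query: user is wrapped in a derived table
# that joins it to franklin_object first.
derived_sql = """
    SELECT likers.post_id, anon.user_id, anon.user_email
    FROM likers
    LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
                            "user".id          AS user_id,
                            "user".email       AS user_email
                     FROM franklin_object
                     INNER JOIN "user" ON franklin_object.id = "user".id) AS anon
                 ON anon.user_id = likers.user_id
    ORDER BY anon.user_id
"""

# Same lookup with franklin_object removed: a plain join on the primary key.
direct_sql = """
    SELECT likers.post_id, "user".id AS user_id, "user".email AS user_email
    FROM likers
    LEFT OUTER JOIN "user" ON "user".id = likers.user_id
    ORDER BY "user".id
"""

derived_rows = conn.execute(derived_sql).fetchall()
direct_rows = conn.execute(direct_sql).fetchall()
print(derived_rows == direct_rows)
```

The direct form also gives the database a primary-key join it can use an index for, instead of a derived table it has to build first.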

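Another aspect is the sheer number of joins involved. If we take out all the franklin_object references, there are still 11 joins. That wouldn't be too bad if your data model were more relational, but it isn't. The generated query doesn't give the database much help in figuring out the best way to execute it, so it does a poor job. Maybe you could mitigate this with hints and the like, but I bet that would get tiresome in the long run.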

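You are using an ORM tool, so really use it. You gain nothing by doing such a large query in one go. Split it up a bit for the sake of performance. Use lazy retrieval to avoid large, complex queries. I would suggest trying to do everything lazily, just to see how it goes. Performance will probably be OK; I'd say better. Not great, maybe not even acceptable, but better than being able to go and get a coffee while the database churns.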

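Then start piecing things together into leaner chunks. Tie together objects that logically belong together, such as resource and contents_resources. As another example, the joins between stream_item, likers, and user are repeated. Make that one query and let SQLAlchemy do its thing.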
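One way to picture "leaner chunks" is sketched below in plain Python with sqlite3, under an invented three-table schema (stream_item, likers, and user with only a few columns; the data and the load_stream helper are made up for illustration). Instead of one giant join, it runs one query for the stream items of a parent and a second, batched query for all of their likers, then stitches the results together in Python. This is roughly the idea behind SQLAlchemy's subqueryload and selectinload loader options, although the SQL those options actually emit differs from this hand-written version.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER,
                              stream_sort_at TEXT);
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
    INSERT INTO stream_item VALUES (21, 11, '2012-07-01'), (22, 11, '2012-07-02'),
                                   (23, 99, '2012-07-03');
    INSERT INTO "user" VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO likers VALUES (21, 1), (22, 1), (22, 2);
""")

def load_stream(conn, parent_id):
    # Query 1: only the stream items for this parent, newest first.
    items = conn.execute(
        "SELECT id, stream_sort_at FROM stream_item "
        "WHERE parent_id = ? ORDER BY stream_sort_at DESC",
        (parent_id,),
    ).fetchall()
    if not items:
        return []

    # Query 2: likers for all of those items at once
    # (a single IN query, not one query per item).
    ids = [item_id for item_id, _ in items]
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        'SELECT likers.post_id, "user".first_name FROM likers '
        'JOIN "user" ON "user".id = likers.user_id '
        'WHERE likers.post_id IN ({}) ORDER BY "user".id'
    ).format(placeholders)
    likers = defaultdict(list)
    for post_id, first_name in conn.execute(sql, ids):
        likers[post_id].append(first_name)

    # Stitch the two result sets together in Python.
    return [{"id": i, "sort_at": s, "likers": likers[i]} for i, s in items]

stream = load_stream(conn, 11)
print(stream)
```

Each query here is simple enough for the database to answer from an index on parent_id or post_id, which is the trade being suggested: a few more round trips in exchange for queries the optimizer can handle well.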

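As a last resort, you could implement some kind of caching mechanism, or perhaps denormalize the tables somewhere. On a slowly changing, read-heavy system, you could feed these tables into another structure where the query is straightforward and fast. That is, do the processing up front and store the result in a single table.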
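As a rough illustration of the denormalization idea, the sketch below again uses sqlite3 with a made-up two-table schema. The stream_cache table and the rebuild_stream_cache and read_stream helpers are hypothetical and not part of the original application. The join is done once, up front, and stored in a single table, so the read path becomes a simple indexed lookup. How and when to refresh such a table depends on how often the underlying data changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER,
                              stream_sort_at TEXT);
    CREATE TABLE content_text_block (id INTEGER PRIMARY KEY,
                                     franklin_object_id INTEGER,
                                     position INTEGER, text TEXT);
    INSERT INTO stream_item VALUES (21, 11, '2012-07-01'), (22, 11, '2012-07-02');
    INSERT INTO content_text_block VALUES (31, 21, 0, 'first post'),
                                          (32, 22, 0, 'second post, part 1'),
                                          (33, 22, 1, 'second post, part 2');
    -- Hypothetical denormalized table: one row per text block, ready to read.
    CREATE TABLE stream_cache (parent_id INTEGER, stream_item_id INTEGER,
                               stream_sort_at TEXT, position INTEGER, text TEXT);
    CREATE INDEX stream_cache_parent ON stream_cache (parent_id);
""")

def rebuild_stream_cache(conn):
    # Do the join once, up front, and store the flattened result.
    with conn:
        conn.execute("DELETE FROM stream_cache")
        conn.execute("""
            INSERT INTO stream_cache
            SELECT s.parent_id, s.id, s.stream_sort_at, c.position, c.text
            FROM stream_item AS s
            JOIN content_text_block AS c ON c.franklin_object_id = s.id
        """)

def read_stream(conn, parent_id):
    # The read path touches a single table and needs no joins.
    return conn.execute(
        "SELECT stream_item_id, text FROM stream_cache WHERE parent_id = ? "
        "ORDER BY stream_sort_at DESC, position",
        (parent_id,),
    ).fetchall()

rebuild_stream_cache(conn)
cached = read_stream(conn, 11)
print(cached)
```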

Good luck.