在sqlalchemy中选择加入会产生太多行

时间:2015-02-18 13:04:14

标签: python sql postgresql sqlalchemy

我正在尝试构建一个复合SQL查询,该查询根据我之前执行的连接构建表。 (使用SqlAlchemy(核心部分)与python3和Postgresql 9.4)

我在这里包含了我的python3代码的相关部分。我首先使用带有group_by的select创建“in_uuid_set”。然后我用“in_off_messages”加入“in_uuid_set”来获得“jn_in”。 最后,我尝试通过选择和生成所需列来从“jn_in”构建一个新表“incoming”:

in_uuid_set = \
    sa.select([in_off_messages.c.src_uuid.label('remote_uuid')])\
    .select_from(in_off_messages)\
    .where(in_off_messages.c.dst_uuid == local_uuid)\
    .group_by(in_off_messages.c.src_uuid)\
    .alias()


jn_in = in_uuid_set.join(in_off_messages,\
    and_(\
        in_off_messages.c.src_uuid == in_uuid_set.c.remote_uuid,\
        in_off_messages.c.dst_uuid == local_uuid,\
        ))\
    .alias()


incoming = sa.select([\
    in_off_messages.c.msg_uuid.label('msg_uuid'),\
    in_uuid_set.c.remote_uuid.label('remote_uuid'),\
    in_off_messages.c.msg_type.label('msg_type'),\
    in_off_messages.c.date_sent.label('date_sent'),\
    in_off_messages.c.content.label('content'),\
    in_off_messages.c.was_read.label('was_read'),\
    true().label('is_incoming')]
    )\
    .select_from(jn_in)

令人惊讶的是,我认为“传入”的行数多于“jn_in”。 “incoming”有12行,而“jn_in”只有2行。我希望“传入”将具有与“jn_in”相同的行数(2)。

我还在这里包含了SqlAlchemy为“incoming”生成的SQL输出:

SELECT in_off_messages.msg_uuid  AS msg_uuid,
       anon_1.remote_uuid        AS remote_uuid,
       in_off_messages.msg_type  AS msg_type,
       in_off_messages.date_sent AS date_sent,
       in_off_messages.content   AS content,
       in_off_messages.was_read  AS was_read,
       1                         AS is_incoming
FROM   in_off_messages,
       (SELECT in_off_messages.src_uuid AS remote_uuid
        FROM   in_off_messages
        WHERE  in_off_messages.dst_uuid = :dst_uuid_1
        GROUP  BY in_off_messages.src_uuid) AS anon_1,
       (SELECT anon_1.remote_uuid            AS anon_1_remote_uuid,
               in_off_messages.msg_uuid      AS in_off_messages_msg_uuid,
               in_off_messages.orig_src_uuid AS in_off_messages_orig_src_uuid,
               in_off_messages.src_uuid      AS in_off_messages_src_uuid,
               in_off_messages.dst_uuid      AS in_off_messages_dst_uuid,
               in_off_messages.msg_type      AS in_off_messages_msg_type,
               in_off_messages.date_sent     AS in_off_messages_date_sent,
               in_off_messages.content       AS in_off_messages_content,
               in_off_messages.was_read      AS in_off_messages_was_read
        FROM   (SELECT in_off_messages.src_uuid AS remote_uuid
                FROM   in_off_messages
                WHERE  in_off_messages.dst_uuid = :dst_uuid_1
                GROUP  BY in_off_messages.src_uuid) AS anon_1
               JOIN in_off_messages
                 ON in_off_messages.src_uuid = anon_1.remote_uuid
                    AND in_off_messages.dst_uuid = :dst_uuid_2) AS anon_2 

使用此SQL输出时,某些内容对我来说不合适,主要是因为我看到GROUP BY的次数太多了。我原以为它会出现一次,但似乎它在这里出现了两次。

我的猜测是,某些支撑不合适(在生成的SQL中)。我也怀疑我做了别名()事情的错误,虽然我不确定。

我该怎么做才能获得想要的结果(“jn_in”和“incoming”的行数相同)?

1 个答案:

答案 0 :(得分:0)

在玩了一段时间的代码后,我找到了解决问题的方法。 答案最终与别名()有关。 为了使这个工作,第二个别名()(Of jn_in)应该省略,如下所示:

in_uuid_set = \
    sa.select([in_off_messages.c.src_uuid.label('remote_uuid')])\
    .select_from(in_off_messages)\
    .where(in_off_messages.c.dst_uuid == local_uuid)\
    .group_by(in_off_messages.c.src_uuid)\
    .alias()


jn_in = in_uuid_set.join(in_off_messages,\
    and_(\
        in_off_messages.c.src_uuid == in_uuid_set.c.remote_uuid,\
        in_off_messages.c.dst_uuid == local_uuid,\
        ))
# <<< The alias() is gone >>>



incoming = sa.select([\
    in_off_messages.c.msg_uuid.label('msg_uuid'),\
    in_uuid_set.c.remote_uuid.label('remote_uuid'),\
    in_off_messages.c.msg_type.label('msg_type'),\
    in_off_messages.c.date_sent.label('date_sent'),\
    in_off_messages.c.content.label('content'),\
    in_off_messages.c.was_read.label('was_read'),\
    true().label('is_incoming')]
    )\
    .select_from(jn_in)

然而,似乎无法省略第一个别名()(in_uuid_set)。如果我尝试省略它,我收到此错误消息:

E               subquery in FROM must have an alias
E               LINE 2: FROM (SELECT in_off_messages.src_uuid AS remote_uuid 
E                            ^
E               HINT:  For example, FROM (SELECT ...) [AS] foo.

作为对此的概括,可能如果你有一个你想要作为子句放在其他地方的select,那么你想要别名()它,但是如果你有一个你想要作为子句放入的连接,你不应该别名()它。

为了完整起见,我在这里包含了新代码的结果SQL:

SELECT in_off_messages.msg_uuid  AS msg_uuid,
       anon_1.remote_uuid        AS remote_uuid,
       in_off_messages.msg_type  AS msg_type,
       in_off_messages.date_sent AS date_sent,
       in_off_messages.content   AS content,
       in_off_messages.was_read  AS was_read,
       1                         AS is_incoming
FROM   (SELECT in_off_messages.src_uuid AS remote_uuid
        FROM   in_off_messages
        WHERE  in_off_messages.dst_uuid = :dst_uuid_1
        GROUP  BY in_off_messages.src_uuid) AS anon_1
       JOIN in_off_messages
         ON in_off_messages.src_uuid = anon_1.remote_uuid
            AND in_off_messages.dst_uuid = :dst_uuid_2  

比问题上的短得多。