How do I count over a many-to-many relationship in SQLAlchemy?

Asked: 2014-04-05 05:24:36

Tags: python sqlalchemy flask-sqlalchemy

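I want to count over a many-to-many relationship on a table.

My starting point is the (excellent) answer in this thread. In my case, however, a third table is involved, and I need my backrefs to be 'joined' rather than 'dynamic'.

When I try to apply the solution from that question, the combination of these two changes causes a ProgrammingError.

Here is my setup. I have the same many-to-many relationship between User and Post. There is also a one-to-many relationship between Post and Heading. What I want is a list of all posts, ordered by the number of users who liked each post.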

from flask import Flask
from flask_sqlalchemy import SQLAlchemy  # the flask.ext.* namespace was removed in Flask 1.0
from sqlalchemy.orm import relationship, backref
from sqlalchemy import func

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgres://xxx'
db = SQLAlchemy(app)


likes = db.Table('likes',
                 db.Column('user_id', db.Integer, db.ForeignKey('user.id')),
                 db.Column('post_id', db.Integer, db.ForeignKey('post.id')))


class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(20))

    def __repr__(self):
        return "<User('%s')>" % self.username


class Post(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(255))
    likes = db.relationship('User', secondary=likes,
                            backref=db.backref('posts', lazy='joined'))

    def __repr__(self):
        return "<Post('%s')>" % self.title


class Heading(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(255))

    post_id = db.Column(db.Integer, db.ForeignKey('post.id'))
    post = relationship('Post', backref=backref('headings', lazy='joined'))

    def __repr__(self):
        return "<Heading('%s')>" % self.name

# Three users
user1 = User(username='user1')
user2 = User(username='user2')
user3 = User(username='user3')

# Two posts
post1 = Post(title='post1')
heading1 = Heading(name='heading1')
heading2 = Heading(name='heading2')

post1.headings = [heading1, heading2]
post1.likes = [user1, user2]

post2 = Post(title='post2')
heading3 = Heading(name='heading3')
post2.headings = [heading3]
post2.likes = [user3]

db.create_all()
db.session.add_all([user1, user2, user3, post1, post2, heading1, heading2, heading3])
db.session.commit()

Now, if I try the solution from the other question:

In [3]: db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post).order_by('total DESC').all()
---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
<ipython-input-3-b804fb4b0893> in <module>()
----> 1 db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post).order_by('total DESC').all()

ProgrammingError: (ProgrammingError) column "heading_1.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...tle AS post_title, count(likes.user_id) AS total, heading_1....
                                                             ^
 'SELECT post.id AS post_id, post.title AS post_title, count(likes.user_id) AS total, heading_1.id AS heading_1_id, heading_1.name AS heading_1_name, heading_1.post_id AS heading_1_post_id \nFROM post JOIN likes ON post.id = likes.post_id LEFT OUTER JOIN heading AS heading_1 ON post.id = heading_1.post_id GROUP BY post.id, post.title ORDER BY total DESC' {}

Naively (I am new to both SQL and SQLAlchemy), I then tried adding Heading to my group_by:

In [6]: db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post, Heading).order_by('total DESC').all()
---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
<ipython-input-6-8efd6a8314f8> in <module>()
----> 1 db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post, Heading).order_by('total DESC').all()

ProgrammingError: (ProgrammingError) invalid reference to FROM-clause entry for table "heading"
LINE 2: ...= heading_1.post_id GROUP BY post.id, post.title, heading.id...
                                                             ^
HINT:  Perhaps you meant to reference the table alias "heading_1".
 'SELECT post.id AS post_id, post.title AS post_title, count(likes.user_id) AS total, heading_1.id AS heading_1_id, heading_1.name AS heading_1_name, heading_1.post_id AS heading_1_post_id \nFROM post JOIN likes ON post.id = likes.post_id LEFT OUTER JOIN heading AS heading_1 ON post.id = heading_1.post_id GROUP BY post.id, post.title, heading.id, heading.name, heading.post_id ORDER BY total DESC' {}

How can I get the query I want to work? I am probably doing something very silly.

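1 answer:

Answer 0 (score: 3)

As you mentioned, the real reason the solution above does not work is the joinedload of the headings. There are a few things you can do:

Option-1: disable joinedload for this query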

from sqlalchemy.orm import lazyload

q = (db.session.query(Post, func.count(likes.c.user_id).label("total"))
        .options(lazyload(Post.headings))  # disable the joined eager load for this query
        .join(likes).group_by(Post).order_by(db.desc("total"))
    )

If you do need the headings, you can use subqueryload instead:

from sqlalchemy.orm import subqueryload

q = (db.session.query(Post, func.count(likes.c.user_id).label("total"))
        .options(subqueryload(Post.headings))  # load headings in a separate query
        .join(likes).group_by(Post).order_by(db.desc("total"))
    )
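
To see why keeping the headings join out of the counting query matters, here is a minimal SQL-level sketch. It uses Python's built-in sqlite3 rather than SQLAlchemy and PostgreSQL, which is an assumption made so that it runs without a database server. The table layout mirrors the question's models and the rows are the question's sample data. SQLite is more permissive about GROUP BY than PostgreSQL, so this does not reproduce the ProgrammingError; it only shows that joining headings into the counting query multiplies the rows being counted.

```python
import sqlite3

# In-memory copy of the question's tables and sample data (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE likes (user_id INTEGER, post_id INTEGER);
    CREATE TABLE heading (id INTEGER PRIMARY KEY, name TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO heading VALUES (1, 'heading1', 1), (2, 'heading2', 1), (3, 'heading3', 2);
""")

# Counting likes while headings are joined in: every like is repeated per heading.
with_headings = conn.execute("""
    SELECT post.title, count(likes.user_id) AS total
    FROM post
    JOIN likes ON post.id = likes.post_id
    LEFT OUTER JOIN heading ON post.id = heading.post_id
    GROUP BY post.id
    ORDER BY total DESC
""").fetchall()

# Counting likes without the headings join: one row per like.
without_headings = conn.execute("""
    SELECT post.title, count(likes.user_id) AS total
    FROM post
    JOIN likes ON post.id = likes.post_id
    GROUP BY post.id
    ORDER BY total DESC
""").fetchall()

print(with_headings)     # post1 is counted once per (like, heading) pair
print(without_headings)  # post1 is counted once per like
```

In this data post1 has two likes and two headings, so the joined query counts 2 x 2 = 4 rows for it, while the query without headings reports the expected 2.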

Option-2: use a subquery

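# Aggregate the likes per post in a subquery; outerjoin keeps posts without likes (count 0).
subq = (db.session.query(Post.id.label("post_id"),
            func.count(likes.c.user_id).label("num_likes"))
            .outerjoin(likes).group_by(Post.id)
        ).subquery("subq")

# The outer query needs no GROUP BY, so the eagerly joined headings cause no conflict.
q = (db.session.query(Post, subq.c.num_likes)
        .join(subq, Post.id == subq.c.post_id)
        .order_by(subq.c.num_likes.desc())
    )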
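
The shape of the subquery approach can be sketched in plain SQL. The example below is only an approximation of what the ORM would emit, written by hand and run against in-memory SQLite through the standard sqlite3 module (an assumption, chosen so it runs anywhere). The rows are the question's sample data plus a hypothetical post3 with no likes and no headings, added to show the effect of the outer join inside the subquery.

```python
import sqlite3

# In-memory copy of the question's tables; post3 is a hypothetical extra row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE likes (user_id INTEGER, post_id INTEGER);
    CREATE TABLE heading (id INTEGER PRIMARY KEY, name TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2'), (3, 'post3');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO heading VALUES (1, 'heading1', 1), (2, 'heading2', 1), (3, 'heading3', 2);
""")

# The likes are aggregated inside the subquery, so the headings join in the
# outer query repeats the finished count instead of inflating it.
rows = conn.execute("""
    SELECT post.title, subq.num_likes, heading.name
    FROM post
    JOIN (
        SELECT post.id AS post_id, count(likes.user_id) AS num_likes
        FROM post
        LEFT OUTER JOIN likes ON post.id = likes.post_id
        GROUP BY post.id
    ) AS subq ON post.id = subq.post_id
    LEFT OUTER JOIN heading ON post.id = heading.post_id
    ORDER BY subq.num_likes DESC, heading.id
""").fetchall()

for row in rows:
    print(row)
```

Each post keeps its own like count on every heading row, and the post without likes still appears with a count of 0 because the subquery uses an outer join.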

Option-3: use Hybrid Attributes

This may not be the most efficient option, but it keeps the queries clean.

Add the following to the Post model:

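from sqlalchemy.ext.hybrid import hybrid_property

class Post(db.Model):

    # ...

    @hybrid_property
    def num_likes(self):
        # Instance level: count the loaded likes in Python.
        return len(self.likes)

    # Reuse the hybrid's name so the expression is attached to Post.num_likes.
    @num_likes.expression
    def num_likes(cls):
        # Class level: a correlated scalar subquery counting this post's likes.
        # (SQLAlchemy 1.4+ takes the columns positionally instead of as a list.)
        return (db.select([db.func.count(likes.c.user_id).label("num_likes")])
                .where(likes.c.post_id == cls.id)
                .label("total_likes")
                )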

Later you can compose queries in a clean way:

q = db.session.query(Post, Post.num_likes).order_by(Post.num_likes.desc())
# use this if you need the data sorted but do not care how many likes there are
#q = db.session.query(Post).order_by(Post.num_likes.desc())
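
At the class level the hybrid expression is a correlated scalar subquery. As a rough illustration of that idea, here is a hand-written SQL equivalent run against in-memory SQLite with the standard sqlite3 module; this is an assumed approximation, not the exact statement SQLAlchemy renders. The tables and rows mirror the question's sample data.

```python
import sqlite3

# In-memory copy of the question's post and likes tables (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE likes (user_id INTEGER, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 2);
""")

# The inner SELECT is evaluated per post row and refers to the outer post.id,
# so no JOIN or GROUP BY is needed in the outer query.
rows = conn.execute("""
    SELECT post.title,
           (SELECT count(likes.user_id)
            FROM likes
            WHERE likes.post_id = post.id) AS total_likes
    FROM post
    ORDER BY total_likes DESC
""").fetchall()

print(rows)
```

Because the count lives in its own scalar subquery, other joins added to the outer query (such as eagerly loaded headings) would not change it, at the cost of one subquery evaluation per post.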