SQLAlchemy: joining two models and selecting the top row

Date: 2015-09-16 21:54:41

Tags: python flask sqlalchemy flask-sqlalchemy

I currently have two tables: servers and scans.
One server can have many scans (a one-to-many relationship).

What I want to achieve is to select a server and then select only the first scan associated with that server. The following query:

query = db.session.query(models.Server, models.Scan).outerjoin(models.Server.scans).all()

outputs:

(<Server u'Testing'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>)
(<Server u'Testing'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>)
(<Server u'Testing'>, <Scan u'testscan'>)
(<Server u'fasd'>, <Scan u'testscan'>)
(<Server u'fdaafas'>, None)

whereas I want only a single "Testing" server together with its most recent scan.

Addendum

When I loop over the query like this:

for a in query:
    print a, a.scans.all()

the output is:

<Server u'Testing'> [<Scan u'testscan'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>]
<Server u'fasd'> [<Scan u'testscan'>]
<Server u'fdaafas'> []

The output I want would be:

<Server u'Testing'> [<Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>]
<Server u'fasd'> [<Scan u'testscan'>]
<Server u'fdaafas'> []

1 Answer:

Answer 0: (score: 0)

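You need to add a subquery with some condition that selects which Scan record you want to show. For the toy example below, I assume you want the maximum value of some attribute.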

I have created the tables A and B; A corresponds to Server and B to Scan.

In [2]:

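from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

# Declarative base shared by both models (setup assumed from an earlier cell).
Base = declarative_base()

class A(Base):
    __tablename__ = 'A'

    pk = Column('pk', Integer, primary_key=True)
    name = Column('name', String)

class B(Base):
    __tablename__ = 'B'

    pk = Column('pk', Integer, primary_key=True)
    fk = Column('fk', Integer, ForeignKey('A.pk'))  # points at the owning A row
    attr = Column('attr', Integer)

    a = relationship("A", backref='B')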

After inserting some data, the tables contain:

In [10]:

q = session.query(B)
print(q)
for x in q.all():
    print(x.pk, x.fk, x.attr)

q = session.query(A)
print(q)
for x in q.all():
    print(x.pk, x.name)

SELECT "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr" 
FROM "B"
1 1 1
2 1 2
3 2 0
4 2 4
5 1 4
SELECT "A".pk AS "A_pk", "A".name AS "A_name" 
FROM "A"
1 one
2 two

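To solve your problem, you add a subquery that selects the maximum B.attr for each B.fk, i.e. for each A.pk. (In your example, that would be the maximum Scan.attr for each Server.)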

In [13]:


from sqlalchemy import func
from sqlalchemy import tuple_

s = session.query(func.max(B.attr), B.fk).group_by(B.fk)
print(s)
q = session.query(A, B).outerjoin(B).filter(tuple_(B.attr, B.fk).in_(s))
print(q)
for x in q.all():
    print(x.A.pk, x.A.name, x.B.pk, x.B.attr)

SELECT max("B".attr) AS max_1, "B".fk AS "B_fk" 
FROM "B" GROUP BY "B".fk
SELECT "A".pk AS "A_pk", "A".name AS "A_name", "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr" 
FROM "A" LEFT OUTER JOIN "B" ON "A".pk = "B".fk 
WHERE ("B".attr, "B".fk) IN (SELECT max("B".attr) AS max_1, "B".fk AS "B_fk" 
FROM "B" GROUP BY "B".fk)
2 two 4 4
1 one 5 4
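The same selection can also be written as a join against the grouped subquery rather than a tuple IN. The sketch below is only an illustration: it uses raw SQL through Python's standard sqlite3 module instead of SQLAlchemy, it reuses the toy data shown above, and the alias m and the column name max_attr are names made up for this example.

```python
import sqlite3

# Toy data mirroring the A/B tables above (A ~ Server, B ~ Scan).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (pk INTEGER PRIMARY KEY, fk INTEGER, attr INTEGER);
    INSERT INTO A VALUES (1, 'one'), (2, 'two');
    INSERT INTO B VALUES (1, 1, 1), (2, 1, 2), (3, 2, 0), (4, 2, 4), (5, 1, 4);
""")

# Join each B row against the per-fk maximum of attr, so only the
# row holding that maximum survives for every A.
rows = conn.execute("""
    SELECT A.pk, A.name, B.pk, B.attr
    FROM A
    JOIN B ON B.fk = A.pk
    JOIN (SELECT fk, MAX(attr) AS max_attr FROM B GROUP BY fk) AS m
      ON m.fk = B.fk AND m.max_attr = B.attr
    ORDER BY A.pk
""").fetchall()

for row in rows:
    print(row)
```

Because the subquery is matched on both fk and the maximum, this form does not rely on multi-column IN support in the database.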

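Note: you didn't mention which database you are using, but just in case, be aware that in_ over multiple columns did not work on sqlite when this was written (which is annoying when you try it); row-value comparisons were only added in SQLite 3.15.0. It does work, however, if you use only one column, e.g.,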

s = session.query(func.max(B.attr)).group_by(B.fk)
q = session.query(A, B).outerjoin(B).filter(B.attr.in_(s))

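But then, depending on your data, you can get more than one B for each A. For example, suppose max(B.attr) = 3 for B.fk = 1 and max(B.attr) = 4 for B.fk = 2, but there is also a row with B.attr = 3 and B.fk = 2: for B.fk = 2 you then get both the B.attr = 3 row and the B.attr = 4 row.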
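The pitfall is easy to reproduce. The following is a minimal sketch with raw SQL via the standard sqlite3 module (not SQLAlchemy), using three made-up B rows chosen to match the scenario just described; the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B (pk INTEGER PRIMARY KEY, fk INTEGER, attr INTEGER);
    -- max(attr) is 3 for fk = 1 and 4 for fk = 2,
    -- but fk = 2 also owns a row with attr = 3.
    INSERT INTO B VALUES (1, 1, 3), (2, 2, 3), (3, 2, 4);
""")

# Single-column IN: compares attr against the set {3, 4} with no regard to fk.
single_column = conn.execute("""
    SELECT fk, attr FROM B
    WHERE attr IN (SELECT MAX(attr) FROM B GROUP BY fk)
    ORDER BY fk, attr
""").fetchall()

# Matching on (fk, max) together keeps exactly one row per fk.
paired = conn.execute("""
    SELECT B.fk, B.attr FROM B
    JOIN (SELECT fk, MAX(attr) AS max_attr FROM B GROUP BY fk) AS m
      ON m.fk = B.fk AND m.max_attr = B.attr
    ORDER BY B.fk
""").fetchall()

print(single_column)  # fk = 2 appears twice
print(paired)         # one row per fk
```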

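However, if the attribute you use to select the maximum is unique, there is no problem. In any case, if you use a database such as postgres or oracle, you can use in_ with multiple columns.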

Hope it helps.

EDIT added after the comments: if you also want to get the Servers that have no Scan, you only need to add an or_ to the query:

In [18]:

from sqlalchemy import func
from sqlalchemy import tuple_
from sqlalchemy import or_

s = session.query(func.max(B.attr), B.fk).group_by(B.fk)
q = session.query(A, B).outerjoin(B).filter(or_(tuple_(B.attr, B.fk).in_(s), B.fk==None))
print(q)
for x in q.all():
    if x.B:
        print(x.A.pk, x.A.name, x.B.pk, x.B.attr)
    else:
        print(x.A.pk, x.A.name)

SELECT "A".pk AS "A_pk", "A".name AS "A_name", "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr" 
FROM "A" LEFT OUTER JOIN "B" ON "A".pk = "B".fk 
WHERE ("B".attr, "B".fk) IN (SELECT max("B".attr) AS max_1, "B".fk AS "B_fk" 
FROM "B" GROUP BY "B".fk) OR "B".fk IS NULL
2 two 4 4
1 one 5 4
3 three

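As you can see, you have to be careful with null values. Note that outerjoin already performs the left join, which is what you need, but because of the filter you have to state explicitly that you also want the null rows. As before, A is Server and B is Scan. Sorry for not using your table names, which makes this harder to read.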
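To see why the explicit null check matters, here is a small sketch, again with raw SQL through the standard sqlite3 module rather than SQLAlchemy. It is an assumption-laden illustration: it swaps the tuple IN for a correlated scalar subquery (so it does not depend on row-value support), adds a third A row without any B rows, and uses the made-up alias b2 and variable names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (pk INTEGER PRIMARY KEY, fk INTEGER, attr INTEGER);
    INSERT INTO A VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO B VALUES (1, 1, 1), (2, 1, 2), (3, 2, 0), (4, 2, 4), (5, 1, 4);
""")

base = """
    SELECT A.pk, A.name, B.pk, B.attr
    FROM A LEFT OUTER JOIN B ON A.pk = B.fk
    WHERE {condition}
    ORDER BY A.pk
"""
# Keep only the B row whose attr equals the maximum for its A.
is_max = "B.attr = (SELECT MAX(b2.attr) FROM B AS b2 WHERE b2.fk = A.pk)"

# The filter alone compares NULL = NULL for 'three', which is not true,
# so the row produced by the left join is discarded again.
without_null_check = conn.execute(base.format(condition=is_max)).fetchall()

# Explicitly allowing rows with no B keeps 'three' in the result.
with_null_check = conn.execute(
    base.format(condition=is_max + " OR B.fk IS NULL")
).fetchall()

print(without_null_check)
print(with_null_check)
```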