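I currently have two tables: servers and scans.
A server can have many scans (a one-to-many relationship).
What I want is to select each server together with only the first scan associated with it. The following query: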
query = db.session.query(models.Server, models.Scan).outerjoin(models.Server.scans).all()
Output:
(<Server u'Testing'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>)
(<Server u'Testing'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>)
(<Server u'Testing'>, <Scan u'testscan'>)
(<Server u'fasd'>, <Scan u'testscan'>)
(<Server u'fdaafas'>, None)
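whereas I only want a single "Testing" server together with its most recent scan.
Additional information:
When I loop over the query like this: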
for a in query:
    print a, a.scans.all()
The output is:
<Server u'Testing'> [<Scan u'testscan'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>]
<Server u'fasd'> [<Scan u'testscan'>]
<Server u'fdaafas'> []
The output I want is:
<Server u'Testing'> [<Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>]
<Server u'fasd'> [<Scan u'testscan'>]
<Server u'fdaafas'> []
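Answer 0 (score: 0)
You need to add a subquery with some criterion that selects which Scan record to show. For the toy example below, I assume you want the maximum of some attribute.
I created tables A and B; A corresponds to Server and B to Scan.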
In [2]:
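# SQLAlchemy 1.4+; older versions import declarative_base from sqlalchemy.ext.declarative
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class A(Base):  # plays the role of Server
    __tablename__ = 'A'
    pk = Column('pk', Integer, primary_key=True)
    name = Column('name', String)

class B(Base):  # plays the role of Scan
    __tablename__ = 'B'
    pk = Column('pk', Integer, primary_key=True)
    fk = Column('fk', Integer, ForeignKey('A.pk'))
    attr = Column('attr', Integer)
    a = relationship("A", backref='B')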
After inserting some data,
In [10]:
q = session.query(B)
print(q)
for x in q.all():
    print(x.pk, x.fk, x.attr)
q = session.query(A)
print(q)
for x in q.all():
    print(x.pk, x.name)
SELECT "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr"
FROM "B"
1 1 1
2 1 2
3 2 0
4 2 4
5 1 4
SELECT "A".pk AS "A_pk", "A".name AS "A_name"
FROM "A"
1 one
2 two
To solve your problem, add a subquery that selects the maximum B.attr for each B.fk, i.e. for each A.pk. (In your case, it would be the maximum Scan.attr for each Server.)
In [13]:
from sqlalchemy import func
from sqlalchemy import tuple_
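# Subquery: the largest attr for each fk (one row per A)
s = session.query(func.max(B.attr), B.fk).group_by(B.fk)
print(s)
# Keep only the rows whose (attr, fk) pair is returned by the subquery
q = session.query(A, B).outerjoin(B).filter(tuple_(B.attr, B.fk).in_(s))
print(q)
for x in q.all():
    print(x.A.pk, x.A.name, x.B.pk, x.B.attr)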
SELECT max("B".attr) AS max_1, "B".fk AS "B_fk"
FROM "B" GROUP BY "B".fk
SELECT "A".pk AS "A_pk", "A".name AS "A_name", "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr"
FROM "A" LEFT OUTER JOIN "B" ON "A".pk = "B".fk
WHERE ("B".attr, "B".fk) IN (SELECT max("B".attr) AS max_1, "B".fk AS "B_fk"
FROM "B" GROUP BY "B".fk)
2 two 4 4
1 one 5 4
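Note: you didn't mention which database you are using, but just in case: in_ with multiple columns may not work on sqlite (versions before 3.15 do not support row values, which is annoying when you are trying things out). It does work if you use only one column, for example,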
s = session.query(func.max(B.attr)).group_by(B.fk)
q = session.query(A, B).outerjoin(B).filter(B.attr.in_(s))
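But then, depending on your data, you can get more than one B per A. For example, if max(B.attr) for B.fk = 1 is 3 and max(B.attr) for B.fk = 2 is 4, but there is also a row with B.attr = 3 and B.fk = 2, then for B.fk = 2 you get both the row with B.attr = 3 and the row with B.attr = 4.
However, if the attribute you use to select the maximum is unique, this is not a problem. In any case, with a database such as postgres or oracle you can use in_ with multiple columns.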
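To make the tie problem concrete, here is a small sketch that uses Python's standard-library sqlite3 module and raw SQL instead of SQLAlchemy. The table contents are made up to match the hypothetical case described above (they are not the data from the earlier cells), and the alias m and the column max_attr are names chosen only for illustration. It first shows the extra row returned by the single-column IN, then a join against the grouped subquery, which pairs each maximum with its own fk without needing a multi-column IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (pk INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE B (pk INTEGER PRIMARY KEY, fk INTEGER REFERENCES A(pk), attr INTEGER);
INSERT INTO A VALUES (1, 'one'), (2, 'two');
-- fk = 1 has max(attr) = 3; fk = 2 has max(attr) = 4 but also a row with attr = 3
INSERT INTO B VALUES (1, 1, 1), (2, 1, 3), (3, 2, 3), (4, 2, 4);
""")

# Single-column IN: the maxima of all groups are pooled, so attr = 3
# also matches the non-maximal row of fk = 2.
single_column = conn.execute("""
SELECT A.pk, B.pk, B.attr
FROM A LEFT OUTER JOIN B ON A.pk = B.fk
WHERE B.attr IN (SELECT max(attr) FROM B GROUP BY fk)
ORDER BY A.pk, B.pk
""").fetchall()

# Join against the grouped subquery: each maximum is compared only
# with rows of the same fk.
joined = conn.execute("""
SELECT A.pk, B.pk, B.attr
FROM A
JOIN (SELECT fk, max(attr) AS max_attr FROM B GROUP BY fk) AS m ON m.fk = A.pk
JOIN B ON B.fk = m.fk AND B.attr = m.max_attr
ORDER BY A.pk
""").fetchall()

print(single_column)  # A.pk = 2 appears twice
print(joined)         # one row per A
```

If two rows with the same fk share the maximum attr, the join still returns both of them, so a unique attribute (or an extra tie-breaker) is still needed to guarantee exactly one row per A.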
Hope it helps.
EDIT added after the comments:
If you also want to get the Servers that have no Scan, just add an or_ to the query.
In [18]:
from sqlalchemy import func
from sqlalchemy import tuple_
from sqlalchemy import or_
s = session.query(func.max(B.attr), B.fk).group_by(B.fk)
# The extra IS NULL condition keeps the A rows that have no B at all
q = session.query(A, B).outerjoin(B).filter(or_(tuple_(B.attr, B.fk).in_(s), B.fk.is_(None)))
print(q)
for x in q.all():
    if x.B:
        print(x.A.pk, x.A.name, x.B.pk, x.B.attr)
    else:
        print(x.A.pk, x.A.name)
SELECT "A".pk AS "A_pk", "A".name AS "A_name", "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr"
FROM "A" LEFT OUTER JOIN "B" ON "A".pk = "B".fk
WHERE ("B".attr, "B".fk) IN (SELECT max("B".attr) AS max_1, "B".fk AS "B_fk"
FROM "B" GROUP BY "B".fk) OR "B".fk IS NULL
2 two 4 4
1 one 5 4
3 three
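As you can see, you have to be careful with null values. Note that outerjoin already performs a left join, which is what you need, but because of the filter you have to state explicitly that you also want the rows with nulls. As before, A is Server and B is Scan. Sorry for not using your table names; it makes this harder to read.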
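For completeness, here is a self-contained sketch of an alternative that avoids both the multi-column in_ and the explicit or_: join A to the grouped subquery and then to B, keeping every condition in the ON clauses of outer joins, so that A rows without a B come back as (A, None). It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the names latest and max_attr are my own choices for illustration, and the rows are the toy data shown in the outputs above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, and_, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class A(Base):  # plays the role of Server
    __tablename__ = "A"
    pk = Column(Integer, primary_key=True)
    name = Column(String)

class B(Base):  # plays the role of Scan
    __tablename__ = "B"
    pk = Column(Integer, primary_key=True)
    fk = Column(Integer, ForeignKey("A.pk"))
    attr = Column(Integer)

engine = create_engine("sqlite://")  # in-memory database (assumption)
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([A(pk=1, name="one"), A(pk=2, name="two"), A(pk=3, name="three")])
    session.add_all([
        B(pk=1, fk=1, attr=1), B(pk=2, fk=1, attr=2), B(pk=3, fk=2, attr=0),
        B(pk=4, fk=2, attr=4), B(pk=5, fk=1, attr=4),
    ])
    session.commit()

    # One row per fk carrying the largest attr of that group
    latest = (
        session.query(B.fk.label("fk"), func.max(B.attr).label("max_attr"))
        .group_by(B.fk)
        .subquery()
    )
    # Both joins are outer joins and no WHERE clause is used,
    # so an A without any B is kept and paired with None.
    q = (
        session.query(A, B)
        .select_from(A)
        .outerjoin(latest, latest.c.fk == A.pk)
        .outerjoin(B, and_(B.fk == latest.c.fk, B.attr == latest.c.max_attr))
        .order_by(A.pk)
    )
    rows = [(a.pk, a.name, b.pk if b else None, b.attr if b else None)
            for a, b in q.all()]

print(rows)
```

As with the in_ version, two B rows that tie on the maximum attr for the same A would both be returned, so this relies on the attribute being unique within each group.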