I have two tables (foo and bar_stats, each with more than 40,000 rows), one referencing the other through a many-to-one, joined-loaded relationship. I want to compute a sum grouped by u_id from foo (i.e., many rows in foo can share the same u_id, and for each u_id I want to sum two columns from the table that foo references):
from sqlalchemy import Column, Integer, String, BigInteger, ForeignKey, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, nullable=False, primary_key=True)
    u_id = Column(String(20), nullable=False)
    name = Column(String(30), nullable=False)
    bar_stats = relationship(
        "BarStats", uselist=False, backref="foo", lazy="joined",
        cascade="all, delete-orphan")
class BarStats(Base):
    __tablename__ = 'bar_stats'
    foo_id = Column(Integer, ForeignKey('foo.id'), nullable=False, primary_key=True)  # FK needed for the relationship to Foo
    in_val = Column(BigInteger, nullable=True)
    out_val = Column(BigInteger, nullable=True)
    ts = Column(BigInteger, nullable=False)
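For completeness, the surrounding setup looks roughly like this (the in-memory SQLite engine is only a placeholder for the example; the real database is much larger):
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite:///:memory:')  # placeholder URL for this example
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()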
Now I want to get the sums of in_val and out_val, grouped by u_id. I am trying to cut the computation time by avoiding unnecessary work. Could someone tell me whether there is a better way than these two:
# Treat missing values as 0; if the BarStats row does not exist, usage is 0.
foo_usage = (func.coalesce(BarStats.out_val, 0)
             + func.coalesce(BarStats.in_val, 0)).label('usage_by_foo_id')
stm = session.query(BarStats.foo_id, foo_usage)
stm = stm.filter(foo_usage != None).subquery()
query_foo = session.query(Foo.u_id, func.sum(stm.c.usage_by_foo_id).label('id_usage')).\
    join(stm, Foo.id == stm.c.foo_id).\
    group_by(Foo.u_id)
query_foo = apply_filters(query_foo, filters)  # apply_filters is my own helper
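For reference, this is roughly how I consume the result of that query (the dict name is only illustrative):
# Each row is a (u_id, id_usage) pair.
usage_by_uid = {u_id: id_usage for u_id, id_usage in query_foo}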
The other way I can think of is:
query_foo = session.query(Foo)  # BarStats rows come back via the joined load
foos = query_foo.all()
usage_dict = {}
for i in foos:
    # ts should also exist; otherwise count the usage as 0.
    f = (i.bar_stats.out_val or 0) + (i.bar_stats.in_val or 0) if (i.bar_stats and i.bar_stats.ts) else 0
    usage_dict[i.u_id] = usage_dict.get(i.u_id, 0) + f
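The accumulation part could equally be written with a defaultdict, although that does not reduce the per-row work:
from collections import defaultdict

usage_dict = defaultdict(int)
for i in foos:
    # Same per-row logic as above, without the .get() lookups.
    usage = (i.bar_stats.out_val or 0) + (i.bar_stats.in_val or 0) if (i.bar_stats and i.bar_stats.ts) else 0
    usage_dict[i.u_id] += usage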
Can someone tell me if there is a better way to do this?