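SQLAlchemy: Select the latest row for every ID in a single table with a composite primary key

Time: 2019-06-19 15:44:46

Tags: python sqlalchemy composite-primary-key

I want to do this, but in SQLAlchemy. The only difference is that, rather than only being able to get the latest record, I want to be able to get the latest record at or before a given timestamp. As long as I make sure rows are never deleted, this lets me view the database as it was at a particular timestamp.

Suppose my model looks like this: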

from datetime import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class User(Base):
    __tablename__ = "users"
    id_ = Column("id", Integer, primary_key=True, index=True, nullable=False)
    # Pass the callable itself so the default is evaluated on each insert, not once at import time
    timestamp = Column(DateTime, primary_key=True, index=True, nullable=False, default=datetime.utcnow)
    # other non-primary attributes would go here

I have this users table (timestamps simplified):

| id_ | timestamp |
-------------------
  0     1
  0     4
  0     6
  1     3
  2     7
  2     3

For example, if I request a snapshot at timestamp = 4, I want to get:

| id_ | timestamp |
-------------------
  0     4
  1     3
  2     3
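
To pin down the intended semantics, here is a minimal pure-Python sketch that applies the rule "newest row per id at or before the requested timestamp" to the sample table above. The `snapshot` function and the tuple representation are assumptions made for illustration only; nothing here touches the database.

```python
# Sample rows from the table above, as (id_, timestamp) pairs
rows = [(0, 1), (0, 4), (0, 6), (1, 3), (2, 7), (2, 3)]


def snapshot(rows, at):
    """Return the newest (id_, timestamp) pair per id with timestamp <= at."""
    latest = {}
    for id_, ts in rows:
        # Skip rows newer than the snapshot time, keep the largest remaining timestamp
        if ts <= at and ts > latest.get(id_, float("-inf")):
            latest[id_] = ts
    return sorted(latest.items())


print(snapshot(rows, 4))  # [(0, 4), (1, 3), (2, 3)]
```

Note that the comparison is inclusive (`<=`), which is why id 0 resolves to timestamp 4 rather than 1.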

The best I can come up with is to do it procedurally:

from sqlalchemy import create_engine, desc
from sqlalchemy.orm import sessionmaker
db_engine = create_engine(...)
SessionLocal = sessionmaker(bind=db_engine, ...)
db_session = SessionLocal()

def get_snapshot(timestamp: datetime):
    # Newest versions first, limited to rows at or before the requested time
    all_versions = db_session.query(User).filter(User.timestamp <= timestamp).order_by(desc(User.timestamp))
    snapshot = []
    seen_ids = set()
    for v in all_versions:
        # Keep only the first (newest) version encountered for each id
        if v.id_ not in seen_ids:
            seen_ids.add(v.id_)
            snapshot.append(v)
    return snapshot

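However, this gives me a list of model objects rather than a sqlalchemy.orm.query.Query, so I have to treat the result differently from the standard queries in other parts of my code. Can this all be done in the ORM?

Thanks in advance.

2 Answers:

Answer 0: (score: -1)

Have you tried: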

from sqlalchemy import func

all_versions = db_session.query(User, func.max(User.timestamp)).\
               filter(User.timestamp <= timestamp).\
               group_by(User.id_)

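You can read more about generic functions in SQLAlchemy here.

Answer 1: (score: -1)

An alternative to Matteo's solution is to use a subquery and join it back to the table, which gives the result in my preferred format, a sqlalchemy.orm.query.Query object. Thanks to Matteo for the subquery code: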

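from sqlalchemy import and_, func
# Note: "<" excludes rows stamped exactly at `timestamp`; use "<=" to match the snapshot example in the question
subq = db_session.query(User.id_, func.max(User.timestamp).label("maxtimestamp")).filter(User.timestamp < timestamp).group_by(User.id_).subquery()
q = db_session.query(User).join(subq, and_(User.id_ == subq.c.id, User.timestamp == subq.c.maxtimestamp))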
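
For anyone who wants to try the subquery approach end to end, here is a minimal sketch against an in-memory SQLite database, loaded with the sample rows from the question. It is not the exact code from my application: the `snapshot_query` and `day` helpers, the `uid` label, and the trimmed-down `User` model are assumptions made for illustration, and the filter uses `<=` so that a row stamped exactly at the requested time is included.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, and_, create_engine, func
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 and later
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id_ = Column("id", Integer, primary_key=True)
    timestamp = Column(DateTime, primary_key=True)


def snapshot_query(session, timestamp):
    """Build a Query yielding the newest User row per id at or before `timestamp`."""
    latest = (
        session.query(
            User.id_.label("uid"),  # explicit label so the subquery column name is unambiguous
            func.max(User.timestamp).label("maxtimestamp"),
        )
        .filter(User.timestamp <= timestamp)
        .group_by(User.id_)
        .subquery()
    )
    # Join back to the table so each result item is a full User object
    return session.query(User).join(
        latest,
        and_(User.id_ == latest.c.uid, User.timestamp == latest.c.maxtimestamp),
    )


def day(n):
    """Hypothetical helper: map the simplified timestamps 1..7 to real datetimes."""
    return datetime(2019, 6, n)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

sample = [(0, 1), (0, 4), (0, 6), (1, 3), (2, 7), (2, 3)]
session.add_all([User(id_=i, timestamp=day(t)) for i, t in sample])
session.commit()

snapshot = snapshot_query(session, day(4)).order_by(User.id_)
result = [(u.id_, u.timestamp.day) for u in snapshot]
print(result)  # [(0, 4), (1, 3), (2, 3)]
```

Because `snapshot_query` returns a Query rather than a list, further calls such as `.filter(...)` or `.order_by(...)` can still be chained onto it, as the `order_by` above shows.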

Generated SQL

Note that this may be less efficient than Matteo's solution:

SQL generated by the subquery solution:

SELECT users.id AS users_id, users.timestamp AS users_timestamp, users.name AS users_name, users.notes AS users_notes, users.active AS users_active
FROM users JOIN (SELECT users.id AS id, max(users.timestamp) AS maxtimestamp
FROM users
WHERE users.timestamp < ? GROUP BY users.id) AS anon_1 ON users.id = anon_1.id AND users.timestamp = anon_1.maxtimestamp

SQL generated by Matteo's solution:

SELECT users.id AS users_id, users.timestamp AS users_timestamp, users.name AS users_name, users.notes AS users_notes, users.active AS users_active, max(users.timestamp) AS max_1
FROM users
WHERE users.timestamp <= ? GROUP BY users.id
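
To compare the two statements outside SQLAlchemy, the sketch below runs hand-written equivalents with Python's built-in sqlite3 module on the sample data from the question. The integer timestamps and the `name` values are made up for illustration, and both statements use `<=`. One caveat: the GROUP BY form selects columns that are neither grouped nor aggregated. SQLite documents that, with a single max() in the query, such bare columns are taken from the row holding the maximum, but stricter databases such as PostgreSQL reject that statement, so the join form is likely the more portable of the two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER NOT NULL, timestamp INTEGER NOT NULL, name TEXT,"
    " PRIMARY KEY (id, timestamp))"
)
# (id, timestamp, name); the names are illustrative placeholders
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(0, 1, "a1"), (0, 4, "a4"), (0, 6, "a6"), (1, 3, "b3"), (2, 7, "c7"), (2, 3, "c3")],
)

# Subquery joined back to the table: plain SQL that does not rely on bare columns
join_sql = """
SELECT users.id, users.timestamp, users.name
FROM users JOIN (
    SELECT id, max(timestamp) AS maxtimestamp
    FROM users WHERE timestamp <= ? GROUP BY id
) AS latest ON users.id = latest.id AND users.timestamp = latest.maxtimestamp
ORDER BY users.id
"""

# Single GROUP BY with bare columns: relies on SQLite-specific behaviour
group_sql = """
SELECT users.id, users.timestamp, users.name, max(users.timestamp)
FROM users WHERE users.timestamp <= ? GROUP BY users.id
ORDER BY users.id
"""

join_rows = conn.execute(join_sql, (4,)).fetchall()
group_rows = conn.execute(group_sql, (4,)).fetchall()
print(join_rows)  # [(0, 4, 'a4'), (1, 3, 'b3'), (2, 3, 'c3')]
```

On SQLite the first three columns of `group_rows` match `join_rows`; on other databases that equivalence should not be assumed.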

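Previous content of this answer

@Matteo Di Napoli

Thanks, your post is more or less what I needed. Its output is a sqlalchemy.util._collections.result which, from what I can see, behaves like a tuple. In my application I need the full User object, not just the id/timestamp pair, so what suits me better is: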

from sqlalchemy import func 

all_versions = db_session.query(User, func.max(User.timestamp)).\
               filter(User.timestamp <= timestamp).\
               group_by(User.id_)

which returns something like:

> for i in all_versions: print(i)
...
(<User "my test user v2", id 0, modified 2019-06-19 14:42:16.380381>, datetime.datetime(2019, 6, 19, 14, 42, 16, 380381))
(<User "v2", id 1, modified 2019-06-19 15:53:53.147039>, datetime.datetime(2019, 6, 19, 15, 53, 53, 147039))
(<User "a user", id 2, modified 2019-06-20 12:34:56>, datetime.datetime(2019, 6, 20, 12, 34, 56))

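I can then access the User object with all_versions[n][0], or get a list with l = [i[0] for i in all_versions] (thanks to Matteo Di Napoli for the better syntax).

Ideally, the end result would still be a sqlalchemy.orm.query.Query (like all_versions), but with each item being a User object rather than a tuple-like sqlalchemy.util._collections.result. Is that possible?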