Querying for entries between multiple BETWEEN clauses

Date: 2017-06-21 15:30:28

Tags: python postgresql python-3.x sqlalchemy

My ORM looks like this:

from sqlalchemy import Column, Integer, String, TIMESTAMP, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Data(Base):
    __tablename__ = 'data'
    id = Column(Integer, primary_key=True)
    value = Column(String(8), nullable=False)
    timestamp = Column(TIMESTAMP, nullable=False)
    object = Column(Integer, ForeignKey('object.id'))

class Object(Base):
    __tablename__ = 'object'
    id = Column(Integer, primary_key=True)
    version = Column(String(8), nullable=False)
    setting = Column(String(8), nullable=False)
    history = relationship('ObjectHistory', backref='history')

class ObjectHistory(Base):
    __tablename__ = 'object_history'
    id = Column(Integer, primary_key=True)
    version = Column(String(8), nullable=False)
    setting = Column(String(8), nullable=False)
    start = Column(TIMESTAMP, nullable=False)
    end = Column(TIMESTAMP)
    object = Column(Integer, ForeignKey('object.id'))

My data looks like this:

from sqlalchemy import create_engine
from sqlalchemy.orm.session import sessionmaker
import datetime

engine = create_engine('postgresql://username:password@localhost/')
Session = sessionmaker(bind=engine)
session = Session()

Base.metadata.create_all(engine)

obj = Object(version='0001', setting='some')

# populate database
data = [
    obj,
    Data(value='a', timestamp=datetime.datetime(2017,6,21,12,0,0), object=obj.id),
    Data(value='b', timestamp=datetime.datetime(2017,6,21,13,0,0), object=obj.id),
    Data(value='c', timestamp=datetime.datetime(2017,6,21,14,0,0), object=obj.id),
    Data(value='d', timestamp=datetime.datetime(2017,6,21,15,0,0), object=obj.id),
    ObjectHistory(version='0001', setting='any', start=datetime.datetime(2017,6,21,11,30,0), end=datetime.datetime(2017,6,21,12,30,0)),
    ObjectHistory(version='0002', setting='some', start=datetime.datetime(2017,6,21,12,30,0), end=datetime.datetime(2017,6,21,13,30,0)),
    ObjectHistory(version='0001', setting='some', start=datetime.datetime(2017,6,21,13,30,0), end=None),
]

session.add_all(data)
session.commit()

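I want to query all Data that was recorded while the Object had a certain version. As you can see, the same version can occur multiple times in the history, and I would like to get all data entries that were made with the specific version.

I came up with the following: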

version = '0001'

# get the start and end timestamps during which object had this version
between_these = session.query(ObjectHistory.start, ObjectHistory.end) \
    .filter(ObjectHistory.version == version)

# and then somehow query Data between these timestamps
# so that data contains the Data rows with values 'a', 'c', and 'd'
# this won't work
data = session.query(Data) \
    .filter(Data.timestamp.between(between_these.start, between_these.end)).all()

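However, I don't think this approach works, because there can be multiple start and end timestamps. I think I need to use or_ (http://docs.sqlalchemy.org/en/latest/core/sqlelement.html#sqlalchemy.sql.expression.or_), but I can't figure out how to apply it in this case. Is it possible, and if so, how?

Edit: The desired output is the Data rows recorded while the version of Data.object was '0001'. In the example, these are the Data rows whose Data.value is 'a', 'c', and 'd'.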

2 answers:

Answer 0: (score: 2)

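The problem here is that the between_these object is of type Query, i.e. it has not been executed yet and has no start / end attributes.

We can do the following:

  • create a subquery from the between_these object and then use it in the filter,
  • use the PostgreSQL COALESCE function to handle the case where ObjectHistory.end is NULL.

So it could look like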

between_these = (session.query(ObjectHistory.start,
                               ObjectHistory.end)
                 .filter(ObjectHistory.version == '0001')
                 .subquery('between_these'))

data = (session.query(Data)
        .filter(Data.timestamp.between(between_these.c.start,
                                       func.coalesce(between_these.c.end,
                                                     datetime.max)))
        .all())

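This will work, but we will never know which object_history record each filtered data record was matched against.

If you want each filtered Data object together with the ObjectHistory object it was filtered by, we can query both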

data = (session.query(Data, ObjectHistory)
        .filter(ObjectHistory.version == '0001')
        .filter(Data.timestamp.between(ObjectHistory.start,
                                       func.coalesce(ObjectHistory.end,
                                                     datetime.max)))
        .all())

(we don't need the between_these object here)

Or, if we additionally want to know the time interval

data = (session.query(Data, ObjectHistory.start, ObjectHistory.end)
        .filter(ObjectHistory.version == '0001')
        .filter(Data.timestamp.between(ObjectHistory.start,
                                       func.coalesce(ObjectHistory.end,
                                                     datetime.max)))
        .all())
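
As a cross-check, the range-join idea behind the queries above can be sketched in plain SQL using Python's standard-library sqlite3 module. This is only an illustration under assumptions: it uses an in-memory SQLite database instead of PostgreSQL, stores timestamps as ISO-8601 strings, substitutes the literal '9999-12-31 23:59:59' for datetime.max, and is not the exact SQL that SQLAlchemy emits.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE data (
        id INTEGER PRIMARY KEY,
        value TEXT NOT NULL,
        timestamp TEXT NOT NULL
    );
    CREATE TABLE object_history (
        id INTEGER PRIMARY KEY,
        version TEXT NOT NULL,
        "start" TEXT NOT NULL,
        "end" TEXT
    );
''')

# Same sample rows as in the question, with timestamps as ISO strings.
conn.executemany(
    'INSERT INTO data (value, timestamp) VALUES (?, ?)',
    [('a', '2017-06-21 12:00:00'),
     ('b', '2017-06-21 13:00:00'),
     ('c', '2017-06-21 14:00:00'),
     ('d', '2017-06-21 15:00:00')])
conn.executemany(
    'INSERT INTO object_history (version, "start", "end") VALUES (?, ?, ?)',
    [('0001', '2017-06-21 11:30:00', '2017-06-21 12:30:00'),
     ('0002', '2017-06-21 12:30:00', '2017-06-21 13:30:00'),
     ('0001', '2017-06-21 13:30:00', None)])

# Join each data row to every history interval that contains its timestamp;
# COALESCE turns an open-ended interval (NULL end) into a far-future bound.
rows = conn.execute('''
    SELECT d.value, h."start", h."end"
    FROM data AS d
    JOIN object_history AS h
      ON d.timestamp BETWEEN h."start"
                         AND COALESCE(h."end", '9999-12-31 23:59:59')
    WHERE h.version = ?
    ORDER BY d.timestamp
''', ('0001',)).fetchall()

values = [row[0] for row in rows]
print(values)
```

Note that BETWEEN is inclusive on both ends, so a row stamped exactly on a shared boundary (12:30:00 here) would fall into both adjacent intervals.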

Test

First, the imports, and initializers added to the models

from datetime import datetime

from sqlalchemy import Column, Integer, String, TIMESTAMP, create_engine, func
from sqlalchemy.engine.url import make_url
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()


class Data(Base):
    __tablename__ = 'data'
    id = Column(Integer, primary_key=True)
    value = Column(String(8), nullable=False)
    timestamp = Column(TIMESTAMP, nullable=False)

    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp


class Object(Base):
    __tablename__ = 'object'
    id = Column(Integer, primary_key=True)
    version = Column(String(8), nullable=False)
    setting = Column(String(8), nullable=False)

    def __init__(self, version, setting):
        self.version = version
        self.setting = setting


class ObjectHistory(Base):
    __tablename__ = 'object_history'
    id = Column(Integer, primary_key=True)
    version = Column(String(8), nullable=False)
    setting = Column(String(8), nullable=False)
    start = Column(TIMESTAMP, nullable=False)
    end = Column(TIMESTAMP)

    def __init__(self, version, setting, start, end):
        self.version = version
        self.setting = setting
        self.start = start
        self.end = end

Then initialize the database & create a session

db_uri = make_url('postgresql://username:password@host:5432/database')
engine = create_engine(db_uri)
Base.metadata.create_all(bind=engine)
session_factory = sessionmaker(bind=engine)
session = session_factory()

After that, we add the test data to the database

session.add_all([
    # first `Data` object
    Data(value='a',
         timestamp=datetime(2017, 6, 21, 12, 0, 0)),
    # second `Data` object
    Data(value='b',
         timestamp=datetime(2017, 6, 21, 13, 0, 0)),
    # third `Data` object
    Data(value='c',
         timestamp=datetime(2017, 6, 21, 14, 0, 0)),
    # fourth `Data` object
    Data(value='d',
         timestamp=datetime(2017, 6, 21, 15, 0, 0)),
    Object(version='0001',
           setting='some'),
    ObjectHistory(version='0001',
                  setting='any',
                  start=datetime(2017, 6, 21, 11, 30, 0),
                  end=datetime(2017, 6, 21, 12, 30, 0)),
    ObjectHistory(version='0002',
                  setting='some',
                  start=datetime(2017, 6, 21, 12, 30, 0),
                  end=datetime(2017, 6, 21, 13, 30, 0)),
    ObjectHistory(version='0001',
                  setting='some',
                  start=datetime(2017, 6, 21, 13, 30, 0),
                  end=None)])
session.commit()

Then build the query and fetch the results

between_these = (session.query(ObjectHistory.start,
                               ObjectHistory.end)
                 .filter(ObjectHistory.version == '0001')
                 .subquery('between_these'))

data = (session.query(Data)
        .filter(Data.timestamp.between(between_these.c.start,
                                       func.coalesce(between_these.c.end,
                                                     datetime.max)))
        .all())

Finally, the assertions

assert len(data) == 3
assert all(datum.value in {'a', 'c', 'd'}
           for datum in data)

As we can see, data consists of the first, third, and fourth Data objects.
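
The or_ approach mentioned in the question can also be made to work by fetching the intervals first and building one BETWEEN clause per interval. The following is a minimal sketch, not a definitive implementation. It assumes an in-memory SQLite engine and trimmed-down models (DateTime instead of TIMESTAMP, no Object table) so that it runs without a PostgreSQL server.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, or_
from sqlalchemy.orm import sessionmaker

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Data(Base):
    __tablename__ = 'data'
    id = Column(Integer, primary_key=True)
    value = Column(String(8), nullable=False)
    timestamp = Column(DateTime, nullable=False)


class ObjectHistory(Base):
    __tablename__ = 'object_history'
    id = Column(Integer, primary_key=True)
    version = Column(String(8), nullable=False)
    start = Column(DateTime, nullable=False)
    end = Column(DateTime)


# In-memory SQLite stands in for PostgreSQL (assumption for this sketch).
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add_all([
    Data(value='a', timestamp=datetime(2017, 6, 21, 12, 0, 0)),
    Data(value='b', timestamp=datetime(2017, 6, 21, 13, 0, 0)),
    Data(value='c', timestamp=datetime(2017, 6, 21, 14, 0, 0)),
    Data(value='d', timestamp=datetime(2017, 6, 21, 15, 0, 0)),
    ObjectHistory(version='0001', start=datetime(2017, 6, 21, 11, 30, 0),
                  end=datetime(2017, 6, 21, 12, 30, 0)),
    ObjectHistory(version='0002', start=datetime(2017, 6, 21, 12, 30, 0),
                  end=datetime(2017, 6, 21, 13, 30, 0)),
    ObjectHistory(version='0001', start=datetime(2017, 6, 21, 13, 30, 0),
                  end=None),
])
session.commit()

# Step 1: execute the interval query so that real start/end values exist.
intervals = (session.query(ObjectHistory.start, ObjectHistory.end)
             .filter(ObjectHistory.version == '0001')
             .all())

# Step 2: one BETWEEN clause per interval; an open end becomes datetime.max.
clauses = [Data.timestamp.between(start, end or datetime.max)
           for start, end in intervals]

# Step 3: OR the clauses together (skip the query if nothing matched).
data = (session.query(Data)
        .filter(or_(*clauses))
        .order_by(Data.timestamp)
        .all()) if clauses else []

values = [datum.value for datum in data]
print(values)
```

Compared with the subquery version, this costs two round trips to the database and the WHERE clause grows with the number of intervals, but each Data row is returned at most once even if intervals overlap.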

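Answer 1: (score: 0)

The problem here is that there are no obvious relationships between your model tables.

To query the data for the periods when an object had a particular version, there must be a relationship from Object -> ObjectHistory, and to query the data associated with a particular version of an Object, there must be a relationship from ObjectHistory -> Data.

The changed schema should look like this: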

from sqlalchemy import Column, Integer, String, TIMESTAMP, ForeignKey
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Object(Base):
    __tablename__ = 'object'
    id = Column(Integer, primary_key=True)
    version = Column(String(8), nullable=False)
    setting = Column(String(8), nullable=False)

class ObjectHistory(Base):
    __tablename__ = 'object_history'
    id = Column(Integer, primary_key=True)
    object_id = Column(Integer, ForeignKey(Object.id))
    version = Column(String(8), nullable=False)
    setting = Column(String(8), nullable=False)
    start = Column(TIMESTAMP, nullable=False)
    end = Column(TIMESTAMP, nullable=False)

class Data(Base):
    __tablename__ = 'data'
    id = Column(Integer, primary_key=True)
    object_history_id = Column(Integer, ForeignKey(ObjectHistory.id))
    value = Column(String(8), nullable=False)
    timestamp = Column(TIMESTAMP, nullable=False)

Then you can write the corresponding SELECT query:

version = '0001'
object_id = 1
stmt = session.query(Object.id,
                     ObjectHistory.version,
                     ObjectHistory.start,
                     ObjectHistory.end,
                     Data.id,
                     Data.value,
                     Data.timestamp) \
    .filter(Object.id == ObjectHistory.object_id) \
    .filter(Data.object_history_id == ObjectHistory.id) \
    .filter(Object.id == object_id) \
    .filter(ObjectHistory.version == version)

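However, this is only one way to set up such a data model.

Alternatively, Object -> Data could be related and Object -> ObjectHistory related as above, in which case matching Data rows to ObjectHistory rows would need the BETWEEN operator.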
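
The alternative layout can be sketched in plain SQL with the standard-library sqlite3 module. This is only an illustration under assumptions that are not part of the answer: an in-memory SQLite database, timestamps stored as ISO-8601 strings, a nullable "end" column handled with COALESCE, and a second object (id 2, with a data row 'x') added purely to show that the object_id condition keeps one object's data from matching another object's history.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE object (id INTEGER PRIMARY KEY);
    CREATE TABLE object_history (
        id INTEGER PRIMARY KEY,
        object_id INTEGER NOT NULL REFERENCES object (id),
        version TEXT NOT NULL,
        "start" TEXT NOT NULL,
        "end" TEXT
    );
    CREATE TABLE data (
        id INTEGER PRIMARY KEY,
        object_id INTEGER NOT NULL REFERENCES object (id),
        value TEXT NOT NULL,
        timestamp TEXT NOT NULL
    );
    INSERT INTO object (id) VALUES (1), (2);
''')

conn.executemany(
    'INSERT INTO object_history (object_id, version, "start", "end") '
    'VALUES (?, ?, ?, ?)',
    [(1, '0001', '2017-06-21 11:30:00', '2017-06-21 12:30:00'),
     (1, '0002', '2017-06-21 12:30:00', '2017-06-21 13:30:00'),
     (1, '0001', '2017-06-21 13:30:00', None),
     # Hypothetical second object that never had version 0001.
     (2, '0002', '2017-06-21 11:00:00', None)])
conn.executemany(
    'INSERT INTO data (object_id, value, timestamp) VALUES (?, ?, ?)',
    [(1, 'a', '2017-06-21 12:00:00'),
     (1, 'b', '2017-06-21 13:00:00'),
     (1, 'c', '2017-06-21 14:00:00'),
     (1, 'd', '2017-06-21 15:00:00'),
     (2, 'x', '2017-06-21 14:00:00')])

# Match data to history rows of the same object whose interval contains
# the data timestamp, then keep only the requested version.
values = [row[0] for row in conn.execute('''
    SELECT d.value
    FROM data AS d
    JOIN object_history AS h
      ON h.object_id = d.object_id
     AND d.timestamp BETWEEN h."start"
                         AND COALESCE(h."end", '9999-12-31 23:59:59')
    WHERE h.version = ?
    ORDER BY d.timestamp, d.value
''', ('0001',))]
print(values)
```

Row 'x' is excluded even though its timestamp lies inside one of object 1's '0001' intervals, because the join also requires matching object_id values.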