SQLAlchemy secondary join model fails under strange conditions

Time: 2016-09-02 18:37:25

Tags: python-2.7 model sqlalchemy

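I have a strange problem that I simply cannot solve. Essentially, I have a model and system that work perfectly, except under very specific (and seemingly arbitrary) circumstances.

I'll paste the models in a second, but here is the idea. I want certain tables to be versioned. That means that for a given table, I split it into two tables: a Master part that holds the object's natural key, and a Version table that holds all the associated data that may change. Some of my models have relationships, of course, so I create a join table that links the versions.

Here are the models: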

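# Imports and Base are assumed here; the original post does not show them.
import datetime

from sqlalchemy import Column, ForeignKey, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy.types import BOOLEAN, INTEGER, TEXT, TIMESTAMP

Base = declarative_base()

class Versioned(object):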

    def __init__(self, **kwargs):

        super(Versioned, self).__init__(**kwargs)

        self.active = True
        self.created_on = datetime.datetime.now()

    active = Column(BOOLEAN)
    created_on = Column(TIMESTAMP, server_default=func.now())

    def __eq__(self, other):

        return self.__class__ == other.__class__ and \
            all([getattr(self, key) == getattr(other, key)
                for key in self.comparison_keys
                ])

    def __ne__(self, other):

        return not self.__eq__(other)

    comparison_keys = []

class Parent(Base):

    __tablename__ = 'parent'

    id = Column(INTEGER, primary_key=True)

    name = Column(TEXT)

    versions = relationship("ParentVersion", back_populates="master")

    children = relationship("Child", back_populates="parent")

    @property
    def current_version(self):
        active_versions = [v for v in self.versions if v.active==True]

        return active_versions[0] if active_versions else None

class ParentVersion(Versioned, Base):

    __tablename__ = 'parent_version'

    id = Column(INTEGER, primary_key=True)

    master_id = Column(INTEGER, ForeignKey(Parent.id))

    address = Column(TEXT)

    master = relationship("Parent", back_populates="versions")

    children = relationship("ChildVersion",
        secondary=lambda : Parent_Child.__table__
    )

class Child(Base):

    __tablename__ = 'child'

    id = Column(INTEGER, primary_key=True)

    parent_id = Column(INTEGER, ForeignKey(Parent.id))

    name = Column(TEXT)

    versions = relationship("ChildVersion", back_populates="master")

    parent = relationship("Parent", back_populates="children")

    @property
    def current_version(self):
        active_versions = [v for v in self.versions if v.active==True]

        return active_versions[0] if active_versions else None


class ChildVersion(Versioned, Base):

    __tablename__ = 'child_version'

    id = Column(INTEGER, primary_key=True)

    master_id = Column(INTEGER, ForeignKey(Child.id))

    age = Column(INTEGER)

    fav_toy = Column(TEXT)

    master = relationship("Child", back_populates="versions")

    parents = relationship("ParentVersion",
        secondary=lambda: Parent_Child.__table__,
    )

    comparison_keys = [
        'age',
        'fav_toy',
    ]

class Parent_Child(Base):

    __tablename__ = 'parent_child'

    id = Column(INTEGER, primary_key=True)

    parent_id = Column(INTEGER, ForeignKey(ParentVersion.id))
    child_id = Column(INTEGER, ForeignKey(ChildVersion.id))

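OK, so I know that recent versions of SQLAlchemy have some ideas about versioning, and I may be going about this the wrong way. But this fits my use case well. So humor me, and let's assume the model is OK (in a general sense; if a small detail in it is causing the bug, that would be good to fix).

Now suppose I want to insert data. I have data from some source, which I take in and build into models: I split things into Master/Version, assign the child relationships, and assign the version relationships. Now I want to compare it with the data already in my database. For each master object, if I find it, I compare the versions. If the versions differ, I create a new version. The tricky part is that if a Child version differs, I want to insert a new Parent version and update all of its relationships. Maybe the code explains this part better. search_parent is the object I created in the pre-parsing phase. It has a version and child objects, which also have versions.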

parent_conds = [
    getattr(search_parent.__class__, name) == getattr(search_parent, name)
    for name, column in search_parent.__class__.__mapper__.columns.items()
    if not column.primary_key
]

parent_match = session.query(Parent).filter(*parent_conds).first()

# We are going to make a new version
parent_match.current_version.active=False
parent_match.versions.append(search_parent.current_version)

for search_child in search_parent.children[:]:

    search_child.parent_id = parent_match.id

    search_conds = [
        getattr(search_child.__class__, name) == getattr(search_child, name)
        for name, column in search_child.__class__.__mapper__.columns.items()
        if not column.primary_key
    ]

    child_match = session.query(Child).filter(*search_conds).first()

    if child_match.current_version != search_child.current_version:
        # create a new version: deactivate the old one, insert the new
        child_match.current_version.active=False
        child_match.versions.append(search_child.current_version)

    else:
        # copy the old version to point to the new parent version
        children = parent_match.current_version.children

        children.append(child_match.current_version)
        children.remove(search_child.current_version)
        session.expunge(search_child.current_version)

    session.expunge(search_child)

session.expunge(search_parent)

session.add(parent_match)

session.commit()

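OK, once again, this may not be the perfect or even the best approach. But it does work. Except for one thing that I cannot figure out: it doesn't work if I update a child's age attribute to the integer value zero. If the child starts at age 0 and I change it to something else, this works fine. If I start with some non-zero integer and update the age to 0, I get this warning: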

SAWarning: Object of type <ChildVersion> not in session, add operation along 'ParentVersion.children' won't proceed (mapperutil.state_class_str(child), operation, self.prop))

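The updated version gets inserted, but the insert into the parent_child join table does not happen. It isn't that it fails; rather, SQLAlchemy determines that the child object is not there and cannot create the join row. But it is there, and I know it gets inserted.

Again, this only happens when I insert a new version with age = 0. If I insert a new version with any other age, this works exactly as I want.

There are some other strange things about this bug. It doesn't happen if you don't insert enough children (about 12 seems to trigger it), and sometimes it doesn't happen depending on the other attributes. I don't think I fully understand the surface area that causes it.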
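
One detail that may be worth checking, offered as a possibility rather than a confirmed cause: Versioned.__eq__ compares only comparison_keys, and a list's remove() drops the first element that compares equal, which is not necessarily the object passed in. Relationship collections are list-like by default, so children.remove(...) in the else branch would be subject to the same rule. The sketch below is plain Python; the Version class and its owner field are hypothetical stand-ins for ChildVersion, not code from the post.

```python
class Version(object):
    """Hypothetical stand-in for ChildVersion: equality uses comparison_keys only."""

    comparison_keys = ["age", "fav_toy"]

    def __init__(self, owner, age, fav_toy):
        self.owner = owner      # illustrative field, not in the original models
        self.age = age
        self.fav_toy = fav_toy

    def __eq__(self, other):
        return self.__class__ == other.__class__ and all(
            getattr(self, key) == getattr(other, key)
            for key in self.comparison_keys
        )

    def __ne__(self, other):
        return not self.__eq__(other)

    # Defining __eq__ clears __hash__ on Python 3; restore identity hashing.
    __hash__ = object.__hash__


carol = Version("Carol", 0, "med")
timmy = Version("Timmy", 0, "med")
children = [carol, timmy]

# We ask to remove timmy, but carol compares equal and comes first.
children.remove(timmy)

remaining_owners = [v.owner for v in children]
print(remaining_owners)  # ['Timmy']: carol was removed, not timmy
```

If several child versions share the same age and fav_toy, a removal like this can hit a different child's version, which would be consistent with the bug depending on the other attributes and on how many children are inserted.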

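Thank you for taking the time to read this. I have a complete working demo with full source data that I'd be happy to share; it just takes some setup, so I don't know whether it is appropriate for this post. I hope someone has an idea of what to look at, because at this point I am completely out of ideas.

EDIT: Here is the full stack trace leading to the warning.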

  File "repro.py", line 313, in <module>
  load_data(session, second_run)
File "repro.py", line 293, in load_data
  session.commit()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 801, in commit
  self.transaction.commit()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 392, in commit
  self._prepare_impl()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 372, in _prepare_impl
  self.session.flush()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2019, in flush
  self._flush(objects)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2101, in _flush
  flush_context.execute()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
  rec.execute(self)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 487, in execute
  self.dependency_processor.process_saves(uow, states)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1053, in process_saves
  False, uowcommit, "add"):
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1154, in _synchronize
  (mapperutil.state_class_str(child), operation, self.prop))
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 1297, in warn
  warnings.warn(msg, exc.SAWarning, stacklevel=2)
File "repro.py", line 10, in warn_with_traceback
  traceback.print_stack()
/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py:1154: SAWarning: Object of type <ChildVersion> not in session, add operation along 'ParentVersion.children' won't proceed
(mapperutil.state_class_str(child), operation, self.prop))

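EDIT2: Here is a gist with a Python file that you can run to see the strange behavior. https://gist.github.com/jbouricius/2ede420fb1f7a2deec9f557c76ced7f9

1 Answer:

Answer 0 (score: 1)

The reason you are getting this error is that you are inadvertently adding objects to the session.

Here is an MVCE: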

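# Imports assumed; the models are the ones defined in the question.
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

engine = create_engine("sqlite://", echo=False)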


def get_data():
    children = [
        Child(name="Carol", versions=[ChildVersion(age=0, fav_toy="med")]),
        Child(name="Timmy", versions=[ChildVersion(age=0, fav_toy="med")]),
    ]
    return Parent(
        name="Zane", children=children,
        versions=[
            ParentVersion(
                address="123 Fake St",
                children=[v for child in children for v in child.versions]
            )
        ]
    )


def main():
    Base.metadata.create_all(engine)

    session = Session(engine)
    parent_match = get_data()
    session.add(parent_match)
    session.commit()

    with session.no_autoflush:
        search_parent = get_data()

        parent_match.versions.append(search_parent.current_version)
        for search_child in search_parent.children[:]:
            child_match = next(c for c in parent_match.children if c.name == search_child.name)

            if child_match.current_version != search_child.current_version:
                child_match.versions.append(search_child.current_version)
            else:
                session.expunge(search_child.current_version)

            session.expunge(search_child)

        session.expunge(search_parent)
        session.commit()

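As an aside: this is the kind of thing you need to provide in the question itself. Providing a tarball with instructions is not the best way to get answers.

The line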

parent_match.versions.append(search_parent.current_version)

not only adds search_parent.current_version but also search_parent, and through search_parent.current_version it adds all related objects, including the child versions of the other children. Given that you later expunge the other related objects to keep them from being added to the session, I conclude that you only want to add search_parent.current_version and none of the other related objects. Because your relationships are cyclical in nature, you need to take care to extract only the objects you want before adding them. Here is the fixed MVCE:

with session.no_autoflush:
    search_parent = get_data()
    current_parent_version = search_parent.current_version
    search_parent.versions.remove(current_parent_version)
    current_parent_version.children = []  # <--- this is key
    for search_child in search_parent.children[:]:
        child_match = next(c for c in parent_match.children if c.name == search_child.name)

        if child_match.current_version != search_child.current_version:
            current_child_version = search_child.current_version
            search_child.versions.remove(current_child_version)
            child_match.versions.append(current_child_version)
            current_parent_version.children.append(current_child_version)

    parent_match.versions.append(current_parent_version)
    session.commit()
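
The cascade behavior described above can be seen in isolation. The following is a minimal sketch, assuming SQLAlchemy is installed and using three hypothetical models (Owner, Item, Tag) that are not part of the question. It shows that appending a new object to a relationship of an object already in the session adds it to the session too, along with whatever it references through its own relationships.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, relationship

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 and later
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


# Hypothetical models, for illustration only.
class Owner(Base):
    __tablename__ = "owner"
    id = Column(Integer, primary_key=True)
    items = relationship("Item", back_populates="owner")


class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    owner_id = Column(Integer, ForeignKey("owner.id"))
    owner = relationship("Owner", back_populates="items")
    tags = relationship("Tag", back_populates="item")


class Tag(Base):
    __tablename__ = "tag"
    id = Column(Integer, primary_key=True)
    item_id = Column(Integer, ForeignKey("item.id"))
    item = relationship("Item", back_populates="tags")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

owner = Owner()
session.add(owner)

tag = Tag()
item = Item(tags=[tag])
before = (item in session, tag in session)  # neither has been added yet

# Appending to a relationship of an object that is in the session
# cascades the add to the item and, through item.tags, to the tag.
owner.items.append(item)
after = (item in session, tag in session)

print(before, after)
```

This is why the fixed MVCE empties current_parent_version.children before appending the version to parent_match.versions: anything still reachable from the appended object would otherwise be pulled into the session with it.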
