Duplicating model instances and their related objects in Django / Algorithm for recursively duplicating an object

Time: 2009-01-12 21:58:56

Tags: python django django-models duplicates

I have models for Books, Chapters and Pages. They are all written by a User:

from django.db import models

class Book(models.Model):
    author = models.ForeignKey('auth.User')

class Chapter(models.Model):
    author = models.ForeignKey('auth.User')
    book = models.ForeignKey(Book)

class Page(models.Model):
    author = models.ForeignKey('auth.User')
    book = models.ForeignKey(Book)
    chapter = models.ForeignKey(Chapter)

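What I'd like to do is duplicate an existing Book and update its User to someone else. The wrinkle is that I would also like to duplicate all related model instances of the Book: all of its Chapters and Pages as well!

Things get really tricky when looking at a Page: not only will the new Pages need to have their author field updated, they will also need to point to the new Chapter objects!

Does Django support an out-of-the-box way of doing this? What would a generic algorithm for duplicating a model look like?

Cheers,

John


Update

The classes given above are just an example to illustrate the problem I'm having!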

17 answers:

Answer 0 (score: 16)

This no longer works in Django 1.3, as CollectedObjects was removed. See changeset 14507.

I posted my solution on Django Snippets. It is based heavily on the django.db.models.query.CollectedObjects code used for deleting objects:

from django.db.models.query import CollectedObjects
from django.db.models.fields.related import ForeignKey

def duplicate(obj, value, field):
    """
    Duplicate all related objects of `obj` setting
    `field` to `value`. If one of the duplicate
    objects has an FK to another duplicate object
    update that as well. Return the duplicate copy
    of `obj`.  
    """
    collected_objs = CollectedObjects()
    obj._collect_sub_objects(collected_objs)
    related_models = collected_objs.keys()
    root_obj = None
    # Traverse the related models in reverse deletion order.    
    for model in reversed(related_models):
        # Find all FKs on `model` that point to a `related_model`.
        fks = []
        for f in model._meta.fields:
            if isinstance(f, ForeignKey) and f.rel.to in related_models:
                fks.append(f)
        # Replace each `sub_obj` with a duplicate.
        sub_obj = collected_objs[model]
        for pk_val, obj in sub_obj.iteritems():
            for fk in fks:
                fk_value = getattr(obj, "%s_id" % fk.name)
                # If this FK has been duplicated then point to the duplicate.
                if fk_value in collected_objs[fk.rel.to]:
                    dupe_obj = collected_objs[fk.rel.to][fk_value]
                    setattr(obj, fk.name, dupe_obj)
            # Duplicate the object and save it.
            obj.id = None
            setattr(obj, field, value)
            obj.save()
            if root_obj is None:
                root_obj = obj
    return root_obj
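
Stripped of the Django internals, the idea in the snippet above is to save parents before children and to remember which new primary key each old one was copied to, so that foreign keys can be re-pointed at the copies. A minimal sketch in plain Python, with dicts standing in for database rows; `TABLES`, `FKS` and `duplicate_rows` are hypothetical names for illustration only, not Django API:

```python
import itertools

# Hypothetical in-memory "database": table name -> {pk: row dict}.
TABLES = {
    "book": {1: {"author": "john"}},
    "chapter": {10: {"author": "john", "book": 1}},
    "page": {100: {"author": "john", "book": 1, "chapter": 10}},
}
# Which columns are foreign keys, and which table each one points to.
FKS = {"chapter": {"book": "book"}, "page": {"book": "book", "chapter": "chapter"}}
_next_pk = itertools.count(1000)


def duplicate_rows(collected, order, field, value):
    """Copy the rows in `collected` (table -> list of pks), parents first.

    `order` lists the tables so that every FK target precedes its referrers.
    Returns a map {(table, old_pk): new_pk}.
    """
    pk_map = {}
    for table in order:
        for old_pk in collected.get(table, []):
            row = dict(TABLES[table][old_pk])  # shallow copy of the row
            for column, target in FKS.get(table, {}).items():
                # Re-point the FK only if its target was duplicated too.
                row[column] = pk_map.get((target, row[column]), row[column])
            row[field] = value
            new_pk = next(_next_pk)
            TABLES[table][new_pk] = row
            pk_map[(table, old_pk)] = new_pk
    return pk_map


pk_map = duplicate_rows(
    {"book": [1], "chapter": [10], "page": [100]},
    order=["book", "chapter", "page"],
    field="author",
    value="jane",
)
new_page = TABLES["page"][pk_map[("page", 100)]]
```

The copied page ends up pointing at the copied chapter and book, while the original rows are left untouched.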

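Answer 1 (score: 9)

Here's an easy way to copy your object.

Basically:

(1) Set the id of your original object to None:

book_to_copy.id = None

(2) Change the 'author' attribute and save the object:

book_to_copy.author = new_author

book_to_copy.save()

(3) An INSERT is performed instead of an UPDATE.

(It doesn't address changing the author in the Pages. I agree with the comments about restructuring the models.)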

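Answer 2 (score: 8)

I haven't tried it in Django, but Python's deepcopy might just work for you.

Edit:

You can define custom copy behavior for your models if you implement the functions: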

__copy__() and __deepcopy__()
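
As a plain-Python illustration of that hook (not tested against real Django models, whose instances carry extra internal state), a class can define `__deepcopy__` so that its copies come back without a primary key. `Node` is a hypothetical stand-in class, not a Django model:

```python
import copy


class Node:
    """Hypothetical stand-in for a model instance with child objects."""

    def __init__(self, pk, name, children=None):
        self.pk = pk
        self.name = name
        self.children = children or []

    def __deepcopy__(self, memo):
        # Build the copy without a primary key, so that a later save
        # would be treated as a new record rather than an update.
        clone = Node(pk=None, name=self.name)
        memo[id(self)] = clone  # register before recursing, in case of cycles
        clone.children = copy.deepcopy(self.children, memo)
        return clone


book = Node(1, "book", [Node(2, "chapter", [Node(3, "page")])])
book_copy = copy.deepcopy(book)
```

Every object in `book_copy` has `pk` set to `None`, and the original tree keeps its keys.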

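Answer 3 (score: 7)

This is an edit of http://www.djangosnippets.org/snippets/1282/

It is now compatible with the Collector, which replaced CollectedObjects in 1.3.

I didn't test this very heavily, but I did test it on an object with about 20,000 sub-objects, though with only about three levels of foreign-key depth. Use at your own risk, of course.

For the ambitious person reading this post: consider subclassing Collector (or copying the entire class, to remove the dependency on this unpublished part of the Django API) into a class called something like "DuplicateCollector", and writing a .duplicate method that works similarly to the .delete method. That would solve this problem in a real way.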

from django.db.models.deletion import Collector
from django.db.models.fields.related import ForeignKey

def duplicate(obj, value=None, field=None, duplicate_order=None):
    """
    Duplicate all related objects of obj setting
    field to value. If one of the duplicate
    objects has an FK to another duplicate object
    update that as well. Return the duplicate copy
    of obj.
    duplicate_order is a list of models which specify how
    the duplicate objects are saved. For complex objects
    this can matter. Check to save if objects are being
    saved correctly and if not just pass in related objects
    in the order that they should be saved.
    """
    collector = Collector({})
    collector.collect([obj])
    collector.sort()
    related_models = collector.data.keys()
    data_snapshot =  {}
    for key in collector.data.keys():
        data_snapshot.update({ key: dict(zip([item.pk for item in collector.data[key]], [item for item in collector.data[key]])) })
    root_obj = None

    # Sometimes it's good enough just to save in reverse deletion order.
    if duplicate_order is None:
        duplicate_order = reversed(related_models)

    for model in duplicate_order:
        # Find all FKs on model that point to a related_model.
        fks = []
        for f in model._meta.fields:
            if isinstance(f, ForeignKey) and f.rel.to in related_models:
                fks.append(f)
        # Replace each `sub_obj` with a duplicate.
        if model not in collector.data:
            continue
        sub_objects = collector.data[model]
        for obj in sub_objects:
            for fk in fks:
                fk_value = getattr(obj, "%s_id" % fk.name)
                # If this FK has been duplicated then point to the duplicate.
                fk_rel_to = data_snapshot[fk.rel.to]
                if fk_value in fk_rel_to:
                    dupe_obj = fk_rel_to[fk_value]
                    setattr(obj, fk.name, dupe_obj)
            # Duplicate the object and save it.
            obj.id = None
            if field is not None:
                setattr(obj, field, value)
            obj.save()
            if root_obj is None:
                root_obj = obj
    return root_obj

Edit: removed a debugging "print" statement.
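
When reverse deletion order is not good enough, one way to work out a `duplicate_order` is a topological sort of the foreign-key graph, so that each model is saved only after the models it points to. A sketch using the standard library's `graphlib` (Python 3.9+). Plain strings stand in for the model classes that `duplicate` actually expects, and `DEPENDS_ON` is a hypothetical, hand-written description of the question's models rather than something read from Django:

```python
from graphlib import TopologicalSorter

# Each model maps to the set of models its foreign keys point to.
DEPENDS_ON = {
    "Book": set(),
    "Chapter": {"Book"},
    "Page": {"Book", "Chapter"},
}

# static_order() yields a node only after all of its dependencies.
duplicate_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

For these three models the only valid order is Book, then Chapter, then Page.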

Answer 4 (score: 4)

In Django 1.5 this works for me:

thing.id = None
thing.pk = None
thing.save()

Answer 5 (score: 4)

The CollectedObjects snippet above no longer works, but it can be made to work with the following modification:

from django.contrib.admin.util import NestedObjects
from django.db import DEFAULT_DB_ALIAS

collector = NestedObjects(using=DEFAULT_DB_ALIAS)

instead of CollectedObjects.

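Answer 6 (score: 3)

If there are just a couple of copies in the database you're building, I've found you can simply use the back button in the admin interface, change the necessary fields and save the instance again. This has worked for me in cases where, for instance, I need to build a "gimlet" and a "vodka gimlet" cocktail where the only difference is the name and one ingredient. Obviously this requires a little foresight about the data and isn't as powerful as overriding Django's copy/deepcopy, but it may do the trick for some.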

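Answer 7 (score: 3)

Django does have a built-in way to duplicate an object via the admin, as answered here: In the Django admin interface, is there a way to duplicate an item?

Answer 8 (score: 2)

Simple non-generic way

The proposed solutions didn't work for me, so I went the simple, not clever way. This is only useful for simple cases.

For a model with the following structure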

Book
 |__ CroppedFace
 |__ Photo
      |__ AwsReco
            |__ AwsLabel
            |__ AwsFace
                  |__ AwsEmotion

this works:

def duplicate_book(book: Book, new_user: MyUser):
    # AwsEmotion, AwsFace, AwsLabel, AwsReco, Photo, CroppedFace, Book

    old_cropped_faces = book.croppedface_set.all()
    old_photos = book.photo_set.all()

    book.pk = None
    book.user = new_user
    book.save()

    for cf in old_cropped_faces:
        cf.pk = None
        cf.book = book
        cf.save()

    for photo in old_photos:
        photo.pk = None
        photo.book = book
        photo.save()

        if hasattr(photo, 'awsreco'):
            reco = photo.awsreco
            old_aws_labels = reco.awslabel_set.all()
            old_aws_faces = reco.awsface_set.all()
            reco.pk = None
            reco.photo = photo
            reco.save()

            for label in old_aws_labels:
                label.pk = None
                label.reco = reco
                label.save()

            for face in old_aws_faces:
                old_aws_emotions = face.awsemotion_set.all()
                face.pk = None
                face.reco = reco
                face.save()

                for emotion in old_aws_emotions:
                    emotion.pk = None
                    emotion.aws_face = face
                    emotion.save()
    return book

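Answer 9 (score: 1)

I think you'd be happier with a simpler data model, too.

Is it really true that a Page can be in some Chapter but belong to a different Book?

userMe = User(username="me")
userYou = User(username="you")
bookMyA = Book(author=userMe)
bookYourB = Book(author=userYou)

chapterA1 = Chapter(book=bookMyA, author=userYou)  # "me" owns the Book, "you" owns the chapter?

chapterB2 = Chapter(book=bookYourB, author=userMe)  # "you" owns the book, "me" owns the chapter?

page1 = Page(book=bookMyA, chapter=chapterB2, author=userMe)  # Book and Author agree, chapter doesn't?

It seems like your model is too complex.

I think you'd be happier with something simpler. I'm only guessing here, since I don't know your whole problem.

class Book(models.Model):
    name = models.CharField(...)

class Chapter(models.Model):
    name = models.CharField(...)
    book = models.ForeignKey(Book)

class Page(models.Model):
    author = models.ForeignKey('auth.User')
    chapter = models.ForeignKey(Chapter)

Each Page has distinct authorship. Each Chapter, then, has a collection of authors, as does the Book. Now you can duplicate the Book, Chapters and Pages, assigning the cloned Pages to the new author.

Indeed, you might want a many-to-many relationship between Page and Chapter, which would let you have multiple copies of just the Page without cloning the Book and Chapter.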

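Answer 10 (score: 1)

There is an option to create a duplicate/clone/save-as-new in the Django admin.

  1. Create a ModelAdmin class in admin.py for the model you want to clone
  2. Set the save_as option on that class, for example:
 @admin.register(Book)
 class BookAdmin(admin.ModelAdmin):
     save_as = True

This creates a "Save as new" button in the admin panel that clones the model object along with its related fields.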

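Answer 11 (score: 1)

I was not satisfied with any of the answers for Django 2.1.2, so I created a generic way of performing a deep copy of a database model, heavily based on the answers above.

The key difference from the answers above is that ForeignKey no longer has an attribute called rel, so it has to be changed to f.remote_field.model and so on.

Furthermore, because it is difficult to know the order in which the database models should be copied, I created a simple queueing system that pushes the current model to the end of the list if it could not be copied successfully. The code is posted below: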

import queue
from django.contrib.admin.utils import NestedObjects
from django.db.models.fields.related import ForeignKey

def duplicate(obj, field=None, value=None, max_retries=5):
    # Use the Nested Objects collector to retrieve the related models
    collector = NestedObjects(using='default')
    collector.collect([obj])
    related_models = list(collector.data.keys())

    # Create an object to map old primary keys to new ones
    data_snapshot = {}
    model_queue = queue.Queue()
    for key in related_models:
        data_snapshot.update(
            {key: {item.pk: None for item in collector.data[key]}}
        )
        model_queue.put(key)

    # For each of the models in related models copy their instances
    root_obj = None
    attempt_count = 0
    while not model_queue.empty():
        model = model_queue.get()
        root_obj, success = copy_instances(
            model, related_models, collector, data_snapshot, root_obj, field, value
        )

        # If the copy is not a success, it probably means that not
        # all the related fields for the model have been copied yet.
        # The current model is therefore pushed to the end of the list to be copied last
        if not success:

            # If the last model is unsuccessful or the number of max retries is reached, raise an error
            if model_queue.empty() or attempt_count > max_retries:
                raise DuplicationError(model)
            model_queue.put(model)
            attempt_count += 1
    return root_obj

def copy_instances(model, related_models, collector, data_snapshot, root_obj, field=None, value=None):

    # Store all foreign keys for the model in a list
    fks = []
    for f in model._meta.fields:
        if isinstance(f, ForeignKey) and f.remote_field.model in related_models:
            fks.append(f)

    # Iterate over the instances of the model
    for obj in collector.data[model]:

        # For each of the model's foreign keys check if the related object has been copied
        # and if so, assign its primary key to the current object's related field
        for fk in fks:
            pk_field = f"{fk.name}_id"
            fk_value = getattr(obj, pk_field)

            # Fetch the dictionary containing the old ids
            fk_rel_to = data_snapshot[fk.remote_field.model]

            # If the value exists and is in the dictionary assign it to the object
            if fk_value is not None and fk_value in fk_rel_to:
                dupe_pk = fk_rel_to[fk_value]

                # If the desired pk is None it means that the related object has not been copied yet
                # so the function returns unsuccessful
                if dupe_pk is None:
                    return root_obj, False

                setattr(obj, pk_field, dupe_pk)

        # Store the old pk and save the object without an id to create a shallow copy of the object
        old_pk = obj.id
        obj.id = None

        # `field` and `value` are passed in explicitly by duplicate()
        if field is not None:
            setattr(obj, field, value)

        obj.save()

        # Store the new id in the data snapshot object for potential use on later objects
        data_snapshot[model][old_pk] = obj.id

        if root_obj is None:
            root_obj = obj

    return root_obj, True

I hope it helps :)

The duplication error is just a simple Exception subclass:

class DuplicationError(Exception):
    """
    Is raised when a duplication operation did not succeed

    Attributes:
        model -- The database model that failed
    """

    def __init__(self, model):
        self.error_model = model

    def __str__(self):
        return f'Was not able to duplicate database objects for model {self.error_model}'
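
The queueing idea can be seen in isolation in the sketch below: a model is copied only once everything it depends on has been copied, and otherwise goes back to the end of the queue. Plain strings stand in for model classes, and `DEPENDS_ON` and `copy_order` are hypothetical names. Unlike the code above, this version counts consecutive push-backs to detect a dependency cycle, which is my own variation rather than part of the answer:

```python
from collections import deque


def copy_order(depends_on):
    """Return the models in an order where dependencies come first."""
    pending = deque(depends_on)
    done = []
    stalled = 0  # consecutive models pushed back without progress
    while pending:
        model = pending.popleft()
        if all(dep in done for dep in depends_on[model]):
            done.append(model)  # all FK targets are copied; copy this one
            stalled = 0
        else:
            pending.append(model)  # retry after the others
            stalled += 1
            if stalled > len(pending):
                raise RuntimeError(f"cannot order {list(pending)}")
    return done


# Deliberately listed children-first to exercise the retry path.
DEPENDS_ON = {"Page": ["Book", "Chapter"], "Chapter": ["Book"], "Book": []}
order = copy_order(DEPENDS_ON)
```

Here `Page` and `Chapter` are each pushed back at least once before `Book` is reached, and the final order is parents first.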

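Answer 12 (score: 1)

I tried a few of the answers with Django 2.2 / Python 3.6, and they didn't seem to copy one-to-many and many-to-many related objects. Many of them also hardcoded or otherwise built in knowledge of the data structures.

I wrote a more generic way to do this that handles one-to-many and many-to-many related objects. Comments are included, and I'm looking to improve on it if you have suggestions: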

def duplicate_object(self):
    """
    Duplicate a model instance, making copies of all foreign keys pointing to it.
    There are 3 steps that need to occur in order:

        1.  Enumerate the related child objects and m2m relations, saving in lists/dicts
        2.  Copy the parent object per django docs (doesn't copy relations)
        3a. Copy the child objects, relating to the copied parent object
        3b. Re-create the m2m relations on the copied parent object

    """
    related_objects_to_copy = []
    relations_to_set = {}
    # Iterate through all the fields in the parent object looking for related fields
    for field in self._meta.get_fields():
        if field.one_to_many:
            # One to many fields are backward relationships where many child 
            # objects are related to the parent. Enumerate them and save a list 
            # so we can copy them after duplicating our parent object.
            print(f'Found a one-to-many field: {field.name}')

            # 'field' is a ManyToOneRel which is not iterable, we need to get
            # the object attribute itself.
            related_object_manager = getattr(self, field.name)
            related_objects = list(related_object_manager.all())
            if related_objects:
                print(f' - {len(related_objects)} related objects to copy')
                related_objects_to_copy += related_objects

        elif field.many_to_one:
            # In testing, these relationships are preserved when the parent
            # object is copied, so they don't need to be copied separately.
            print(f'Found a many-to-one field: {field.name}')

        elif field.many_to_many:
            # Many to many fields are relationships where many parent objects
            # can be related to many child objects. Because of this the child
            # objects don't need to be copied when we copy the parent, we just
            # need to re-create the relationship to them on the copied parent.
            print(f'Found a many-to-many field: {field.name}')
            related_object_manager = getattr(self, field.name)
            relations = list(related_object_manager.all())
            if relations:
                print(f' - {len(relations)} relations to set')
                relations_to_set[field.name] = relations

    # Duplicate the parent object
    self.pk = None
    self.save()
    print(f'Copied parent object ({str(self)})')

    # Copy the one-to-many child objects and relate them to the copied parent
    for related_object in related_objects_to_copy:
        # Iterate through the fields in the related object to find the one that 
        # relates to the parent model.
        for related_object_field in related_object._meta.fields:
            if related_object_field.related_model == self.__class__:
                # If the related_model on this field matches the parent
                # object's class, perform the copy of the child object and set
                # this field to the parent object, creating the new
                # child -> parent relationship.
                related_object.pk = None
                setattr(related_object, related_object_field.name, self)
                related_object.save()

                text = str(related_object)
                text = (text[:40] + '..') if len(text) > 40 else text
                print(f'|- Copied child object ({text})')

    # Set the many-to-many relations on the copied parent
    for field_name, relations in relations_to_set.items():
        # Get the field by name and set the relations, creating the new
        # relationships.
        field = getattr(self, field_name)
        field.set(relations)
        text_relations = []
        for relation in relations:
            text_relations.append(str(relation))
        print(f'|- Set {len(relations)} many-to-many relations on {field_name} {text_relations}')

    return self

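Answer 13 (score: 0)

Here is a somewhat simple-minded solution. It does not depend on any undocumented Django APIs. It assumes that you want to duplicate a single parent record along with its child records, grandchild records and so on. You pass in a whitelist of what should actually be duplicated, in the form of a list of the names of the one-to-many relationships on each parent object that point to its child objects. The code assumes that, given this whitelist, the whole tree is self-contained, with no external references to worry about.

This solution does not do anything special for the author field above. I'm not sure whether it would work with it. As others have said, the author field probably shouldn't be repeated across the different model classes.

One more thing about this code: it is truly recursive, in that it calls itself for each new level of descendants.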

from collections import OrderedDict

def duplicate_model_with_descendants(obj, whitelist, _new_parent_pk=None):
    kwargs = {}
    children_to_clone = OrderedDict()
    for field in obj._meta.get_fields():
        if field.name == "id":
            pass
        elif field.one_to_many:
            if field.name in whitelist:
                these_children = list(getattr(obj, field.name).all())
                if field.name in children_to_clone:
                    children_to_clone[field.name] += these_children
                else:
                    children_to_clone[field.name] = these_children
            else:
                pass
        elif field.many_to_one:
            if _new_parent_pk:
                kwargs[field.name + '_id'] = _new_parent_pk
        elif field.concrete:
            kwargs[field.name] = getattr(obj, field.name)
        else:
            pass
    new_instance = obj.__class__(**kwargs)
    new_instance.save()
    new_instance_pk = new_instance.pk
    for ky in children_to_clone.keys():
        child_collection = getattr(new_instance, ky)
        for child in children_to_clone[ky]:
            child_collection.add(duplicate_model_with_descendants(child, whitelist=whitelist, _new_parent_pk=new_instance_pk))
    return new_instance

Usage example:

from django.db import models

class Book(models.Model):
    author = models.ForeignKey('auth.User')

class Chapter(models.Model):
    # author = models.ForeignKey('auth.User')
    book = models.ForeignKey(Book, related_name='chapters')

class Page(models.Model):
    # author = models.ForeignKey('auth.User')
    # book = models.ForeignKey(Book)
    chapter = models.ForeignKey(Chapter, related_name='pages')

WHITELIST = ['books', 'chapters', 'pages']
original_record = Book.objects.get(pk=1)
duplicate_record = duplicate_model_with_descendants(original_record, WHITELIST)
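
To see the recursion without a database, here is a plain-Python sketch of the same shape: copy the scalar values, then recurse only into the whitelisted child collections and attach each copy to the new parent. Dicts stand in for model instances; `duplicate_tree` and the key names are hypothetical and exist only for this illustration:

```python
def duplicate_tree(node, whitelist, new_parent=None):
    """Recursively copy `node`, following only whitelisted child lists."""
    clone = {"parent": new_parent}
    for key, value in node.items():
        if key == "id" or key == "parent":
            continue  # the copy gets a fresh identity and a new parent
        if isinstance(value, list):
            continue  # child collections are handled below
        clone[key] = value
    for key, value in node.items():
        if isinstance(value, list) and key in whitelist:
            clone[key] = [duplicate_tree(child, whitelist, clone) for child in value]
    return clone


book = {
    "id": 1,
    "title": "Django",
    "chapters": [{"id": 10, "title": "Models", "pages": [{"id": 100, "text": "..."}]}],
    "reviews": [{"id": 7, "text": "not whitelisted"}],
}
book_copy = duplicate_tree(book, whitelist=["chapters", "pages"])
```

The `reviews` collection is skipped because it is not in the whitelist, which mirrors how the function above ignores one-to-many relations it was not told about.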

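Answer 14 (score: 0)

I tried Stephen G Tuggy's solution and found it very clever, but unfortunately it doesn't work in some particular situations.

Let's assume the following scenario: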

class FattAqp(models.Model):    
    descr = models.CharField('descrizione', max_length=200)
    ef = models.ForeignKey(Esercizio, ...)
    forn = models.ForeignKey(Fornitore, ...)

class Periodo(models.Model):
    # id used to identify the documents
    # period recorded on the invoice
    data_i_p = models.DateField('data inizio', blank=True)
    idfatt = models.ForeignKey(FattAqp, related_name='periodo')

class Lettura(models.Model):
    mc_i = models.DecimalField(max_digits=7, ...)
    faqp = models.ForeignKey(FattAqp, related_name='lettura')
    an_im = models.ForeignKey('cnd.AnagImm', ..)

class DettFAqp(models.Model):
    imponibile = models.DecimalField(...)
    voce = models.ForeignKey(VoceAqp, ...)
    periodo = models.ForeignKey(Periodo, related_name='dettfaqp')

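In this case, if we try to deep-copy a FattAqp instance, the ef, forn, an_im and voce fields will not be set correctly; idfatt, faqp and periodo, on the other hand, will be.

I solved the problem by adding one more parameter to the function and modifying the code slightly. I tested it with Python 3.6 and Django 2.2. Here it is: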

from collections import OrderedDict

def duplicate_model_with_descendants(obj, whitelist, _new_parent_pk=None, static_fk=None):
    # Names (with the "_id" suffix) of foreign keys whose values are copied unchanged
    static_fk = static_fk or []
    kwargs = {}
    children_to_clone = OrderedDict()
    for field in obj._meta.get_fields():
        if field.name == "id":
            pass
        elif field.one_to_many:
            if field.name in whitelist:
                these_children = list(getattr(obj, field.name).all())

                if field.name in children_to_clone:
                    children_to_clone[field.name] += these_children
                else:
                    children_to_clone[field.name] = these_children
            else:
                pass
        elif field.many_to_one:
            name_with_id = field.name + '_id'
            if _new_parent_pk:
                kwargs[name_with_id] = _new_parent_pk

            # Static foreign keys keep the value they had on the original object
            if name_with_id in static_fk:
                kwargs[name_with_id] = getattr(obj, name_with_id)

        elif field.concrete:
            kwargs[field.name] = getattr(obj, field.name)
        else:
            pass
    new_instance = obj.__class__(**kwargs)
    new_instance.save()
    new_instance_pk = new_instance.pk
    for ky in children_to_clone.keys():
        child_collection = getattr(new_instance, ky)
        for child in children_to_clone[ky]:
            child_collection.add(
                duplicate_model_with_descendants(child, whitelist=whitelist, _new_parent_pk=new_instance_pk, static_fk=static_fk))
    # Return the copy so the recursive call above has something to add
    return new_instance

Usage example:

original_record = FattAqp.objects.get(pk=4)
WHITELIST = ['lettura', 'periodo', 'dettfaqp']
STATIC_FK = ['forn_id', 'ef_id', 'an_im_id', 'voce_id']
duplicate_record = duplicate_model_with_descendants(original_record, WHITELIST, static_fk=STATIC_FK)

Answer 15 (score: 0)

Elaborating on the previous answers:

import win32com.client
o = win32com.client.Dispatch("Excel.Application")
o.Visible = False
o.DisplayAlerts = False
wb = o.Workbooks.Open(r"C:/Users//Documents/test.xlsx")
wb.WorkSheets([1,2,3,4,5]).Select()
pathpdf = r'C:/Users/Documents/test.pdf'
wb.ActiveSheet.ExportAsFixedFormat(0,pathpdf)

Answer 16 (score: 0)

Julio Marins's suggestion works! Thanks!

For Django >= 2.*, this line:

if isinstance(f, ForeignKey) and f.rel.to in related_models:

should be replaced with:

if isinstance(f, ForeignKey) and f.remote_field.model in related_models:
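
If the same snippet has to run on both old and new releases, a small helper can hide the difference. This is only a sketch: as far as I know `remote_field` has been available since Django 1.9, and `related_model` is a hypothetical helper name. The `SimpleNamespace` objects below merely mimic the two attribute layouts so the example runs without Django; they are not real fields:

```python
from types import SimpleNamespace


def related_model(fk):
    """Return the model a ForeignKey points to, on old and new Django."""
    remote = getattr(fk, "remote_field", None)
    if remote is not None:
        return remote.model  # newer attribute layout
    return fk.rel.to  # older attribute layout


# Stand-in objects that mimic the two layouts; not real Django fields.
new_style_fk = SimpleNamespace(remote_field=SimpleNamespace(model="Book"))
old_style_fk = SimpleNamespace(rel=SimpleNamespace(to="Book"))
```

The check in the snippet would then read `isinstance(f, ForeignKey) and related_model(f) in related_models`.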