Django QuerySet update_or_create creates duplicate entries

Time: 2018-03-07 12:40:45

Tags: django django-queryset

I recently ran into a problem with the update_or_create method. Let me give the full picture first.

Model:

from django.db import models

# Transaction is another model defined elsewhere in the project.


class TransactionPageVisits(models.Model):
    transactionid = models.ForeignKey(
        Transaction,
        on_delete=models.CASCADE,
        db_column='transactionid',
    )
    sessionid = models.CharField(max_length=40, db_index=True)
    ip_address = models.CharField(max_length=39, editable=False)
    user_agent = models.TextField(null=True, editable=False)
    page = models.CharField(max_length=100, null=True, db_index=True)
    method = models.CharField(max_length=20, null=True)
    url = models.TextField(null=False, editable=False)
    created_dtm = models.DateTimeField(auto_now_add=True)

    class Meta(object):
        ordering = ('created_dtm',)

Function:

def _tracking(self, request, response, **kwargs):
    txn_details = kwargs.get('txn_details')
    data = {
        'sessionid': request.session.session_key,
        'ip_address': get_ip_address(request),
        'user_agent': get_user_agent(request),
        'method': request.method,
        'url': request.build_absolute_uri(),
        'transactionid': txn_details.txn_object,
        'page': kwargs.get('page')
    }

    # Keep updating/creating tracking data to model
    obj, created = TransactionPageVisits.objects.update_or_create(**data)

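Note:

I know I am not passing any defaults argument to update_or_create(). It was not needed when I wrote the code: I only wanted a new row to be created whenever the combination of columns in data is unique. Also, _tracking() lives in a middleware and is called on every request and response.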
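As a rough mental model (not Django's actual source), a call with no defaults can be sketched in plain Python as a lookup on every field, followed by an insert when nothing matches. The `table` list and the stand-in exception class below are made up for illustration only.

```python
class MultipleObjectsReturned(Exception):
    """Stand-in for Django's exception of the same name."""


def update_or_create(table, defaults=None, **lookup):
    """Toy model of the lookup-then-write flow over a list of dict rows."""
    defaults = defaults or {}
    # Every keyword argument is part of the lookup.
    matches = [row for row in table
               if all(row.get(key) == value for key, value in lookup.items())]
    if len(matches) > 1:
        raise MultipleObjectsReturned(
            "get() returned more than one row -- it returned %d!" % len(matches))
    if matches:
        matches[0].update(defaults)  # changes nothing when defaults is empty
        return matches[0], False
    row = dict(lookup, **defaults)
    table.append(row)
    return row, True


table = []
visit = {"sessionid": "abc", "method": "GET", "url": "/example_url/"}
first, created_first = update_or_create(table, **visit)
second, created_second = update_or_create(table, **visit)
```

With empty defaults the second call finds the existing row and changes nothing, so the call behaves like a get-or-create. If two identical rows ever end up in the table, every later call with the same arguments raises the multiple-objects error.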

Everything worked fine until today, when I got this exception:

File "trackit.py", line 65, in _tracking
    obj, created = TransactionPageVisits.objects.update_or_create(**data)
  File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/query.py", line 488, in update_or_create
    obj = self.get(**lookup)
  File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/query.py", line 389, in get
    (self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one TransactionPageVisits -- it returned 2!

I noticed that two entries with exactly the same values had been created in the table (except for created_dtm, because it has auto_now_add=True):

| id    | sessionid                        | ip_address     | user_agent                                                                     | page | method | url                                                                                                    | created_dtm                | transactionid |
| 32858 | nrq2vwxbtsjp8yoibotpsur0zit5jhoq | xx.xxx.xxx.xxx | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0 |      | GET    | https://www.example.com/example_url/?jobid=5a9f2acb4cedfd00011c7d5d&transactionid=XXXXXXXXXXXX | 2018-03-06 23:57:00.061280 | XXXXXXXXXXXX  |
| 32859 | nrq2vwxbtsjp8yoibotpsur0zit5jhoq | xx.xxx.xxx.xxx | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0 |      | GET    | https://www.example.com/example_url/?jobid=5a9f2acb4cedfd00011c7d5d&transactionid=XXXXXXXXXXXX | 2018-03-06 23:57:00.062121 | XXXXXXXXXXXX  |

Why were duplicate entries created in the table in the first place?

3 Answers:

Answer 0: (score: 5)

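update_or_create is prone to a race condition, as stated in the documentation:

"As described above in get_or_create(), this method is prone to a race condition which can result in multiple rows being inserted simultaneously if uniqueness is not enforced at the database level."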
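The interleaving behind that race can be reproduced deterministically. The sketch below uses the standard-library sqlite3 module instead of Django, with a made-up two-column `page_visits` table and hypothetical `exists` and `insert` helpers. Both "requests" run their lookup before either one writes, which is presumably what happened to the two rows created less than a millisecond apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No uniqueness constraint on (sessionid, url), as in the original model.
conn.execute(
    "CREATE TABLE page_visits ("
    " id INTEGER PRIMARY KEY, sessionid TEXT, url TEXT)"
)


def exists(sessionid, url):
    """Return True if a matching row is already stored."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM page_visits WHERE sessionid = ? AND url = ?",
        (sessionid, url),
    )
    return cur.fetchone()[0] > 0


def insert(sessionid, url):
    conn.execute(
        "INSERT INTO page_visits (sessionid, url) VALUES (?, ?)",
        (sessionid, url),
    )


key = ("nrq2vwxb", "/example_url/")

# Both requests perform the lookup before either has written anything.
request_a_found = exists(*key)
request_b_found = exists(*key)

# Each request then acts on its own, now stale, lookup result.
if not request_a_found:
    insert(*key)
if not request_b_found:
    insert(*key)

row_count = conn.execute("SELECT COUNT(*) FROM page_visits").fetchone()[0]
```

Neither lookup sees a row, so both inserts go through and the table ends up with two identical entries.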

You can use unique_together in your model, as suggested in another answer. I have never tested this, but apparently Django catches the IntegrityError caused by these race conditions.
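To show what database-level uniqueness buys you, here is a minimal sketch of the insert, catch, and refetch pattern, again with plain sqlite3 rather than Django. The table layout and the `get_or_insert` helper are assumptions made for the example; in Django the constraint would come from unique_together on the model's Meta.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE clause plays the role of unique_together in this sketch.
conn.execute(
    "CREATE TABLE page_visits ("
    " id INTEGER PRIMARY KEY,"
    " sessionid TEXT NOT NULL,"
    " url TEXT NOT NULL,"
    " UNIQUE (sessionid, url))"
)


def get_or_insert(sessionid, url):
    """Insert the row; if the constraint rejects it, return the existing id."""
    try:
        cur = conn.execute(
            "INSERT INTO page_visits (sessionid, url) VALUES (?, ?)",
            (sessionid, url),
        )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Another writer got there first, so fetch its row instead.
        row = conn.execute(
            "SELECT id FROM page_visits WHERE sessionid = ? AND url = ?",
            (sessionid, url),
        ).fetchone()
        return row[0], False


first_id, first_created = get_or_insert("nrq2vwxb", "/example_url/")
second_id, second_created = get_or_insert("nrq2vwxb", "/example_url/")
row_count = conn.execute("SELECT COUNT(*) FROM page_visits").fetchone()[0]
```

The second insert is rejected by the database, so only one row is ever stored. One caveat: SQLite, PostgreSQL and MySQL treat NULL values as distinct inside a unique constraint, so a nullable column such as page would not block duplicates while it is NULL.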

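Answer 1: (score: 3)

I can't fully diagnose this problem, because without defaults the call can still behave unexpectedly (in my opinion). Still, I would suggest considering unique_together to enforce uniqueness in the database, which should keep this combination of fields unique from now on.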

Answer 2: (score: 0)

The source code goes on to look up a single element matching the given arguments:

def get(self, *args, **kwargs):
    """
    Perform the query and return a single object matching the given
    keyword arguments.
    """
    clone = self.filter(*args, **kwargs)
    if self.query.can_filter() and not self.query.distinct_fields:
        clone = clone.order_by()
    num = len(clone)
    if num == 1:
        return clone._result_cache[0]
    if not num:
        raise self.model.DoesNotExist(
            "%s matching query does not exist." %
            self.model._meta.object_name
        )
    raise self.model.MultipleObjectsReturned(
        "get() returned more than one %s -- it returned %s!" %
        (self.model._meta.object_name, num)
    )

The arguments you passed evidently match two objects:

data = {
    'sessionid': request.session.session_key,
    'ip_address': get_ip_address(request),
    'user_agent': get_user_agent(request),
    'method': request.method,
    'url': request.build_absolute_uri(),
    'transactionid': txn_details.txn_object,
    'page': kwargs.get('page')
}

Do you want to update both of them? Or do you want to keep the entries unique?