I recently ran into a problem with the update_or_create method. Let me give the full picture first.
Model:
    class TransactionPageVisits(models.Model):
        transactionid = models.ForeignKey(
            Transaction,
            on_delete=models.CASCADE,
            db_column='transactionid',
        )
        sessionid = models.CharField(max_length=40, db_index=True)
        ip_address = models.CharField(max_length=39, editable=False)
        user_agent = models.TextField(null=True, editable=False)
        page = models.CharField(max_length=100, null=True, db_index=True)
        method = models.CharField(max_length=20, null=True)
        url = models.TextField(null=False, editable=False)
        created_dtm = models.DateTimeField(auto_now_add=True)

        class Meta(object):
            ordering = ('created_dtm',)
Function:
    def _tracking(self, request, response, **kwargs):
        txn_details = kwargs.get('txn_details')
        data = {
            'sessionid': request.session.session_key,
            'ip_address': get_ip_address(request),
            'user_agent': get_user_agent(request),
            'method': request.method,
            'url': request.build_absolute_uri(),
            'transactionid': txn_details.txn_object,
            'page': kwargs.get('page')
        }
        # Keep updating/creating tracking data to model
        obj, created = TransactionPageVisits.objects.update_or_create(**data)
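Note:
I know that I am not passing a defaults argument to update_or_create(). It was not needed when I wrote the code: I only wanted a new row to be created whenever the combination of values in data did not exist yet. Also, _tracking() lives in a middleware and is called on every request and response.
Everything went fine until today, when I hit this exception: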
      File "trackit.py", line 65, in _tracking
        obj, created = TransactionPageVisits.objects.update_or_create(**data)
      File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/manager.py", line 85, in manager_method
        return getattr(self.get_queryset(), name)(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/query.py", line 488, in update_or_create
        obj = self.get(**lookup)
      File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/query.py", line 389, in get
        (self.model._meta.object_name, num)
    MultipleObjectsReturned: get() returned more than one TransactionPageVisits -- it returned 2!
I noticed that the table contains two entries with exactly the same values (except for created_dtm, because it has auto_now_add=True):
| id | sessionid | ip_address | user_agent | page | method | url | created_dtm | transactionid |
| 32858 | nrq2vwxbtsjp8yoibotpsur0zit5jhoq | xx.xxx.xxx.xxx | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0 | | GET | https://www.example.com/example_url/?jobid=5a9f2acb4cedfd00011c7d5d&transactionid=XXXXXXXXXXXX | 2018-03-06 23:57:00.061280 | XXXXXXXXXXXX |
| 32859 | nrq2vwxbtsjp8yoibotpsur0zit5jhoq | xx.xxx.xxx.xxx | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0 | | GET | https://www.example.com/example_url/?jobid=5a9f2acb4cedfd00011c7d5d&transactionid=XXXXXXXXXXXX | 2018-03-06 23:57:00.062121 | XXXXXXXXXXXX |
Why were duplicate entries created in the table in the first place?
Answer 0 (score: 5)
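update_or_create is prone to race conditions, as the documentation states:
"As described above in get_or_create(), this method is prone to a race-condition which can result in multiple rows being inserted simultaneously if uniqueness is not enforced at the database level."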
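The race can be reproduced without Django. The following is a minimal sketch, not Django's actual code: it uses Python's sqlite3 module, a simplified two-column table, and hypothetical helpers named exists and insert, and it replays by hand the order in which two overlapping requests would run the lookup and the insert.

```python
import sqlite3

# Simplified stand-in for the tracking table. Table and column names are
# illustrative, and there is deliberately no UNIQUE constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_visits ("
    "id INTEGER PRIMARY KEY, sessionid TEXT, url TEXT)"
)

row = ("nrq2vwxb", "https://www.example.com/example_url/")


def exists(values):
    """The 'get' half: is there already a row with these values?"""
    cur = conn.execute(
        "SELECT COUNT(*) FROM page_visits WHERE sessionid = ? AND url = ?",
        values,
    )
    return cur.fetchone()[0] > 0


def insert(values):
    """The 'create' half: insert a new row."""
    conn.execute(
        "INSERT INTO page_visits (sessionid, url) VALUES (?, ?)", values
    )


# Two overlapping requests both run the lookup before either has inserted.
request_a_found = exists(row)
request_b_found = exists(row)

# Neither of them saw a row, so both go on to insert.
if not request_a_found:
    insert(row)
if not request_b_found:
    insert(row)

duplicates = conn.execute(
    "SELECT COUNT(*) FROM page_visits WHERE sessionid = ? AND url = ?", row
).fetchone()[0]
print(duplicates)
```

Once both rows exist, any later lookup that expects a single match for these values finds two, which is the situation get() reports as MultipleObjectsReturned.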
You can use unique_together in the model, as suggested in another answer. I have never tested this, but apparently Django catches the IntegrityError caused by these race conditions.
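unique_together is enforced as a UNIQUE constraint in the database, so the second of two racing inserts fails instead of creating a duplicate. Here is a minimal sketch of that mechanism, again using sqlite3 rather than Django. The table, its columns, and the get_or_insert helper are assumptions made for illustration, and the fallback only resembles in spirit what Django does when it catches the IntegrityError.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_visits ("
    "id INTEGER PRIMARY KEY, sessionid TEXT, url TEXT, "
    "UNIQUE (sessionid, url))"  # database-level uniqueness
)


def get_or_insert(sessionid, url):
    """Insert the row, or return the existing one if it is already there.

    Returns a (row_id, created) tuple.
    """
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO page_visits (sessionid, url) VALUES (?, ?)",
                (sessionid, url),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Another writer inserted the same values first; fetch that row.
        existing = conn.execute(
            "SELECT id FROM page_visits WHERE sessionid = ? AND url = ?",
            (sessionid, url),
        ).fetchone()
        return existing[0], False


first = get_or_insert("nrq2vwxb", "https://www.example.com/example_url/")
second = get_or_insert("nrq2vwxb", "https://www.example.com/example_url/")
total = conn.execute("SELECT COUNT(*) FROM page_visits").fetchone()[0]
print(first, second, total)
```

One caveat: by default SQLite, PostgreSQL and MySQL treat NULL values as distinct inside a UNIQUE constraint, so a nullable column such as page will not block duplicates in rows where it is NULL.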
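Answer 1 (score: 3)
I cannot fully diagnose the problem, because without defaults the call can still behave unexpectedly (in my opinion). That said, I would recommend looking into unique_together to enforce uniqueness in the database, which should keep these fields unique in the future.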
Answer 2 (score: 0)
The source code of get() looks for exactly one object matching the given arguments:
    def get(self, *args, **kwargs):
        """
        Perform the query and return a single object matching the given
        keyword arguments.
        """
        clone = self.filter(*args, **kwargs)
        if self.query.can_filter() and not self.query.distinct_fields:
            clone = clone.order_by()
        num = len(clone)
        if num == 1:
            return clone._result_cache[0]
        if not num:
            raise self.model.DoesNotExist(
                "%s matching query does not exist." %
                self.model._meta.object_name
            )
        raise self.model.MultipleObjectsReturned(
            "get() returned more than one %s -- it returned %s!" %
            (self.model._meta.object_name, num)
        )
The arguments you passed evidently match two objects:
    data = {
        'sessionid': request.session.session_key,
        'ip_address': get_ip_address(request),
        'user_agent': get_user_agent(request),
        'method': request.method,
        'url': request.build_absolute_uri(),
        'transactionid': txn_details.txn_object,
        'page': kwargs.get('page')
    }
Do you want to update both of them? Or do you want to keep the entries unique?
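If the goal is to keep entries unique, the existing duplicates have to be found and removed first, because a unique constraint cannot be added while they are still in the table. Below is a rough sketch of one way to do that, using sqlite3 with a simplified table. The ids 32858 and 32859 come from the question; the third row and the shortened column set are made up for illustration, and keeping the lowest id in each group is just one possible policy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_visits ("
    "id INTEGER PRIMARY KEY, sessionid TEXT, url TEXT)"
)
conn.executemany(
    "INSERT INTO page_visits (id, sessionid, url) VALUES (?, ?, ?)",
    [
        (32858, "nrq2vwxb", "https://www.example.com/example_url/"),
        (32859, "nrq2vwxb", "https://www.example.com/example_url/"),
        (32860, "other", "https://www.example.com/example_url/"),  # made up
    ],
)

# Groups of rows that share the same values in the compared columns.
duplicate_groups = conn.execute(
    "SELECT sessionid, url, COUNT(*), MIN(id) "
    "FROM page_visits "
    "GROUP BY sessionid, url "
    "HAVING COUNT(*) > 1"
).fetchall()

# Keep the row with the lowest id in each group and delete the rest.
conn.execute(
    "DELETE FROM page_visits WHERE id NOT IN ("
    "SELECT MIN(id) FROM page_visits GROUP BY sessionid, url)"
)

remaining_ids = [
    r[0] for r in conn.execute("SELECT id FROM page_visits ORDER BY id")
]
print(duplicate_groups, remaining_ids)
```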