I have a Firestore function in Python in which I run a for loop over all the users of one collection, then go into another collection to get some metrics, and update those metrics back in the first collection.
I ran the function, but at some point during execution it broke and gave me this error:
_Rendezvous Traceback (most recent call last)
~\Anaconda3\envs\work\lib\site-packages\google\api_core\grpc_helpers.py in next(self)
78 try:
---> 79 return six.next(self._wrapped)
80 except grpc.RpcError as exc:
~\Anaconda3\envs\work\lib\site-packages\grpc\_channel.py in __next__(self)
363 def __next__(self):
--> 364 return self._next()
365
~\Anaconda3\envs\work\lib\site-packages\grpc\_channel.py in _next(self)
346 else:
--> 347 raise self
348 while True:
_Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "{"created":"@1570660422.708000000","description":"Error received from peer ipv4:216.58.202.234:443","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>
The above exception was the direct cause of the following exception:
DeadlineExceeded Traceback (most recent call last)
<ipython-input-20-05c9cefdafb4> in <module>
----> 1 update_collection__persons()
<ipython-input-19-6e2bdd597a6e> in update_collection__persons()
10 counter_secs = 0
11
---> 12 for person_doc in person_docs:
13 person_dict = person_doc.to_dict()
14 last_updated = person_dict['last_updated']
~\Anaconda3\envs\work\lib\site-packages\google\cloud\firestore_v1\query.py in stream(self, transaction)
766 )
767
--> 768 for response in response_iterator:
769 if self._all_descendants:
770 snapshot = _collection_group_query_response_to_snapshot(
~\Anaconda3\envs\work\lib\site-packages\google\api_core\grpc_helpers.py in next(self)
79 return six.next(self._wrapped)
80 except grpc.RpcError as exc:
---> 81 six.raise_from(exceptions.from_grpc_error(exc), exc)
82
83 # Alias needed for Python 2/3 support.
~\Anaconda3\envs\work\lib\site-packages\six.py in raise_from(value, from_value)
DeadlineExceeded: 504 Deadline Exceeded
I have been looking for a solution and there is not much information. I found a similar issue here: https://github.com/googleapis/google-cloud-python/issues/8933
So I tried using that code, but it didn't work. This is my function:
def update_collection__persons():
    persons = db.collection(u'collections__persons')
    person_docs = persons.stream()

    counter_secs = 0

    for person_doc in person_docs:
        person_dict = person_doc.to_dict()
        last_updated = person_dict['last_updated']
        last_processed = person_dict['last_processed']
        dt_last_updated = datetime(1, 1, 1) + timedelta(microseconds=last_updated/10)
        dt_last_processed = datetime(1, 1, 1) + timedelta(microseconds=last_processed/10)

        if dt_last_processed < dt_last_updated:
            orders = db.collection(u'collection__orders').where(u'email', u'==', person_dict['email'])
            orders_docs = orders.stream()
            sum_price = 0
            count = 0
            date_add_list = []
            for order_doc in orders_docs:
                order_dict = order_doc.to_dict()
                sum_price += order_dict['total_price']
                count += 1
                date_add_list.append(order_dict['dateAdded'])
            if count > 0:
                data = {'metrics': {'LTV': sum_price,
                                    'AOV': sum_price/count,
                                    'Quantity_orders': count,
                                    'first_order_date': min(date_add_list),
                                    'last_order_date': max(date_add_list)},
                        'last_processed': int((datetime.utcnow() - datetime(1, 1, 1)).total_seconds() * 10000000)}
                db.collection(u'collection__persons').document(person_dict['email']).set(data, merge=True)
I created counter_secs just to see whether the function always breaks on the same query, and it does not.
Also, after running the function, if I check some random users I can see that their data was already updated, so it works correctly but breaks at some point.
Answer 0 (score: 2)
persons.stream()
has a 60-second timeout. Try fetching all the documents before processing them, instead of processing each one as it streams:
person_docs = [snapshot for snapshot in persons.stream()]
If the number of documents exceeds what can be fetched within 60 seconds, try a recursive function like in this answer.
The same goes for the orders:
orders_docs = [snapshot for snapshot in orders.stream()]
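A minimal sketch of that paginated fetch, assuming the client's `limit()` and `start_after()` cursor methods (the helper name `fetch_all` and the page size are illustrative, not part of the Firestore API):

```python
def fetch_all(query, page_size=500):
    """Yield every document snapshot one page at a time, so no single
    streaming RPC comes anywhere near the 60-second deadline."""
    docs = list(query.limit(page_size).stream())
    while docs:
        for doc in docs:
            yield doc
        # Cursor into the next page, starting after the last snapshot seen.
        docs = list(query.start_after(docs[-1]).limit(page_size).stream())
```

The outer loop would then be `for person_doc in fetch_all(persons): ...` instead of iterating `persons.stream()` directly. Note that `start_after` with a snapshot may require an explicit `order_by` on the query, depending on the client version.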
Answer 1 (score: 0)
I ran into the exact same problem when fetching all documents to convert them to JSON.
This is what I did:
def load_documents(self, path):
    collection = self.db
    nodes = path.split("/")

    # Walk the path, alternating collection / document segments.
    for i, node in enumerate(nodes):
        if i % 2 == 0:
            collection = collection.collection(node)
        else:
            collection = collection.document(node)

    stream = collection.stream()
    for doc in stream:
        print("* Fetching document: {}".format(doc.get("filename")))
        self.memes.append(self._fetch_doc(doc))

def _fetch_doc(self, doc):
    try:
        return {
            "caption": doc.get("caption"),
            "filename": doc.get("filename"),
            "url": doc.get("url")
        }
    except Exception:
        # Retry by fetching the same document again.
        return self._fetch_doc(doc)
If an exception is raised, I fetch the document again recursively.
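A safer variant of that retry bounds the number of attempts instead of recursing indefinitely on a persistent error (the helper name and attempt count are illustrative):

```python
def fetch_with_retry(fetch, doc, attempts=3):
    """Call fetch(doc), retrying up to `attempts` times on any exception,
    then re-raise the last error instead of looping forever."""
    last_exc = None
    for _ in range(attempts):
        try:
            return fetch(doc)
        except Exception as exc:
            last_exc = exc
    raise last_exc
```

With this, a transient `DeadlineExceeded` gets retried a few times, but a document that always fails surfaces its error instead of recursing without a base case.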
Answer 2 (score: 0)
After following @juan-lara's solution I was still facing this issue; converting the documents to dicts finally worked for me.
person_docs = [snapshot.to_dict() for snapshot in persons.stream()]