Question

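Basically, I have a Flask server that needs to perform a lot of comparisons, so I need to parallelize them.
My goal is to create the pool of workers when the server starts.
Later, when the server receives some information, it uses the worker pool to compare this new information against the database.
I have had no problems up to this point, but this is how I handle the workers: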

When the server starts, a function in the script config.py creates the workers and stores them in the global variable worker_pool:

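from multiprocessing import Pool

worker_pool = None  # set by run() when the server starts

def run():
    global worker_pool
    worker_pool = Pool(10)  # pool of 10 worker processes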

In another script, I simply do this:

SimilarityComputed = config.worker_pool.map(self.compute_similarity_parallel, CrashToCompute)

where CrashToCompute is a list of the different comparisons, in the right format.

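I'm not sure that a global variable is the best way to do this, but I couldn't manage to pass the pool as a parameter. In the end, I get this error: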

Traceback (most recent call last):
  File "...\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "...\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 628, in run
    testMethod()
  File "...\src\Tests\CrashLogClassifierTests.py", line 17, in test_UpdateComputeSimilarity
    classifier.compute_all_similarity(db, True)
  File "...\src\CrashLogClassifier.py", line 70, in compute_all_crash_similarity
    self.add_crash_similarity_parallel(db, CrashToCompute, overwrite)
  File "...\src\CrashLogClassifier.py", line 100, in add_crash_similarity_parallel
    SimilarityComputed = config.worker_pool.map(self.compute_similarity_parallel, CrashToCompute)  # Sending the work to the pools, it sends the data by pairs of dictionaries
  File "...\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "...\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
AttributeError: 'SequenceMatcher' object has no attribute 'matching_blocks'

If anyone has a better way to do this, or a fix for this error, I would appreciate it.
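
For reference, here is a minimal, self-contained sketch of the setup described above. It is only an illustration under stated assumptions, not a confirmed fix for the AttributeError: the names get_pool and similarity_ratio are hypothetical, and the string pairs stand in for the real crash data. It differs from the code above in one respect: the function handed to map is a plain module-level function rather than a bound method such as self.compute_similarity_parallel, so the pool does not have to pickle the whole classifier object (and whatever it holds) to send it to the workers. Each worker builds its own SequenceMatcher instead.

```python
# Hypothetical stand-in for config.py plus the calling script, merged into one file.
from difflib import SequenceMatcher
from multiprocessing import Pool

_worker_pool = None  # created lazily, once per server process


def get_pool(processes=10):
    """Return the shared pool, creating it on first use (assumed helper)."""
    global _worker_pool
    if _worker_pool is None:
        _worker_pool = Pool(processes)
    return _worker_pool


def similarity_ratio(pair):
    """Module-level worker: builds a fresh SequenceMatcher inside the worker process."""
    left, right = pair
    return SequenceMatcher(None, left, right).ratio()


if __name__ == "__main__":
    # Placeholder data; the real CrashToCompute list would go here.
    pairs = [("abcd", "abcd"), ("abc", "xyz")]
    pool = get_pool(2)
    try:
        print(pool.map(similarity_ratio, pairs))
    finally:
        pool.close()
        pool.join()
```

The `if __name__ == "__main__":` guard matters on Windows, where worker processes re-import the main module on startup.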

Creating a multiprocessing pool in one Python script and using it in another

0 answers: