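I have a luigi data pipeline. It works fine when I run it with a single worker. With more than one worker, however, it dies (unexpectedly, with exit code -11) at a stage that has two dependencies. The code is fairly complex, so it is hard to give a minimal example. The gist is that I do the following with gensim: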
For some reason, even though (1) and (2) have completed, step (3) crashes whenever I use more than one worker...
Any help is much appreciated!
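Edit: Here is a sample of the log output. TrainLDA is task (3). Two later tasks depend on TrainLDA. All earlier tasks completed correctly. I replaced the TrainLDA arguments with ... to make the output more readable. The extra lines are just print statements we use to see what is going on.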
DEBUG: Pending tasks: 3
DEBUG: Asking scheduler for work...
INFO: [pid 28851] Worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825) running TrainLDA(...)
INFO: Done
INFO: There are no more tasks to run at this time
INFO: TrainLDA(...) is currently run by worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825)
==============================
Corriendo LDA de spanish con nivel de limpieza stopwords
==============================
Número de tópicos: 40
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: TrainLDA(...) is currently run by worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825)
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: TrainLDA(...) is currently run by worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825)
INFO: Worker task TrainLDA(...) died unexpectedly with exit code -11
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: There are 2 pending tasks possibly being run by other workers
INFO: There are 2 pending tasks unique to this worker
INFO: Worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825) was stopped. Shutting down Keep-Alive thread
Answer 0 (score: 0)
Apparently this happens on Mac OS X when a forked process calls into _scproxy.so, which triggers libdispatch, and that is not safe after a fork.
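For context, a negative exit code reported for a child process means the process was killed by a signal, so -11 corresponds to signal 11, which is SIGSEGV on macOS and Linux. A minimal sketch for decoding such codes follows; describe_exit_code is a hypothetical helper written for illustration, not part of Luigi.

```python
import signal


def describe_exit_code(code):
    """Return the signal name for a negative exit code, or None otherwise.

    Hypothetical helper for illustration; not part of Luigi.
    """
    if code < 0:
        # A negative code is the negated number of the signal that killed the process.
        return signal.Signals(-code).name
    return None


print(describe_exit_code(-11))  # SIGSEGV on macOS and Linux
```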
A possible workaround is to invoke Luigi in an environment where no_proxy is set:
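import os
import subprocess
# Start Luigi with no_proxy set in the child's environment (luigi_command is your Luigi command line).
luigi_env = os.environ.copy()
luigi_env['no_proxy'] = "'*'"
subprocess.Popen(luigi_command, env=luigi_env)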
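To check that the variable really reaches the child process, something like the following sketch may help. It makes several assumptions: env_with_no_proxy is a hypothetical helper, the command is a stand-in that only prints the variable (replace it with your actual Luigi command line), and a plain * is used as the value rather than the quoted string above.

```python
import os
import subprocess
import sys


def env_with_no_proxy(base_env=None, value="*"):
    """Return a copy of the environment with no_proxy set.

    Hypothetical helper; the plain "*" default is an assumption.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["no_proxy"] = value
    return env


# Stand-in for the real Luigi command: a child Python process that
# prints the no_proxy value it received.
command = [sys.executable, "-c", "import os; print(os.environ['no_proxy'])"]

result = subprocess.run(
    command,
    env=env_with_no_proxy(),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # *
```

The helper copies the environment instead of modifying os.environ, so the parent process keeps its own proxy settings and only the launched command sees the change.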