Ensuring that only one worker launches the apscheduler event in a Pyramid web app running multiple workers

Date: 2013-04-17 06:48:53

标签: django pyramid wsgi gunicorn apscheduler

We have a web app built with Pyramid and served through gunicorn + nginx. It runs with 8 worker threads/processes.

We need to run scheduled jobs, and we chose apscheduler. Here is how we launch it:

from apscheduler.events import EVENT_JOB_EXECUTED, EVENT_JOB_ERROR
from apscheduler.scheduler import Scheduler

rerun_monitor = Scheduler()
rerun_monitor.start()
rerun_monitor.add_interval_job(job_to_be_run,
                               seconds=JOB_INTERVAL)

The problem is that all of gunicorn's worker processes pick up the scheduler, so every job runs multiple times. We tried implementing a file lock, but it does not seem like a good enough solution. What would be the best way to make sure that, at any given time, only one of the worker processes picks up the scheduled event and no other worker picks it up until the next JOB_INTERVAL?

The solution needs to work even with mod_wsgi, in case we decide to switch to apache2 + mod_wsgi later. It also needs to work with the single-process development server, which is waitress.

Update from the bounty sponsor

I'm facing the same problem described by the OP, just with a Django app. I'm fairly certain this detail does not change much with respect to the original question. For this reason, and to gain more visibility, I have also tagged this question with django.

3 Answers:

Answer 0 (score: 19)

Because Gunicorn starts with 8 workers (in your example), it forks the app 8 times into 8 processes. These 8 processes are forked from the Master process, which monitors each of their statuses and has the ability to add or remove workers.

Each process gets a copy of your APScheduler object, which initially is an exact copy of the Master process's APScheduler. The result is that each of the n workers (processes) executes each job, so every job runs a total of n times.

A hack around this is to run gunicorn with the following options:

env/bin/gunicorn module_containing_app:app -b 0.0.0.0:8080 --workers 3 --preload

The --preload flag tells Gunicorn to "load the app before forking the worker processes". By doing so, each worker is "given a copy of the app, already instantiated by the Master, rather than instantiating the app itself". This means the following code only executes once in the Master process:

rerun_monitor = Scheduler()
rerun_monitor.start()
rerun_monitor.add_interval_job(job_to_be_run,
                               seconds=JOB_INTERVAL)

Additionally, we need to set the jobstore to be anything other than :memory:. This way, although each worker is its own independent process unable to communicate with the other 7, by using a local database (rather than memory) we guarantee a single point of truth for CRUD operations on the jobstore.

from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

# Persist jobs in a local SQLite file instead of the default in-memory store.
rerun_monitor = BlockingScheduler(
    jobstores={'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')})
rerun_monitor.add_job(job_to_be_run, 'interval', seconds=JOB_INTERVAL)
rerun_monitor.start()  # BlockingScheduler.start() blocks the caller; see the fix below

Lastly, we want to use the BackgroundScheduler because of its implementation of start(). When we call start() on a BackgroundScheduler, a new thread is spun up in the background that is responsible for scheduling and executing jobs. This is significant because, remember, thanks to our --preload flag from step (1) we execute start() only once, in the Master Gunicorn process. By definition, forked processes do not inherit the threads of their parent, so no worker runs the BackgroundScheduler thread.

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

rerun_monitor = BackgroundScheduler(
    jobstores={'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')})
rerun_monitor.add_job(job_to_be_run, 'interval', seconds=JOB_INTERVAL)
rerun_monitor.start()  # spawns a daemon thread and returns immediately

As a result of all of this, every Gunicorn worker has an APScheduler that has been tricked into a "STARTED" state, but actually isn't running, because it drops the threads of its parent! Each instance is also capable of updating the jobstore database, just not executing any jobs!
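To illustrate that last point, here is a minimal, hypothetical Pyramid view (the route name, job id, and renderer are made up for this sketch) showing a worker modifying jobs through the shared jobstore while only the Master's scheduler thread actually executes them:

from pyramid.view import view_config

@view_config(route_name='schedule_job', renderer='json')
def schedule_job(request):
    # Any worker can write to the shared SQLAlchemy jobstore; the job
    # itself is only ever executed by the Master's scheduler thread.
    rerun_monitor.add_job(job_to_be_run, 'interval',
                          seconds=JOB_INTERVAL, id='rerun-job',
                          replace_existing=True)
    return {'status': 'scheduled'}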

Check out flask-APScheduler for a quick way to run APScheduler inside a web server (like Gunicorn) and enable CRUD operations on each job.
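For reference, a minimal Flask-APScheduler setup might look like the following; the config key and the task decorator reflect that library's documented API, but treat this as a sketch rather than a drop-in solution:

from flask import Flask
from flask_apscheduler import APScheduler

app = Flask(__name__)
app.config['SCHEDULER_API_ENABLED'] = True  # expose built-in REST endpoints under /scheduler

scheduler = APScheduler()
scheduler.init_app(app)
scheduler.start()

# Register a job declaratively; the id and interval are illustrative.
@scheduler.task('interval', id='demo_job', seconds=30)
def demo_job():
    print('job ran')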

Answer 1 (score: 13)

I found a fix that worked for a Django project. I just bind a TCP socket the first time the scheduler starts and check for it afterwards. I think the following code can work for you as well, with minor adjustments.

import socket

try:
    # Bind a well-known local port; only the first process to get here succeeds.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 47200))
except socket.error:
    print("!!!scheduler already started, DO NOTHING")
else:
    from apscheduler.schedulers.background import BackgroundScheduler
    scheduler = BackgroundScheduler()
    scheduler.start()
    print("scheduler started")
    # Keep a reference to `sock` for the life of the process so the port stays bound.

Answer 2 (score: 0)

Short answer: you can't do it properly without consequences.

I'll use Gunicorn as an example, but it is basically the same for uWSGI. There are various hacks when running multiple processes, for example:

  1. Use the --preload option
  2. Use the on_starting hook to start the APScheduler background scheduler
  3. Use the when_ready hook to start the APScheduler background scheduler (a config sketch follows this list)
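For concreteness, option 3 could look like this in a gunicorn.conf.py; my_app.scheduler is a hypothetical module exposing a module-level BackgroundScheduler:

# gunicorn.conf.py
from my_app.scheduler import scheduler  # hypothetical module

workers = 8

def when_ready(server):
    # Runs once in the master process after the sockets are set up,
    # so the scheduler thread lives only in the master.
    scheduler.start()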

These work to some extent, but you may run into the following problems:

  1. workers timing out frequently
  2. the scheduler hanging when there are no jobs: https://github.com/agronholm/apscheduler/issues/305

APScheduler is designed to run in a single process, where it has complete control over the process of adding jobs to the jobstore. It uses threading.Event's wait() and set() methods to coordinate. If these are run by different processes, the coordination won't work.
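A tiny, Unix-only demonstration of why this breaks across processes: after a fork, each process has its own copy of the Event, so a set() in one process never wakes a wait() in the other.

import os
import threading

wakeup = threading.Event()

pid = os.fork()
if pid == 0:
    # Child: plays the process that adds a job.
    wakeup.set()  # flips only the child's copy of the flag
    os._exit(0)
else:
    # Parent: plays the scheduler loop waiting for work.
    print(wakeup.wait(timeout=1))  # prints False -- it is never woken
    os.waitpid(pid, 0)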

It is possible to run it with Gunicorn in a single process, though:

  1. Use only one worker process
  2. Use the post_worker_init hook to start the scheduler; this makes sure the scheduler runs only in the worker process, not in the master process (see the sketch after this list)
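As a sketch, that single-worker setup could look like this in a gunicorn.conf.py (again assuming a hypothetical my_app.scheduler module):

# gunicorn.conf.py
from my_app.scheduler import scheduler  # hypothetical module

workers = 1  # exactly one worker, hence exactly one running scheduler

def post_worker_init(worker):
    # Called in the worker process after it has been initialized,
    # so the scheduler never starts in the master.
    scheduler.start()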

The author has also pointed out that it is not possible to share the jobstore among multiple processes: https://apscheduler.readthedocs.io/en/stable/faq.html#how-do-i-share-a-single-job-store-among-one-or-more-worker-processes He also provides a solution using RPyC.

It is entirely doable to wrap APScheduler with a REST interface, and you may want to consider serving it as a standalone app with one worker. In other words, if you have other endpoints, put them in another app where you can use multiple workers. A sketch of such a standalone service follows.
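A minimal sketch of such a standalone service; Flask and the route shape are illustrative assumptions, not part of the original answer:

# scheduler_service.py -- run with exactly one worker, e.g.:
#   gunicorn scheduler_service:app --workers 1
from flask import Flask, jsonify, request
from apscheduler.schedulers.background import BackgroundScheduler

app = Flask(__name__)
scheduler = BackgroundScheduler()
scheduler.start()

def job_to_be_run():
    print('job ran')

@app.route('/jobs', methods=['POST'])
def add_job():
    # Other apps (which may run many workers) call this endpoint
    # instead of talking to APScheduler directly.
    seconds = int(request.json['seconds'])
    job = scheduler.add_job(job_to_be_run, 'interval', seconds=seconds)
    return jsonify({'id': job.id})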