Retry part of a loop after a specified time

Asked: 2015-01-28 18:00:19

Tags: python python-2.7 timeout signals subprocess

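Is there a subprocess-friendly way to restart part of a loop if it has not completed after a specified time?

I have a script, say main.py, that uses subprocess.Popen() to spawn X instances of other worker scripts. Each "worker" essentially checks its own queue, hosted on Azure, for jobs (each worker and queue serves a different function and a different kind of job).

The problem is that the andy worker (andy.py) sometimes hangs in one part of a while loop that calls a function. I have tried using SIGALRM to interrupt whatever it is doing, which raises an exception that is handled with a bare pass. signal.alarm() aborts the attempt, which in turn makes the worker simply retry the same search date, because the call sits inside the while loop.

The trouble is that when the alarm fires, it sometimes also seems to affect a completely different running subprocess and interrupt what that one is doing. What I want is this: if the function takes more than X seconds to complete, try running the function again.

Here is an example of what the code looks like (granted, I have replaced the real code with something anyone can run, and I have removed everything that runs workers other than andy):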

main.py

import subprocess as sp
import sys
import time
import datetime
import thread


max_workers = {'andy': 10}


def check():
    workers = {'andy': {}}

    while True:
        for worker, instances in workers.items():
            while len(instances) < max_workers[worker]:
                process = sp.Popen(['python', 'workers/%s.py' % worker], shell=False)
                workers[worker][process] = process.pid
        for worker, instances in workers.items():
            for process, pid in instances.items():
                if process.poll() is not None:
                    del workers[worker][process]


def time_check():
    global max_workers
    start = datetime.time(hour=07, minute=05)
    end = datetime.time(hour=23, minute=00)
    while 1:
        now = datetime.datetime.now().time().replace(second=0, microsecond=0)
        if now == start:
            time.sleep(60)
            max_workers['andy'] = 7
        elif now == end:
            time.sleep(60)
            max_workers['andy'] = 0
        else:
            time.sleep(1)


if __name__ == "__main__":
    while 1:
        try:
            thread.start_new_thread(check, ())
            thread.start_new_thread(time_check, ())
        except KeyboardInterrupt:
            sys.exit(0)

andy.py

import datetime
import otas
import json
import time
import signal


def alarm_handler():
    pass


def start():

    resort_ids = 'Los Angeles', 'New York', 'Chicago', 'Miami'
    start_date = datetime.datetime.now()
    end_date = start_date + datetime.timedelta(days=10)
    ota = otas.Expedia(headless=False)
    signal.signal(signal.SIGALRM, alarm_handler)
    for resort_id in resort_ids:
        search_date = start_date
        while search_date < end_date:
            signal.alarm(15)
            try:
                data = ota.search_by_date(resort=resort_id, checkin=search_date)
            except:
                pass
            else:
                try:
                    print data
                except TypeError:
                    pass
                search_date += datetime.timedelta(days=1)


if __name__ == '__main__':
    start()

otas.py

from selenium import webdriver
import datetime


class Expedia:
    def __init__(self, headless=True):
        if headless is True:
            self.driver = webdriver.PhantomJS()
        else:
            self.driver = webdriver.Firefox()


    def search_by_date(self, resort, checkin, flexibility=4, nights=3):
        driver = self.driver
        try:
            driver.get(
                'http://www.expedia.com/Hotel-Search?#&destination={0}&startDate={1}&endDate={2}'.format(
                    resort, checkin.strftime("%m/%d/%Y"), (checkin + datetime.timedelta(days=1)).strftime("%m/%d/%Y")
                )
            )
            return driver.page_source
        except Exception, e:
            return e

EDIT4: Rewrote the question and the code so that they are reproducible and clearer.

2 Answers:

Answer 0 (score: 0)

I don't have your process-specific code, but here is what I tried, and it seems to work:

sp_worker.py

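import time
import signal


class Interrupted(Exception):
    pass


def on_alarm(signum, frame):
    # Without a handler, SIGALRM terminates the process and no exception is raised.
    raise Interrupted()


signal.signal(signal.SIGALRM, on_alarm)
try:
    time.sleep(60)
    print("finished")
except Interrupted:
    print("got interrupted")

__main__

import subprocess as sp
import signal
import time

proc1 = sp.Popen(['python', 'sp_worker.py'], stdout=sp.PIPE)
proc2 = sp.Popen(['python', 'sp_worker.py'], stdout=sp.PIPE)
time.sleep(1)  # give both workers time to install their handlers
proc1.send_signal(signal.SIGALRM)  # only proc1 receives the signal
print(proc1.communicate())  # (b'got interrupted\n', None)
proc2.communicate()  # blocks: proc2 was not interrupted and sleeps the full 60 s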
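
For the retry behaviour the question asks about, one option is to keep the alarm entirely inside the worker process. What follows is only a minimal sketch under stated assumptions, not the asker's code: `CallTimedOut` and `call_with_retry` are hypothetical names invented here, the platform is assumed to be Unix (signal.alarm is not available on Windows), and the helper is assumed to be called from the main thread, since Python only allows signal.signal() there. Note that Python calls a signal handler with two arguments, the signal number and the current stack frame, so the handler has to accept both.

```python
import signal
import time


class CallTimedOut(Exception):
    """Raised by the alarm handler when the time limit expires."""


def _on_alarm(signum, frame):
    # Python passes the signal number and the current stack frame.
    raise CallTimedOut()


def call_with_retry(func, seconds, max_attempts=3):
    """Call func(); if it runs longer than `seconds`, abandon it and retry.

    `seconds` must be a whole number because signal.alarm() takes an int.
    """
    signal.signal(signal.SIGALRM, _on_alarm)
    for attempt in range(1, max_attempts + 1):
        signal.alarm(seconds)  # arm the timer for this attempt only
        try:
            return func()
        except CallTimedOut:
            pass  # this attempt hung; fall through to the next one
        finally:
            signal.alarm(0)  # always disarm so a stale alarm cannot fire later
    raise CallTimedOut("no result after %d attempts" % max_attempts)
```

In andy.py this could be used as `data = call_with_retry(lambda: ota.search_by_date(resort=resort_id, checkin=search_date), 15)`. Catching only `CallTimedOut` rather than using a bare `except` keeps real errors visible, and disarming the alarm in `finally` avoids a leftover timer interrupting unrelated code in the same process. The alarm can only interrupt Python-level code, so a call blocked inside a C extension may not be interrupted until it returns.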

Answer 1 (score: 0)

There are several problems in your code. Here is the fragment that starts an unlimited number of threads:

#XXX BROKEN, DO NOT DO IT
while 1:
    try:
        thread.start_new_thread(check, ())
        thread.start_new_thread(time_check, ())
    except KeyboardInterrupt:
        sys.exit(0)

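OS resources are finite. If memory runs out before any other resource does (each thread needs its own stack), some operating systems may start killing unrelated processes (the OOM killer).

Your intent was probably to run the two functions repeatedly and concurrently: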

from multiprocessing.pool import ThreadPool

pool = ThreadPool(2)
while True:
    try:
        r = pool.apply_async(check)
        pool.apply(time_check) # block until time_check() returns
        r.get() # block until check() returns
    except KeyboardInterrupt:
        break

Wrap the check() and time_check() functions so that they catch and log all exceptions.
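
One possible way to do that wrapping is a small decorator. This is a hedged sketch: `log_exceptions` is a hypothetical name chosen for illustration, and it relies only on the standard logging and functools modules.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def log_exceptions(func):
    """Return a wrapper that logs any exception from func and returns None."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception records the message together with the traceback.
            logger.exception("%s failed", func.__name__)
    return wrapper
```

With this in place, the pool calls would become `pool.apply_async(log_exceptions(check))` and `pool.apply(log_exceptions(time_check))`. Catching `Exception` rather than using a bare `except` leaves `KeyboardInterrupt` alone, so the loop above can still be stopped with Ctrl+C.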