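I wrote a program that runs forever using a while loop, with the sleep interval supplied by a .cfg file read by the config parser.
It ran fine for about twenty-six days. Then it stopped doing its work, but of course it stayed up because it was started as a service. Also, at the time I had not thought of wrapping the main loop in a try/except block and logging with import syslog.
The code sample below contains only the main block. I left out the rest because most of it is just a typical task queue and result queue setup built on the multiprocessing module.
What could cause this behavior? Are my network device objects being garbage collected because they are not instantiated inside the while loop? Is this simply a bad way to write/design a long-running Python program?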
if __name__ == '__main__':
    #
    # Hold results in the multiprocessing queues
    #
    monitor_results = ''
    #
    # Our task is to monitor and this will hold our tasks
    #
    monitors = []
    #
    # list of network devices represented as
    # objects that will be monitored
    #
    device_list = []
    #
    # The addresses of the devices are provided by the config parser's
    # .cfg file
    #
    device_addresses = list(config['monitored']['devices'].split(','))
    for address in device_addresses:
        password = get_password(address)
        device_list.append(Device(address, 'admin', password))
    for d in device_list:
        path = ['sys', 'clock']
        request = Transport(headers, timeout=20)
        request.http.credentials.add(user, passwd)
        request.url = DeviceUri(d.mgmt_address, path).uri
        monitors.append(request)
    while True:
        tasks = multiprocessing.JoinableQueue(maxsize=len(monitors) + 1)
        results = multiprocessing.Queue()
        num_consumers = multiprocessing.cpu_count() * 2
        consumers = [Consumer(tasks, results) for i in range(num_consumers)]
        for w in consumers:
            w.start()
        for monitor in monitors:
            tasks.put(Monitor(monitor))
        for i in range(num_consumers):
            tasks.put(None)
        tasks.join()
        count = 0
        while not results.empty():
            result = results.get()
            if result is not None:
                monitor_results += result + '\n'
                count += 1
        if count > 0:
            mail_result = send_email(monitor_results)
        #
        # reset the monitor results or it will keep sending all previous results
        #
        monitor_results = ''
        time.sleep(poll_interval)
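The try/except wrapping and syslog logging mentioned in the question might look roughly like the sketch below. It is not the original program: `run_forever`, `poll_once`, and `max_cycles` are hypothetical names introduced here, and `poll_once` stands in for one pass of the queue/consumer work shown above. The `syslog` module is only available on Unix-like systems.

```python
import syslog
import time
import traceback


def run_forever(poll_once, poll_interval, max_cycles=None):
    """Call poll_once() repeatedly and log any exception to syslog.

    poll_once and max_cycles are hypothetical names for this sketch;
    max_cycles exists only so the loop can be stopped in a demo or test.
    Returns the number of cycles that raised an exception.
    """
    syslog.openlog(ident='device-monitor', facility=syslog.LOG_DAEMON)
    failures = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            poll_once()
        except Exception:
            failures += 1
            # Write the full traceback, one line per syslog entry, so a
            # failed cycle leaves evidence instead of ending the loop.
            for line in traceback.format_exc().splitlines():
                syslog.syslog(syslog.LOG_ERR, line)
        cycles += 1
        time.sleep(poll_interval)
    return failures
```

With something like this in place, an error such as a missing log file would show up in the system log with a traceback, and the loop would move on to the next cycle rather than exiting.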
Answer 0 (score: 0)
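I found the answer. Because I was rotating the generated logs through logrotate.d, the log file was being removed rather than truncated. When the timer restarted the loop, the program could not find the log file to write to and exited. I therefore reconfigured logrotate.d to use 'copytruncate' instead of create.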
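As an alternative to changing the logrotate configuration, a program that logs through Python's standard `logging` module can use `logging.handlers.WatchedFileHandler`, which reopens the file when it notices that the path has been moved or deleted. This is only a sketch, and it assumes the program uses the `logging` module, which the post does not state. The logger name, the `make_logger` helper, and the temporary path are hypothetical. This handler is intended for Unix-like systems.

```python
import logging
import logging.handlers
import os
import tempfile


def make_logger(log_path):
    """Return a logger that reopens log_path if it is moved or deleted.

    make_logger and the logger name are hypothetical, for illustration.
    """
    logger = logging.getLogger('device-monitor')
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.handlers.WatchedFileHandler(log_path)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    return logger


if __name__ == '__main__':
    log_path = os.path.join(tempfile.mkdtemp(), 'monitor.log')
    log = make_logger(log_path)
    log.info('before rotation')
    # Simulate what a logrotate "create" rotation does: move the file away.
    os.rename(log_path, log_path + '.1')
    # The handler detects the change and opens a fresh file at log_path.
    log.info('after rotation')
```

The trade-off is that `copytruncate` needs no code change but can lose lines written between the copy and the truncate, while a watched handler keeps the default rotation behaviour and handles it inside the program.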