Repeatedly checking an array for new elements and triggering an action

Time: 2019-01-14 19:04:45

Tags: python arrays multithreading file

I am running a script that generates a new file every 60 seconds and stores it in an array:

my_files = [file1, file2, ..., filen]

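Because the array keeps growing, I need to keep checking it for new files. Whenever new files appear, I need to add them to a "queue", where they wait to be processed by another function.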

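What is the best way to do this? I have looked at watchdog, but it seems far too complex for what I need. I also tried the following class, which lets me check the folder repeatedly, but since it is not a list, I cannot iterate over the files it "picks up":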

import threading
import time


class RepeatedTimer(object):

    def __init__(self, interval, function, *args, **kwargs):
        self._timer = None
        self.interval = interval
        self.function = function
        self.args = args
        self.kwargs = kwargs
        self.is_running = False
        self.next_call = time.time()
        self.start()

    def _run(self):
        self.is_running = False
        self.start()
        self.function(*self.args, **self.kwargs)

    def start(self):
        if not self.is_running:
            self.next_call += self.interval
            self._timer = threading.Timer(self.next_call - time.time(), self._run)
            self._timer.start()
            self.is_running = True

    def stop(self):
        self._timer.cancel()
        self.is_running = False

The following does not work, so I cannot get at the files:

my_files = [RepeatedTimer(30, pick_testing_files)]

while True:
    if len(my_files) > 0:
        files_to_be_processed = my_files.pop()
        threading.Thread(target=test, args=(files_to_be_processed)).start()

1 Answer:

Answer 0 (score: 0)

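Something like this seems to work (at least according to my own limited testing). I added a threading.Lock to protect the shared resource my_files from concurrent access.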

import random
import string
import threading
import time


class RepeatedTimer:
    def __init__(self, interval, function, *args, **kwargs):
        self._timer = None
        self.interval = interval
        self.function = function
        self.args = args
        self.kwargs = kwargs
        self.is_running = False
        self.next_call = time.time()
        self.start()

    def _run(self):
        self.is_running = False
        self.start()
        self.function(*self.args, **self.kwargs)

    def start(self):
        if not self.is_running:
            self.next_call += self.interval
            self._timer = threading.Timer(self.next_call - time.time(), self._run)
            self._timer.start()
            self.is_running = True

    def stop(self):
        self._timer.cancel()
        self.is_running = False


def pick_testing_files(filenames, lock):
    newfiles = []
    with lock:
        while True:
            try:
                filename = filenames.pop()
            except IndexError:  # List is empty.
                break
            newfiles.append(filename)

    print('files retrieved:', newfiles)

def random_filename():
    letters = []
    for _ in range(random.randint(4, 8)):
        letters.append(random.choice(string.ascii_lowercase))
    return ''.join(letters)


my_files = []
filelist_lock = threading.Lock()

watcher = RepeatedTimer(5, pick_testing_files, my_files, filelist_lock)
# watcher.start()  # Not needed because RepeatedTimer instances start themselves.

for _ in range(30):  # Run test for 30 seconds.
    # Add some filenames to my_files list.
    with filelist_lock:
        for _ in range(random.randint(1, 4)):  # Generate some filenames.
            my_files.append(random_filename())
    print('new files added')
    time.sleep(1)  # Wait a little before adding more filenames.

watcher.stop()  # RepeatedTimer defines stop(), not cancel().
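
As an alternative to a plain list guarded by a lock, the "queue" mentioned in the question could be the standard library's queue.Queue, which does its own locking. The following is only a minimal sketch, not the asker's actual setup: process() is a hypothetical stand-in for the real processing function, run_demo() and worker() are names made up for illustration, and the file names are supplied up front instead of arriving every 60 seconds.

```python
import queue
import threading


def process(filename):
    """Hypothetical stand-in for the real per-file processing function."""
    return filename.upper()


def worker(file_queue, results):
    """Take filenames off the queue until the sentinel None arrives."""
    while True:
        filename = file_queue.get()  # Blocks until an item is available.
        if filename is None:  # Sentinel value: no more work is coming.
            break
        results.append(process(filename))


def run_demo(filenames):
    file_queue = queue.Queue()  # Thread-safe, so no explicit lock is needed.
    results = []  # Written only by the worker, read only after join().

    consumer = threading.Thread(target=worker, args=(file_queue, results))
    consumer.start()

    # Producer side: in the real script, a put() would happen whenever
    # a new file has been generated.
    for name in filenames:
        file_queue.put(name)

    file_queue.put(None)  # Tell the worker to stop.
    consumer.join()
    return results


if __name__ == '__main__':
    print(run_demo(['file1', 'file2', 'file3']))
```

With this layout the consumer simply blocks in get() until something arrives, so no timer is needed to poll for new entries. Note also that args must be a tuple: for a single argument it would be written args=(item,), with the trailing comma.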