How can I cache a function's result and update the cached value on every call?

Time: 2019-08-23 02:11:02

Tags: python multithreading asynchronous caching

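I'm looking for some help here:

I have a function written in Python 2.7 that takes a long time to return, so I want to cache its result. Every call should return the value from the cache right away, and the fresh result from the function should then update the cached value asynchronously. Is this possible?

In short:

  1. Cache the function's result.

  2. On every call, if the function's cache_key is in the cache, return the cached value; otherwise return a default value. At the same time, compute the real return value of the function and update the cache.

What I have tried:

1. cachetools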

import time
from cachetools import cached, TTLCache
cache = TTLCache(maxsize=1, ttl=360)

@cached(cache)
def expensive_io():
    time.sleep(300)
    return 1.0

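But the first time expensive_io is called I still have to wait 300 seconds, and the cached value is not updated until the TTL expires. After the TTL expires, I have to wait another 300 seconds for the result.

So I was wondering whether I could use threads?

2. Threading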

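import hashlib
from collections import OrderedDict
from threading import Thread
from Queue import Queue  # Python 2 module name; the class is called as Queue() below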

class asynchronous(object):
    def __init__(self, func, maxsize=128, cache=OrderedDict()):
        self.func = func
        self.maxsize = maxsize
        self.cache = cache
        self.currsize = len(cache)

        def getfuncthread(*args, **kwargs):
            key = self.cache_key("{0}-{1}-{2}".format(self.func.__name__, str(*args), str(**kwargs)))
            if self.currsize >= self.maxsize:
                self.cache.popitem(False)
            if not self.cache:
                self.cache[key] = func(*args, **kwargs)
                self.queue.put(self.cache[key])

        def returnthread(*args, **kwargs):
            key = self.cache_key("{0}-{1}-{2}".format(self.func.__name__, str(*args), str(**kwargs)))
            if key in self.cache:
                return self.cache[key]
            else:
                return 2222

        self.returnthread = returnthread
        self.getfuncthread = getfuncthread

    def cache_key(self, s):
        return hashlib.sha224(s).hexdigest()

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def start(self, *args, **kwargs):
        self.queue = Queue()
        thread1 = Thread(target=self.getfuncthread, args=args, kwargs=kwargs)
        thread2 = Thread(target=self.returnthread, args=args, kwargs=kwargs)
        thread1.start()
        thread2.start()
        return asynchronous.Result(self.queue, thread2)

    class NotYetDoneException(Exception):
        def __init__(self, message):
            self.message = message

    class Result(object):
        def __init__(self, queue, thread):
            self.queue = queue
            self.thread = thread

        def is_done(self):
            return not self.thread.is_alive()

        def get_result(self):
            if not self.is_done():
                raise asynchronous.NotYetDoneException('the call has not yet completed its task')

            if not hasattr(self, 'result'):
                self.result = self.queue.get()
            return self.result

@asynchronous
def expensive_io(n):
    time.sleep(300)
    return n*n

if __name__ == '__main__':
    # sample usage
    import time

    result1 = expensive_io.start(2)
    result2 = expensive_io.start(2)
    result3 = expensive_io.start(4)
    try:
        print "result1 {0}".format(result1.get_result())
        print "result2 {0}".format(result2.get_result())
        print "result3 {0}".format(result3.get_result())
    except asynchronous.NotYetDoneException as ex:
        print ex.message

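My idea was that the asynchronous decorator uses two threads:

returnthread returns the value from the cache; if the cache_key is not in the cache, it returns a default value immediately.

getfuncthread obtains the function's value by calling func and puts it into the cache and the queue.

This seems logical, but it still doesn't work.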
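
The behaviour described above (answer at once from the cache, refresh in the background) is often called stale-while-revalidate. A value returned by a Thread target is discarded, so the cached value has to be read in the calling thread and only the slow call moved to a worker. What follows is a minimal sketch under those assumptions, not a definitive implementation: the decorator name `stale_while_revalidate` and its `default` parameter are hypothetical, and only the standard library is used so that it runs on Python 2.7 as well as Python 3.

```python
import functools
import threading
import time


def stale_while_revalidate(default=None):
    """Return the cached value (or `default`) at once; refresh in the background.

    Hypothetical helper for illustration, not part of any library.
    """
    def decorator(func):
        cache = {}          # key -> last computed result
        in_flight = set()   # keys that currently have a refresh running
        lock = threading.Lock()

        def refresh(key, args, kwargs):
            try:
                value = func(*args, **kwargs)   # the slow call runs outside the lock
                with lock:
                    cache[key] = value
            finally:
                with lock:
                    in_flight.discard(key)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            with lock:
                value = cache.get(key, default)
                start = key not in in_flight    # at most one refresh per key
                if start:
                    in_flight.add(key)
            if start:
                worker = threading.Thread(target=refresh, args=(key, args, kwargs))
                worker.daemon = True            # do not keep the process alive
                worker.start()
            return value

        return wrapper
    return decorator


@stale_while_revalidate(default=2222)
def expensive_io(n):
    time.sleep(0.2)   # stand-in for the 300-second call
    return n * n


if __name__ == "__main__":
    print(expensive_io(2))   # cache is still empty, so the default comes back
    time.sleep(0.5)          # give the background refresh time to finish
    print(expensive_io(2))   # now served from the cache
```

The `in_flight` set keeps repeated calls from piling up one 300-second thread each for the same arguments. All arguments must be hashable because they form the dictionary key.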

3. asyncio

Can I use asyncio? Python 2.7 does not support asyncio, but I found the trollius package. I still don't know how to go about it, though.
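
An event loop would not help much here by itself: a blocking call such as time.sleep or synchronous I/O stalls the loop, so it has to be handed to a worker thread anyway. A minimal sketch of that hand-off, assuming the standard-library thread pool in multiprocessing.pool (available in Python 2.7) is acceptable; the names `RefreshingCache`, `get`, `close`, and `workers` are hypothetical:

```python
import threading
import time
from multiprocessing.pool import ThreadPool


class RefreshingCache(object):
    """Hypothetical helper: serve the last known value, refresh on a thread pool."""

    def __init__(self, func, default=None, workers=4):
        self.func = func
        self.default = default
        self.values = {}
        self.lock = threading.Lock()
        self.pool = ThreadPool(workers)   # caps concurrent slow calls at `workers`

    def _store(self, key, value):
        with self.lock:
            self.values[key] = value

    def get(self, *args):
        """Return (current value or default, AsyncResult of the queued refresh)."""
        with self.lock:
            current = self.values.get(args, self.default)
        # The callback runs in the pool once func(*args) has returned.
        pending = self.pool.apply_async(
            self.func, args, callback=lambda value: self._store(args, value))
        return current, pending

    def close(self):
        self.pool.close()
        self.pool.join()   # waits for refreshes that are still queued


if __name__ == "__main__":
    def expensive_io(n):
        time.sleep(0.2)   # stand-in for the slow call
        return n * n

    cache = RefreshingCache(expensive_io, default=2222)
    value, pending = cache.get(2)   # comes back at once with the default
    print(value)
    pending.wait()                  # demo only: block until the refresh has finished
    print(cache.get(2)[0])          # the refreshed value
    cache.close()
```

This sketch queues one refresh per call and accepts positional arguments only; skipping keys that already have a refresh pending is left out to keep it short.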

Any ideas would be greatly appreciated.

1 Answer:

Answer 0: (score: 0)

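Because you are stuck on Python 2.7, you do not have access to many options in the modern standard library, including asyncio (as you said) and some of the functools features you would want (functools.lru_cache, for example, was only added in Python 3.2). Here is a solution that uses a function written by a close friend of mine (https://github.com/iwalton3), together with some boilerplate from the CPython source on GitHub: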

# functools.wraps and functools.partial also ship with Python 2.7; these copies
# follow the CPython source, written in syntax that Python 2.7 accepts.
from functools import WRAPPER_ASSIGNMENTS, WRAPPER_UPDATES, update_wrapper

def wraps(wrapped,
          assigned = WRAPPER_ASSIGNMENTS,
          updated = WRAPPER_UPDATES):
    """Decorator factory to apply update_wrapper() to a wrapper function

       Returns a decorator that invokes update_wrapper() with the decorated
       function as the wrapper argument and the arguments to wraps() as the
       remaining arguments. Default arguments are as for update_wrapper().
       This is a convenience function to simplify applying partial() to
       update_wrapper().
    """
    return partial(update_wrapper, wrapped=wrapped,
                   assigned=assigned, updated=updated)

class partial(object):
    """New function with partial application of the given arguments
    and keywords.
    """

    __slots__ = "func", "args", "keywords", "__dict__", "__weakref__"

    def __new__(cls, func, *args, **keywords):
        if not callable(func):
            raise TypeError("the first argument must be callable")

        # Flatten nested partials into a single call.
        if hasattr(func, "func"):
            args = func.args + args
            keywords = dict(func.keywords or {}, **keywords)
            func = func.func

        self = super(partial, cls).__new__(cls)

        self.func = func
        self.args = args
        self.keywords = keywords
        return self

    def __call__(self, *args, **keywords):
        keywords = dict(self.keywords, **keywords)
        return self.func(*(self.args + args), **keywords)

    def __reduce__(self):
        return type(self), (self.func,), (self.func, self.args,
               self.keywords or None, self.__dict__ or None)

    def __setstate__(self, state):
        if not isinstance(state, tuple):
            raise TypeError("argument to __setstate__ must be a tuple")
        if len(state) != 4:
            raise TypeError("expected 4 items in state, got %d" % len(state))
        func, args, kwds, namespace = state
        if (not callable(func) or not isinstance(args, tuple) or
           (kwds is not None and not isinstance(kwds, dict)) or
           (namespace is not None and not isinstance(namespace, dict))):
            raise TypeError("invalid partial state")

        args = tuple(args) # just in case it's a subclass
        if kwds is None:
            kwds = {}
        elif type(kwds) is not dict: # XXX does it need to be *exactly* dict?
            kwds = dict(kwds)
        if namespace is None:
            namespace = {}

        self.__dict__ = namespace
        self.func = func
        self.args = args
        self.keywords = kwds

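def make_dynamic(function):
    cache = {}
    @wraps(function)
    def result(*args, **kwargs):
        # Python 2.7 has neither keyword-only arguments nor nonlocal, so the
        # control flags are popped from kwargs and the dict is cleared in place.
        clear_cache = kwargs.pop("clear_cache", False)
        ignore_cache = kwargs.pop("ignore_cache", False)
        skip_cache = kwargs.pop("skip_cache", False)
        # Sort the keyword arguments so their order does not change the key.
        call = (args, tuple(sorted(kwargs.items())))
        if clear_cache:
            cache.clear()
        if call in cache and not ignore_cache:
            return cache[call]
        res = function(*args, **kwargs)
        if not skip_cache:
            cache[call] = res
        return res
    return result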

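In short, you can decorate your expensive I/O function by putting @make_dynamic on the line directly above def functionName(args, ...):. Passing ignore_cache=True forces a fresh call and overwrites the cached value. However, make sure that all arguments are hashable. For readers unfamiliar with hashability, it simply means that calling hash(object_name) in the interpreter returns an integer instead of raising TypeError, and that objects which compare equal have the same hash. Such types include strings, numbers, and tuples whose elements are themselves hashable.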
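
To make the hashability requirement concrete, here is a small sketch that checks whether a value can be part of a cache key and builds a key from a call's arguments. The helper names `is_hashable` and `make_key` are hypothetical and exist only for this illustration.

```python
def is_hashable(obj):
    """Return True if obj can be used as (part of) a dictionary key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


def make_key(args, kwargs):
    """Build a cache key from positional and keyword arguments.

    Sorting the keyword items lets f(a=1, b=2) and f(b=2, a=1) share one entry.
    """
    return (args, tuple(sorted(kwargs.items())))


if __name__ == "__main__":
    for value in ["abc", 3.5, (1, 2), [1, 2], {"a": 1}, (1, [2])]:
        # Lists and dicts are mutable and therefore unhashable; a tuple is
        # only hashable if everything inside it is hashable too.
        print("%r hashable: %s" % (value, is_hashable(value)))
```

An argument that fails this check, such as a list, can usually be converted to a tuple before the cached function is called.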