Question

In the comments on: Is there a decorator to simply cache function return values?

@gerrit points out a problem with passing mutable but hashable objects to functions decorated with functools.lru_cache:

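If I pass a hashable, mutable argument, and change the value of the object after the first call of the function, the second call will return the changed, not the original, object. That is almost certainly not what the user wants.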
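A minimal sketch of one way this kind of surprise can show up, assuming a hypothetical class `Box` that keeps Python's default identity-based hash and equality (neither `Box` nor `double` comes from the original comment):

```python
import functools


class Box:
    """Hypothetical mutable container; hash and equality default to identity."""

    def __init__(self, value):
        self.value = value


@functools.lru_cache(maxsize=None)
def double(box):
    return box.value * 2


b = Box(1)
first = double(b)   # computed from value 1
b.value = 10        # mutate the argument in place
second = double(b)  # same identity, so the cache answers with the old result

print(first, second)             # 2 2
print(double.cache_info().hits)  # 1
```

Because the cache key does not change when `b.value` changes, the second call never sees the new value.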

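From my understanding, assuming the __hash__() function of the mutable object is manually defined to hash its member variables (rather than falling back on the object's id(), which is the default for custom objects), changing the argument object will change its hash. A second call to the lru_cache-decorated function should therefore not use the cached value.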
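This expectation can be checked with a small sketch. The class `Point` and the function `square` are hypothetical, and the hash is deliberately derived from the mutable field:

```python
import functools


class Point:
    """Hypothetical mutable class whose hash and equality follow its field."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

    def __hash__(self):
        return hash(self.x)


@functools.lru_cache(maxsize=None)
def square(p):
    return p.x ** 2


p = Point(2)
a = square(p)  # miss: computes 4
p.x = 3        # hash(p) changes along with the field
b = square(p)  # miss again: computes 9 instead of reusing 4

info = square.cache_info()
print(a, b)                                    # 4 9
print(info.hits, info.misses, info.currsize)   # 0 2 2
```

In this sketch the mutation does force a recomputation. The side effect is that the cache now holds two entries that both reference the same object, one of them filed under the old hash, which stays there until it is evicted.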

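If the __hash__() function is defined correctly for the mutable argument(s), can any unintended behavior still arise from passing mutable arguments to lru_cache-decorated functions?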

Answer 1

My comment was wrong/misleading. It does not relate to lru_cache, but to any attempt to create a caching function that works more generically.

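I needed a caching function that works for functions which take and return NumPy arrays, which are mutable and not hashable. Because NumPy arrays are not hashable, I could not use functools.lru_cache. I ended up writing something like this: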

import copy
import functools


def mutable_cache(maxsize=10):
    """In-memory cache like functools.lru_cache but for any object

    This is a re-implementation of functools.lru_cache.  Unlike
    functools.lru_cache, it works for any objects, mutable or not.
    Therefore, it returns a copy and it is wrong if the mutable
    object has changed!  Use with caution!

    If you call the *resulting* function with a keyword argument
    'CLEAR_CACHE', the cache will be cleared.  Otherwise, cache is rotated
    when more than `maxsize` elements exist in the cache.  Additionally,
    if you call the resulting function with NO_CACHE=True, it doesn't
    cache at all.  Be careful with functions returning large objects.
    Everything is kept in RAM!

    Args:
        maxsize (int): Maximum number of return values to be remembered.

    Returns:
        New function that has caching implemented.
    """

    sentinel = object()  # marks a cache miss, so that None can be a cached value

    def decorating_function(user_function):
        cache = {}
        cache_get = cache.get
        keylist = []  # don't make it too long

        def wrapper(*args, **kwds):
            # Pop the control keywords so they never reach user_function,
            # even when they are passed with a false value.
            if kwds.pop("CLEAR_CACHE", False):
                cache.clear()
                keylist.clear()
            if kwds.pop("NO_CACHE", False):
                return user_function(*args, **kwds)
            key = str(args) + str(kwds)
            result = cache_get(key, sentinel)
            if result is not sentinel:
                # make sure we return a copy of the result; when a = f();
                # b = f(), users should reasonably expect that a is not b.
                return copy.copy(result)
            result = user_function(*args, **kwds)
            cache[key] = result
            keylist.append(key)
            if len(keylist) > maxsize:
                try:
                    del cache[keylist[0]]
                    del keylist[0]
                except KeyError:
                    pass
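            # Return a copy here as well, so the caller cannot mutate the
            # cached object through the result of the first call.
            return copy.copy(result)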

        return functools.update_wrapper(wrapper, user_function)

    return decorating_function
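One caveat of the string key `str(args) + str(kwds)` used above: two different arguments that print identically end up sharing one cache entry. As far as I know, NumPy's default printing abbreviates large arrays with an ellipsis, so this is not purely theoretical. A small sketch with a hypothetical class `Grid` whose repr hides its data (`make_str_key` is also just an illustrative helper):

```python
class Grid:
    """Hypothetical class whose repr omits most of its data."""

    def __init__(self, cells):
        self.cells = list(cells)

    def __repr__(self):
        return f"Grid(n={len(self.cells)})"


def make_str_key(args, kwds):
    # Same key construction as in the wrapper above.
    return str(args) + str(kwds)


key_a = make_str_key((Grid([1, 2, 3]),), {})
key_b = make_str_key((Grid([7, 8, 9]),), {})

print(key_a)           # (Grid(n=3),){}
print(key_a == key_b)  # True: different data, same cache key
```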

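In my first version, I had omitted the copy.copy() call (which should really be copy.deepcopy()). That led to bugs whenever I changed the returned value and then called the cached function again. After I added copy.copy(), I realised that I was hogging memory in some cases, primarily because my function counts objects rather than total memory usage, which is non-trivial to do in general in Python (although it should be easy if limited to NumPy arrays). I therefore added the NO_CACHE and CLEAR_CACHE keywords to the resulting function, which do what their names suggest.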
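To illustrate why copy.deepcopy() is the safer choice, here is a small sketch in which nested lists stand in for a cached result (the variable names are made up for this example):

```python
import copy

# A cached result containing nested mutable data.
cached_a = [[1, 2], [3, 4]]
shallow = copy.copy(cached_a)
shallow[0].append(99)  # the inner list is shared with cached_a
print(cached_a[0])     # [1, 2, 99]: the cached value was corrupted

cached_b = [[1, 2], [3, 4]]
deep = copy.deepcopy(cached_b)
deep[0].append(99)     # the inner lists were copied as well
print(cached_b[0])     # [1, 2]: the cached value is intact
```

A shallow copy only protects the outermost container, so anything nested inside a cached result can still be changed through a returned copy.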

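After writing and using this function, I understand that there is more than one good reason for functools.lru_cache to work only with functions that take hashable input arguments. Anyone who needs a caching function that works with mutable arguments needs to be very careful.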

What difficulties might I encounter when passing mutable arguments to an `lru_cache`-decorated function?
