Question

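Especially when using recursive code, there are massive improvements with lru_cache. I understand that a cache is a space that stores data that has to be served fast, and that it saves the computer from recomputing it.

How does the Python lru_cache from functools work internally?

I'm looking for a specific answer: does it use dictionaries like the rest of Python? Does it only store the return value?

I know that Python is heavily built on top of dictionaries; however, I couldn't find a specific answer to this question. Hopefully someone can simplify this answer for all the users of StackOverflow.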

Answer 1

The functools source code is available here: https://github.com/python/cpython/blob/3.6/Lib/functools.py

The lru_cache decorator has a cache dictionary (in context: every decorated function has its own cache dict) where it saves the return value of the called function. The dictionary key is generated by the _make_key function from the arguments. Added some bold comments:

    # one of decorator variants from source:
    def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
        sentinel = object()      # unique object used to signal cache misses

        cache = {}               # RESULTS SAVES HERE
        cache_get = cache.get    # bound method to lookup a key or return None

        # ...

        def wrapper(*args, **kwds):
            # Simple caching without ordering or size limit
            nonlocal hits, misses
            key = make_key(args, kwds, typed)      # BUILD A KEY FROM ARGUMENTS
            result = cache_get(key, sentinel)      # TRYING TO GET PREVIOUS CALLS RESULT
            if result is not sentinel:             # ALREADY CALLED WITH PASSED ARGUMENTS
                hits += 1
                return result                      # RETURN SAVED RESULT
                                                   # WITHOUT ACTUALLY CALLING FUNCTION
            result = user_function(*args, **kwds)  # FUNCTION CALL - if cache[key] empty
            cache[key] = result                    # SAVE RESULT
            misses += 1
            return result

        # ...

        return wrapper
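
To make the dictionary idea concrete, here is a minimal hand-rolled sketch of the same technique. It is not the functools code: the name memoize, the exposed cache attribute, and the simplified key (positional arguments plus sorted keyword items) are assumptions made for illustration only.

```python
import functools

_SENTINEL = object()  # marks "nothing cached", so a cached None still counts as a hit


def memoize(user_function):
    """Hypothetical, simplified stand-in for an unbounded cache decorator."""
    cache = {}  # one dict per decorated function

    @functools.wraps(user_function)
    def wrapper(*args, **kwds):
        # Simplified key for illustration; functools builds its key via _make_key.
        key = (args, tuple(sorted(kwds.items())))
        result = cache.get(key, _SENTINEL)
        if result is not _SENTINEL:
            return result  # served from the dict, the function is not called
        result = user_function(*args, **kwds)
        cache[key] = result
        return result

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper


call_log = []


@memoize
def add(a, b=0):
    call_log.append((a, b))
    return a + b


add(1, b=2)
add(1, b=2)  # answered from the dict, add() itself does not run again
add(5)
print(call_log, len(add.cache))
```

The second call with the same arguments never reaches the function body, which is the whole point of keeping results in a per-function dict.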


Answer 2

Python 3.9 source code for the LRU cache: https://github.com/python/cpython/blob/3.9/Lib/functools.py#L429

Example Fib code

@lru_cache(maxsize=2)
def fib(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)
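
If you want to watch the cache from the outside, cache_info() reports hits, misses, maxsize and currsize. The sketch below wraps the same fib in a helper (make_fib is a hypothetical name, used only to build one copy per maxsize) and compares the tiny cache against an unbounded one.

```python
from functools import lru_cache


def make_fib(maxsize):
    """Build a fresh cached fib so each maxsize gets its own cache."""
    @lru_cache(maxsize=maxsize)
    def fib(n):
        if n == 0:
            return 0
        if n == 1:
            return 1
        return fib(n - 1) + fib(n - 2)
    return fib


small = make_fib(2)         # same setting as the example above
unbounded = make_fib(None)  # never evicts

small_value = small(10)
unbounded_value = unbounded(10)

small_info = small.cache_info()
unbounded_info = unbounded.cache_info()
print(small_value, small_info)
print(unbounded_value, unbounded_info)
```

Both versions return the same value. The unbounded one computes each n exactly once, while the maxsize=2 one has to recompute entries that were evicted.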

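The LRU cache decorator checks a few base cases and then wraps the user function with the wrapper _lru_cache_wrapper. The wrapper holds the logic for adding an item to the cache, the LRU logic (i.e. adding a new item to the circular queue), and removing an item from the circular queue.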

def lru_cache(maxsize=128, typed=False):
...
    if isinstance(maxsize, int):
        # Negative maxsize is treated as 0
        if maxsize < 0:
            maxsize = 0
    elif callable(maxsize) and isinstance(typed, bool):
        # The user_function was passed in directly via the maxsize argument
        user_function, maxsize = maxsize, 128
        wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo)
        wrapper.cache_parameters = lambda : {'maxsize': maxsize, 'typed': typed}
        return update_wrapper(wrapper, user_function)
    elif maxsize is not None:
        raise TypeError(
         'Expected first argument to be an integer, a callable, or None')

    def decorating_function(user_function):
        wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo)
        wrapper.cache_parameters = lambda : {'maxsize': maxsize, 'typed': typed}
        return update_wrapper(wrapper, user_function)

    return decorating_function

lru_cache normalizes maxsize (when negative), adds the CacheInfo details, and finally adds the wrapper and updates the decorator docs and other details.
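
The branches above can be observed through cache_parameters(). This is a small sketch assuming Python 3.9 or later (where cache_parameters exists); the function names plain, negative and unbounded are made up for the example.

```python
from functools import lru_cache


@lru_cache                 # bare form: the function itself arrives as "maxsize"
def plain(x):
    return x


@lru_cache(maxsize=-5)     # negative sizes are treated as 0
def negative(x):
    return x


@lru_cache(maxsize=None)   # unbounded cache
def unbounded(x):
    return x


plain_params = plain.cache_parameters()
negative_params = negative.cache_parameters()
unbounded_params = unbounded.cache_parameters()

# Anything that is not an int, a callable, or None is rejected up front.
try:
    lru_cache("not a size")
    rejected = False
except TypeError:
    rejected = True

print(plain_params, negative_params, unbounded_params, rejected)
```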

lru_cache_wrapper

The LRU cache wrapper has a few bookkeeping variables.

 sentinel = object()          # unique object used to signal cache misses
 make_key = _make_key         # build a key from the function arguments
 PREV, NEXT, KEY, RESULT = 0, 1, 2, 3   # names for the link fields

 cache = {}
 hits = misses = 0
 full = False
 cache_get = cache.get    # bound method to lookup a key or return None
 cache_len = cache.__len__  # get cache size without calling len()
 lock = RLock()           # because linkedlist updates aren't threadsafe
 root = []                # root of the circular doubly linked list
 root[:] = [root, root, None, None]     # initialize by pointing to self

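The wrapper acquires the lock before performing any operation.
A few important variables: the root list contains all the items, adhering to the maxsize value. The important concept to remember about root is that it references itself (root[:] = [root, root, None, None]) in the previous (0) and next (1) positions.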
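
The self-reference is easy to check in a REPL. This short sketch only rebuilds the empty root from the snippet above and inspects it; nothing here goes beyond plain list behaviour.

```python
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3   # same field names as in the source

root = []
root[:] = [root, root, None, None]     # an empty circular list points at itself

empty_repr = repr(root)                # Python prints the cycle as [...]
points_to_itself = root[PREV] is root and root[NEXT] is root
print(empty_repr, points_to_itself)
```

An empty queue is therefore a root whose previous and next are both the root itself, which is why no special case is needed for inserting the first item.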

Three high-level checks

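The first case is when maxsize is 0. That means no cache functionality: the wrapper wraps the user function without any caching ability. The wrapper increments the cache miss count and returns the result.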
```
 def wrapper(*args, **kwds):
     # No caching -- just a statistics update
     nonlocal misses
     misses += 1
     result = user_function(*args, **kwds)
     return result
```
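
A quick way to confirm this branch, as a sketch (echo is just a made-up function): with maxsize=0 every call counts as a miss and nothing is stored.

```python
from functools import lru_cache


@lru_cache(maxsize=0)
def echo(x):
    return x


for _ in range(3):
    echo("same argument")   # identical calls, still never cached

info = echo.cache_info()
print(info)
```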

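The second case is when maxsize is None. Here there is no limit on the number of elements stored in the cache, so the wrapper checks for the key in the cache (a dictionary). When the key is present, the wrapper returns the value and updates the cache hit info. When the key is missing, the wrapper calls the user function with the user-passed arguments, updates the cache, updates the cache miss info, and returns the result.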

 def wrapper(*args, **kwds):
     # Simple caching without ordering or size limit
     nonlocal hits, misses
     key = make_key(args, kwds, typed)
     result = cache_get(key, sentinel)
     if result is not sentinel:
         hits += 1
         return result
     misses += 1
     result = user_function(*args, **kwds)
     cache[key] = result
     return result
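
Here is a small sketch of the unbounded case in use. The function names are made up; the second function is there to show why the sentinel object matters, since a cached None must still count as a hit.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def square(x):
    return x * x


for n in range(1000):
    square(n)          # 1000 distinct keys, nothing is ever evicted
square(0)              # still in the dict, so this is a hit

square_info = square.cache_info()


@lru_cache(maxsize=None)
def returns_none(x):
    return None


returns_none(1)
returns_none(1)        # a hit, even though the cached value is None
none_info = returns_none.cache_info()
print(square_info, none_info)
```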

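The third case is when maxsize is the default value (128) or a user-passed integer value. This is the actual LRU cache implementation. The entire code in the wrapper runs in a thread-safe way: before performing any operation on the cache (read, write, or delete), the wrapper obtains the RLock.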

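LRU Cache

The value in the cache is stored as a list of four items (remember root). The first item is the reference to the previous item, the second item is the reference to the next item, the third item is the key for the particular function call, and the fourth item is the result. Here is an actual value for the Fibonacci function argument 1: [[[...], [...], 1, 1], [[...], [...], 1, 1], None, None]. [...] means a reference to self (the list).
The first check is for a cache hit. If it is a hit, the value in the cache is a list of four values.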
```
 nonlocal root, hits, misses, full
 key = make_key(args, kwds, typed)
 with lock:
     link = cache_get(key)
     if link is not None:
         # Move the link to the front of the circular queue
         print(f'Cache hit for {key}, {root}')
         link_prev, link_next, _key, result = link
         link_prev[NEXT] = link_next
         link_next[PREV] = link_prev
         last = root[PREV]
         last[NEXT] = root[PREV] = link
         link[PREV] = last
         link[NEXT] = root
         hits += 1
         return result
```
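
The pointer juggling on a hit is easier to follow on a standalone list. The sketch below reuses the same four-field links; the helper names (new_root, append_link, move_to_front, keys_oldest_first) are hypothetical and exist only for this illustration.

```python
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3


def new_root():
    root = []
    root[:] = [root, root, None, None]
    return root


def append_link(root, key, result):
    """Insert a link just before root, i.e. at the most recently used end."""
    last = root[PREV]
    link = [last, root, key, result]
    last[NEXT] = root[PREV] = link
    return link


def move_to_front(root, link):
    """Unlink the item, then re-insert it just before root (same steps as the hit path)."""
    link_prev, link_next, _key, _result = link
    link_prev[NEXT] = link_next
    link_next[PREV] = link_prev
    last = root[PREV]
    last[NEXT] = root[PREV] = link
    link[PREV] = last
    link[NEXT] = root


def keys_oldest_first(root):
    """Walk the ring from the oldest link to the newest."""
    keys = []
    link = root[NEXT]
    while link is not root:
        keys.append(link[KEY])
        link = link[NEXT]
    return keys


root = new_root()
links = {k: append_link(root, k, k.upper()) for k in ("a", "b", "c")}
before = keys_oldest_first(root)
move_to_front(root, links["a"])   # simulate a cache hit on "a"
after = keys_oldest_first(root)
print(before, after)
```

After the hit, "a" is the newest entry and "b" has become the next candidate for eviction.
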
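When the item is already in the cache, there is no need to check whether the circular queue is full or to pop an item from the cache. Instead, the code changes the positions of the items in the circular queue. Since the most recently used item is always at the top, the code moves the recent value to the top of the queue, and the previous top item gets the current item as its next (last[NEXT] = root[PREV] = link, link[PREV] = last, and link[NEXT] = root). NEXT and PREV are initialized at the top and point to the appropriate positions in the list (PREV, NEXT, KEY, RESULT = 0, 1, 2, 3). Finally, the code increments the cache hit info and returns the result.
When it is a cache miss, the code updates the miss info and then checks three cases. All three operations happen after obtaining the RLock. The three cases appear in the source code in the following order: the key is found in the cache after acquiring the lock, the cache is full, and the cache can take new items. For demonstration, let's go through them in this order: when the cache is not full, when the cache is full, and when the key is available in the cache after acquiring the lock.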

When the cache is not full

 ...
 else:
     # Put result in a new link at the front of the queue.
     last = root[PREV]
     link = [last, root, key, result]
     last[NEXT] = root[PREV] = cache[key] = link
     # Use the cache_len bound method instead of the len() function
     # which could potentially be wrapped in an lru_cache itself.
     full = (cache_len() >= maxsize)

When the cache is not full, prepare the recent result (link = [last, root, key, result]) to contain the root's previous reference, root, the key, and the computed result.
Then point the recent result (link) to the top of the circular queue (root[PREV] = link), point the next of root's previous item to the recent result (last[NEXT] = link), and add the recent result to the cache (cache[key] = link).
Finally, check whether the cache is full (cache_len() >= maxsize; cache_len = cache.__len__ is declared at the top) and set the status to full.
For the fib example, when the function receives the first value 1, root is empty and the root value is [[...], [...], None, None]. After adding the result to the circular queue, the root value is [[[...], [...], 1, 1], [[...], [...], 1, 1], None, None]. Both previous and next point to the result for key 1. For the next value 0, the root value after insertion is

[[[[...], [...], 1, 1], [...], 0, 0], [[...], [[...], [...], 0, 0], 1, 1], None, None]. Previous is [[[[...], [...], None, None], [...], 1, 1], [[...], [[...], [...], 1, 1], None, None], 0, 0] and next is [[[[...], [...], 0, 0], [...], None, None], [[...], [[...], [...], None, None], 0, 0], 1, 1]

When the cache is full

 ...
 elif full:
     # Use the old root to store the new key and result.
     oldroot = root
     oldroot[KEY] = key
     oldroot[RESULT] = result
     # Empty the oldest link and make it the new root.
     # Keep a reference to the old key and old result to
     # prevent their ref counts from going to zero during the
     # update. That will prevent potentially arbitrary object
     # clean-up code (i.e. __del__) from running while we're
     # still adjusting the links.
     root = oldroot[NEXT]
     oldkey = root[KEY]
     oldresult = root[RESULT]
     root[KEY] = root[RESULT] = None
     # Now update the cache dictionary.
     del cache[oldkey]
     # Save the potentially reentrant cache[key] assignment
     # for last, after the root and links have been put in
     # a consistent state.
     cache[key] = oldroot

When the cache is full, use the root as oldroot (oldroot = root) and update its key and result.
Then make oldroot's next item the new root (root = oldroot[NEXT]) and copy the new root's key and result (oldkey = root[KEY] and oldresult = root[RESULT]).
Set the new root's key and result to None (root[KEY] = root[RESULT] = None).
Delete the old key's item from the cache (del cache[oldkey]) and add the calculated result to the cache (cache[key] = oldroot).
For the Fibonacci example, when the cache is full and the key is 2, the root value is [[[[...], [...], 1, 1], [...], 0, 0], [[...], [[...], [...], 0, 0], 1, 1], None, None] and the new root at the end of the block is [[[[...], [...], 0, 0], [...], 2, 1], [[...], [[...], [...], 2, 1], 0, 0], None, None]. As you can see, key 1 is removed and replaced by key 2.
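
The eviction order can also be checked from the outside, without touching the internals. This sketch uses a made-up function double and a list named computed to record which calls actually ran.

```python
from functools import lru_cache

computed = []   # records every real (non-cached) call


@lru_cache(maxsize=2)
def double(x):
    computed.append(x)
    return 2 * x


double(1)
double(2)   # cache holds 1 and 2, with 2 the most recent
double(1)   # hit: 1 becomes the most recent
double(3)   # miss on a full cache: 2 is the least recently used and is evicted
double(1)   # hit: 1 survived because it was used recently
double(2)   # miss: 2 has to be recomputed

info = double.cache_info()
print(computed, info)
```

Note that it is the least recently used key that goes, not the oldest inserted one: key 1 was inserted first but survives because of the hit in between.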

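When the key appears in the cache after acquiring the lock

When the key appears in the cache after the lock is acquired, another thread may already have enqueued the value. So there is not much to do, and the wrapper returns the result.

Finally, the code returns the result. Before executing the cache-miss part, the code updates the cache miss info and calls the user function (the key itself was already built with make_key at the top of the wrapper).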
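
Putting the pieces together, here is a simplified single-threaded sketch of the bounded case with its three miss branches. It is an illustration under assumptions, not the functools code: tiny_lru and cached_keys are hypothetical names, there is no lock and no hit/miss counting, only positional hashable arguments are supported, and maxsize is assumed to be at least 1.

```python
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3


def tiny_lru(maxsize):
    """Hypothetical sketch: bounded LRU decorator, no locking, maxsize >= 1."""
    def decorator(func):
        cache = {}
        root = []
        root[:] = [root, root, None, None]
        full = False

        def wrapper(*args):
            nonlocal root, full
            link = cache.get(args)
            if link is not None:
                # Hit: unlink the item and re-insert it at the most recent end.
                link_prev, link_next, _key, result = link
                link_prev[NEXT] = link_next
                link_next[PREV] = link_prev
                last = root[PREV]
                last[NEXT] = root[PREV] = link
                link[PREV] = last
                link[NEXT] = root
                return result
            result = func(*args)
            if args in cache:
                # The key was added while func ran (e.g. a reentrant call).
                pass
            elif full:
                # Reuse the old root for the new entry, evict the oldest link.
                oldroot = root
                oldroot[KEY] = args
                oldroot[RESULT] = result
                root = oldroot[NEXT]
                oldkey = root[KEY]
                root[KEY] = root[RESULT] = None
                del cache[oldkey]
                cache[args] = oldroot
            else:
                # Room left: add a new link at the most recent end.
                last = root[PREV]
                link = [last, root, args, result]
                last[NEXT] = root[PREV] = cache[args] = link
                full = len(cache) >= maxsize
            return result

        wrapper.cached_keys = lambda: set(cache)  # for inspection only
        return wrapper
    return decorator


computed = []


@tiny_lru(maxsize=2)
def triple(x):
    computed.append(x)
    return 3 * x


results = [triple(1), triple(2), triple(3), triple(2), triple(4)]
print(results, computed, triple.cached_keys())
```

In this run, key 1 is evicted when 3 arrives, the hit on 2 makes 3 the oldest entry, and 3 is evicted when 4 arrives.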

Note: I couldn't get the nested-list indentation to work, so the formatting of this answer may look a bit off.

Answer 3

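You can check out the source code here.

Essentially it uses two data structures: a dictionary that maps function parameters to their results, and a linked list that keeps track of your function call history.

The cache is essentially implemented using the following, which is pretty self-explanatory.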

cache = {}
cache_get = cache.get
....
make_key = _make_key         # build a key from the function arguments
key = make_key(args, kwds, typed)
result = cache_get(key, sentinel)

The gist of updating the linked list is:

elif full:

    oldroot = root
    oldroot[KEY] = key
    oldroot[RESULT] = result

    # update the linked list to pop out the least recent function call information        
    root = oldroot[NEXT]
    oldkey = root[KEY]
    oldresult = root[RESULT]
    root[KEY] = root[RESULT] = None
    ......
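
For comparison, the same two ideas (a mapping plus a recency order) can be sketched with collections.OrderedDict. This is a different technique from the hand-built linked list in functools, and the class name OrderedDictLRU is made up for the example.

```python
from collections import OrderedDict


class OrderedDictLRU:
    """Hypothetical alternative: the OrderedDict keeps the recency order."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)   # drop the least recently used

    def keys(self):
        return list(self._data)              # oldest first


lru = OrderedDictLRU(maxsize=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")       # "a" becomes the most recent entry
lru.put("c", 3)    # "b" is now the least recently used and is evicted
print(lru.keys())
```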
