Question

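I need to efficiently implement a fixed-size FIFO in Python or NumPy. I may have several such FIFOs, some for integers, some for strings, and so on. I also need to access each element of a FIFO by its index.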

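Efficiency matters because these FIFOs will sit at the core of a program that is expected to run continuously for several days, with a large volume of data passing through them. The implementation therefore needs to be efficient in memory as well as in time.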

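In other languages such as C or Java, I would implement this efficiently with a circular buffer and string pointers (for the string FIFOs). Is that an efficient approach in Python/NumPy, or is there a better solution?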

Specifically, which of the following solutions is the most efficient:

(1) A deque with maxlen set (how does garbage collection affect the efficiency of a deque?):

import collections
l = collections.deque(maxlen=3)
l.append('apple'); l.append('banana'); l.append('carrot'); l.append('kiwi')
print(l, len(l), l[0], l[2])
> deque(['banana', 'carrot', 'kiwi'], maxlen=3) 3 banana kiwi

(2) A list subclass (taken from "Python, forcing a list to a fixed size"):

class L(list):
    def append(self, item):
        list.append(self, item)
        if len(self) > 3: self[:1]=[]
l2 = L()  # create the instance before appending to it
l2.append('apple'); l2.append('banana'); l2.append('carrot'); l2.append('kiwi')
print(l2, len(l2), l2[2], l2[0])
> ['banana', 'carrot', 'kiwi'] 3 kiwi banana

(3) A plain NumPy array. But this limits the size of the strings, so how do I specify the maximum string size?

import numpy as np
a = np.array(['apples', 'foobar', 'cowboy'])
a[2] = 'bananadgege'
print(a)
> ['apples' 'foobar' 'banana']
# now add logic for manipulating circular buffer indices

(4) The object version of the above, although "python numpy array of arbitrary length strings" says that using the object dtype gives up the benefits of NumPy:

a = np.array(['apples', 'foobar', 'cowboy'], dtype=object)
a[2] = 'bananadgege'
print(a)
> ['apples' 'foobar' 'bananadgege']
# now add logic for manipulating circular buffer indices

(5) Or is there a more efficient solution than the ones presented above?

By the way, in case it helps, there is a known upper bound on the length of my strings.

Answer 1

I would use NumPy. To specify the maximum string length, use a dtype, like so:

np.zeros(128, (str, 32)) # 128 strings of up to 32 characters
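To connect this to the circular-buffer idea from the question, here is a minimal sketch of how the index logic might look on top of such a preallocated array. The class name RingFIFO and its attributes are made up for illustration and are not part of NumPy. It assumes that overwriting the oldest element when the buffer is full is the behavior you want, and that logical index 0 should be the oldest element, as with deque(maxlen=...).

```python
import numpy as np


class RingFIFO:
    """Fixed-size FIFO on a preallocated NumPy array (hypothetical helper)."""

    def __init__(self, capacity, dtype):
        self._buf = np.zeros(capacity, dtype=dtype)  # allocated once, never resized
        self._capacity = capacity
        self._start = 0  # physical position of the oldest element
        self._size = 0   # number of elements currently stored

    def append(self, item):
        # Physical slot for the new element; when full, this is the oldest slot.
        end = (self._start + self._size) % self._capacity
        self._buf[end] = item
        if self._size < self._capacity:
            self._size += 1
        else:
            # The oldest element was just overwritten, so advance the start.
            self._start = (self._start + 1) % self._capacity

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        # Logical index 0 is the oldest element; negative indices count from the newest.
        if index < 0:
            index += self._size
        if not 0 <= index < self._size:
            raise IndexError("FIFO index out of range")
        return self._buf[(self._start + index) % self._capacity]


# String FIFO: up to 3 strings of at most 32 characters each.
fruit = RingFIFO(3, (str, 32))
for name in ("apple", "banana", "carrot", "kiwi"):
    fruit.append(name)
print(len(fruit), fruit[0], fruit[2])  # 3 banana kiwi

# Integer FIFO using the same class with a different dtype.
numbers = RingFIFO(3, np.int64)
for n in range(5):
    numbers.append(n)
print(len(numbers), numbers[0], numbers[-1])  # 3 2 4
```

Note that with a fixed-width string dtype, anything longer than the declared width is silently truncated on assignment, as the 'banana' output in option (3) shows, so the width should be at least your known maximum. Whether this is actually faster than collections.deque for your workload is something I would measure rather than assume.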
