A generic priority queue for Python

Time: 2009-01-02 19:05:55

Tags: python queue

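I need to use a priority queue in my Python code. Looking around for something efficient, I came upon heapq. It looks good, but it seems to be specified only for integers. I suppose it works with any objects that have comparison operators, but it doesn't say which comparison operators it needs.

Besides, heapq seems to be implemented in Python, so it's not fast.

Do you know of any fast implementations of priority queues in Python? Ideally, I'd like the queue to be generic (i.e. to work for any object with a specified comparison operator).

Thanks in advance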

Update

Regarding comparison in heapq, I can either use a (priority, object) tuple as Charlie Martin suggests, or just implement __cmp__ for my object.
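A minimal sketch of the second option, assuming Python 3, where __cmp__ is no longer consulted and heapq (at least in CPython) orders elements with `<`, so defining __lt__ should be enough. The Job class is a hypothetical example type, not part of any library.

```python
import heapq


class Job:
    """Hypothetical example type; only __lt__ is defined."""

    def __init__(self, priority, name):
        self.priority = priority
        self.name = name

    def __lt__(self, other):
        # heapq orders elements with "<", so this is the comparison it relies on
        return self.priority < other.priority


heap = []
for job in (Job(5, "write"), Job(1, "plan"), Job(3, "test")):
    heapq.heappush(heap, job)

# Pop everything; jobs come out lowest priority value first
order = [heapq.heappop(heap).name for _ in range(3)]
print(order)  # ['plan', 'test', 'write']
```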

I'm still looking for something faster than heapq.

12 answers:

Answer 0 (score: 38)

You can use Queue.PriorityQueue.

Recall that Python is dynamically typed, so you can store anything you like: just make a tuple of (priority, thing) and you're set.
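A short sketch of the tuple approach, assuming Python 3, where the module is named queue rather than Queue. The task strings are made-up sample data.

```python
from queue import PriorityQueue  # the module is called Queue on Python 2

pq = PriorityQueue()
pq.put((2, "write code"))
pq.put((1, "eat"))
pq.put((3, "sleep"))

# Entries come back lowest priority value first
results = []
while not pq.empty():
    priority, thing = pq.get()
    results.append(thing)
print(results)  # ['eat', 'write code', 'sleep']
```

Note that when two priorities are equal, tuple comparison falls through to the second element, so on Python 3 the stored things must then be comparable with each other.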

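Answer 1 (score: 17)

I ended up implementing a wrapper for heapq, adding a dict to keep the queue's elements unique. The result should be quite efficient for all operations: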

import heapq

class PriorityQueueSet(object):

    """
    Combined priority queue and set data structure.

    Acts like a priority queue, except that its items are guaranteed to be
    unique. Provides O(1) membership test, O(log N) insertion and O(log N)
    removal of the smallest item.

    Important: the items of this data structure must be both comparable and
    hashable (i.e. must implement __lt__, or __cmp__ on Python 2, and
    __hash__). This is true of Python's built-in immutable objects, but you
    should implement those methods if you want to use the data structure for
    custom objects.
    """

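    def __init__(self, items=None):
        """
        Create a new PriorityQueueSet.

        Arguments:
            items (iterable): An initial item list - it can be unsorted and
                non-unique. The data structure will be created in O(N).
        """
        # None instead of a mutable [] default, which would be shared across calls
        self.set = dict((item, True) for item in (items or []))
        # list(...) is needed on Python 3, where dict.keys() returns a view
        self.heap = list(self.set.keys())
        heapq.heapify(self.heap)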

    def has_item(self, item):
        """Check if ``item`` exists in the queue."""
        return item in self.set

    def pop_smallest(self):
        """Remove and return the smallest item from the queue."""
        smallest = heapq.heappop(self.heap)
        del self.set[smallest]
        return smallest

    def add(self, item):
        """Add ``item`` to the queue if doesn't already exist."""
        if item not in self.set:
            self.set[item] = True
            heapq.heappush(self.heap, item)

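Answer 2 (score: 9)

When using a priority queue, decrease-key is a must-have operation for many algorithms (Dijkstra's algorithm, A*, OPTICS), so I wonder why Python's built-in priority queue does not support it. None of the other answers offers a solution that supports this operation.

A priority queue that also supports decrease-key is the heapdict implementation by Daniel Stutzbach, which worked perfectly for me with Python 3.5.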

from heapdict import heapdict

hd = heapdict()
hd["two"] = 2
hd["one"] = 1
obj = hd.popitem()
print("object:",obj[0])
print("priority:",obj[1])

# object: one
# priority: 1
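For comparison, a common workaround when only heapq is available is to push a new entry instead of decreasing a key, and to skip stale entries when they are popped. The sketch below applies that idea to Dijkstra's algorithm; the graph layout (a dict of adjacency lists) and the sample graph are assumptions made for illustration.

```python
import heapq


def dijkstra(graph, source):
    """graph: dict mapping node -> list of (neighbor, weight) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path to node was already found
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                # push a fresh entry in place of a decrease-key operation
                heapq.heappush(heap, (nd, neighbor))
    return dist


graph = {"a": [("b", 4), ("c", 1)], "c": [("b", 2)], "b": []}
distances = dijkstra(graph, "a")
print(distances)  # {'a': 0, 'b': 3, 'c': 1}
```

The trade-off is that the heap can hold several entries per node, so it may grow larger than with a true decrease-key structure such as heapdict.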

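Answer 3 (score: 7)

I haven't used it, but you could try PyHeap. It's written in C, so hopefully it is fast enough for you.

Are you positive that heapq/PriorityQueue won't be fast enough? It might be worth starting with one of them and then profiling to see whether it really is your performance bottleneck.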
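A rough way to check this before looking for alternatives is to time a batch of pushes and pops. The helper name time_heap_ops and the workload of random floats are assumptions for illustration; a real measurement should use your own data and access pattern.

```python
import heapq
import random
import time


def time_heap_ops(n):
    """Push n random floats onto a heap, pop them all, return (seconds, popped)."""
    values = [random.random() for _ in range(n)]
    heap = []
    start = time.perf_counter()
    for v in values:
        heapq.heappush(heap, v)
    popped = [heapq.heappop(heap) for _ in range(n)]
    elapsed = time.perf_counter() - start
    return elapsed, popped


elapsed, popped = time_heap_ops(100000)
print("100000 pushes and pops took %.3f s" % elapsed)
```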

Answer 4 (score: 7)

You can use heapq for non-integer elements (tuples):

from heapq import heappush, heappop

heap = []
data = [(10,"ten"), (3,"three"), (5,"five"), (7,"seven"), (9, "nine"), (2,"two")]
for item in data:
    heappush(heap, item)
# named "ordered" to avoid shadowing the built-in sorted()
ordered = []
while heap:
    ordered.append(heappop(heap))
print(ordered)
data.sort()
print(data == ordered)

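Answer 5 (score: 6)

Did you look at the "Show Source" link on the heapq page? A little less than halfway down, there is an example that uses a heap with a list of (int, char) tuples as a priority queue.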

Answer 6 (score: 2)

This is efficient and works for strings or any other type of input :)

import itertools
from heapq import heappush, heappop

pq = []                         # list of entries arranged in a heap
entry_finder = {}               # mapping of tasks to entries
REMOVED = '<removed-task>'      # placeholder for a removed task
counter = itertools.count()     # unique sequence count

def add_task(task, priority=0):
    'Add a new task or update the priority of an existing task'
    if task in entry_finder:
        remove_task(task)
    count = next(counter)
    entry = [priority, count, task]
    entry_finder[task] = entry
    heappush(pq, entry)

def remove_task(task):
    'Mark an existing task as REMOVED.  Raise KeyError if not found.'
    entry = entry_finder.pop(task)
    entry[-1] = REMOVED

def pop_task():
    'Remove and return the lowest priority task. Raise KeyError if empty.'
    while pq:
        priority, count, task = heappop(pq)
        if task is not REMOVED:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue')

Reference: http://docs.python.org/library/heapq.html

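Answer 7 (score: 1)

I have a priority queue / Fibonacci heap at https://pypi.python.org/pypi/fibonacci-heap-mod

It's not fast (there is a large constant c on delete-min, which is O(c * log n)). But find-min, insert, decrease-key and merge are all O(1). In other words, it's lazy.

If it's too slow on CPython, you might try Pypy, Nuitka or even CPython + Numba :)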

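Answer 8 (score: 0)

"I can either use a (priority, object) as Charlie Martin suggests, or just implement __cmp__ for my object."

If you want inserted objects to be prioritized by a specific rule, I found it very helpful to write a simple subclass of PriorityQueue that accepts a key function. You won't have to insert (priority, object) tuples manually, and the handling feels more natural.

Demo of the desired behavior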

>>> h = KeyHeap(sum)
>>> h.put([-1,1])
>>> h.put((-1,-2,-3))
>>> h.put({100})
>>> h.put([1,2,3])
>>> h.get()
(-1, -2, -3)
>>> h.get()
[-1, 1]
>>> h.get()
[1, 2, 3]
>>> h.get()
set([100])
>>> h.empty()
True
>>>
>>> k = KeyHeap(len)
>>> k.put('hello')
>>> k.put('stackoverflow')
>>> k.put('!')
>>> k.get()
'!'
>>> k.get()
'hello'
>>> k.get()
'stackoverflow'

Python 2 code

from Queue import PriorityQueue

class KeyHeap(PriorityQueue):
    def __init__(self, key, maxsize=0):            
        PriorityQueue.__init__(self, maxsize)
        self.key = key

    def put(self, x):
        PriorityQueue.put(self, (self.key(x), x))

    def get(self):
        return PriorityQueue.get(self)[1]

Python 3 code

from queue import PriorityQueue

class KeyHeap(PriorityQueue):
    def __init__(self, key, maxsize=0):            
        super().__init__(maxsize)
        self.key = key

    def put(self, x):
        super().put((self.key(x), x))

    def get(self):
        return super().get()[1]

Obviously, calling put will (and should!) raise an error if you try to insert an object that your key function cannot process.
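One caveat on Python 3: when two items have equal keys, tuple comparison falls through to the items themselves, which raises TypeError for types that don't support `<`, such as dicts. A possible variant, sketched here under the hypothetical name StableKeyHeap, adds an insertion counter as a tie-breaker so the items are never compared.

```python
import itertools
from queue import PriorityQueue


class StableKeyHeap(PriorityQueue):
    """Hypothetical variant of KeyHeap that never compares the items themselves."""

    def __init__(self, key, maxsize=0):
        super().__init__(maxsize)
        self.key = key
        self._counter = itertools.count()  # increasing tie-breaker

    def put(self, x):
        # (key, insertion order, item): ties on key are settled by the counter
        super().put((self.key(x), next(self._counter), x))

    def get(self):
        return super().get()[-1]


h = StableKeyHeap(len)
h.put({"a": 1})
h.put({"b": 2})   # same key (length 1) as the previous dict
h.put({})
results = [h.get(), h.get(), h.get()]
print(results)  # [{}, {'a': 1}, {'b': 2}]
```

A side effect of the counter is that items with equal keys come out in insertion order.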

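Answer 9 (score: 0)

If you want to keep an entire list ordered, not just the top value, I've used some variation of this code in multiple projects. It's a drop-in replacement for the standard list class with a similar API: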

import bisect

class OrderedList(list):
    """Keep a list sorted as you append or extend it

    An ordered list, this sorts items from smallest to largest using key, so
    if you want MaxQueue like functionality use negative values: .pop(-1) and
    if you want MinQueue like functionality use positive values: .pop(0)
    """
    def __init__(self, iterable=None, key=None):
        if key:
            self.key = key
        self._keys = []
        super(OrderedList, self).__init__()
        if iterable:
            for x in iterable:
                self.append(x)

    def key(self, x):
        return x

    def append(self, x):
        k = self.key(x)
        # https://docs.python.org/3/library/bisect.html#bisect.bisect_right
        # bisect_right always returns a valid insertion index (never None),
        # and inserting at len(self) behaves like an append
        i = bisect.bisect_right(self._keys, k)
        super(OrderedList, self).insert(i, (k, x))
        self._keys.insert(i, k)

    def extend(self, iterable):
        for x in iterable:
            self.append(x)

    def remove(self, x):
        k = self.key(x)
        self._keys.remove(k)
        super(OrderedList, self).remove((k, x))

    def pop(self, i=-1):
        self._keys.pop(i)
        return super(OrderedList, self).pop(i)[-1]

    def clear(self):
        super(OrderedList, self).clear()
        self._keys.clear()

    def __iter__(self):
        for x in super(OrderedList, self).__iter__():
            yield x[-1]

    def __getitem__(self, i):
        return super(OrderedList, self).__getitem__(i)[-1]

    def insert(self, i, x):
        raise NotImplementedError()
    def __setitem__(self, i, x):
        raise NotImplementedError()
    def reverse(self):
        raise NotImplementedError()
    def sort(self):
        raise NotImplementedError()

By default it handles tuples like (priority, value), but you can also customize it like this:

class Val(object):
    def __init__(self, priority, val):
        self.priority = priority
        self.val = val

h = OrderedList(key=lambda x: x.priority)

h.append(Val(100, "foo"))
h.append(Val(10, "bar"))
h.append(Val(200, "che"))

print(h[0].val) # "bar"
print(h[-1].val) # "che"

Answer 10 (score: 0)

A simple implementation:

Keep in mind that PriorityQueue returns lower values first.

from queue import PriorityQueue


class PriorityQueueWithKey(PriorityQueue):
    def __init__(self, key=None, maxsize=0):
        super().__init__(maxsize)
        self.key = key

    def put(self, item):
        if self.key is None:
            super().put((item, item))
        else:
            super().put((self.key(item), item))

    def get(self):
        # get() takes (block, timeout); passing self.queue as block was a bug
        return super().get()[1]


a = PriorityQueueWithKey(abs)
a.put(-4)
a.put(-3)
print(*a.queue)

Answer 11 (score: 0)

I am implementing a priority queue in Python 3 using queue.PriorityQueue like this -
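One common way to use queue.PriorityQueue in Python 3, sketched here as an assumption rather than as this answer's own code, is to wrap each entry in a small dataclass that is ordered by priority only. It requires Python 3.7 or later for dataclasses, and the task dicts are made-up sample data.

```python
from dataclasses import dataclass, field
from queue import PriorityQueue
from typing import Any


@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)  # payload is excluded from comparisons


pq = PriorityQueue()
pq.put(PrioritizedItem(2, {"task": "write"}))
pq.put(PrioritizedItem(1, {"task": "plan"}))
pq.put(PrioritizedItem(2, {"task": "test"}))

first = pq.get()
print(first.priority, first.item)  # 1 {'task': 'plan'}
```

Because the payload is excluded from comparisons, entries with equal priorities do not raise TypeError even when the payloads are not comparable.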