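I'm trying to work out whether my problem can be solved with the built-in sorted() function or whether I need to do it myself. Old school, using cmp, it would have been relatively easy.
My data set looks like: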
x = [ ('business', Set('fleet','address')) ('device', Set('business','model','status','pack')) ('txn', Set('device','business','operator')) ....
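The sorting rule should basically be: for all values of N & Y where Y > N, x[N][0] not in x[Y][1].
Although I'm using Python 2.6, where the cmp argument is still available, I'm trying to make this Python 3 safe.
So, can this be done using some lambda magic and the key argument?
-== UPDATE ==-
Thanks Eli & Winston! I didn't really think using keys would work, and if it did, I suspected it would be a shoehorned solution, which isn't ideal.
Because my problem was database table dependencies, I had to make a minor addition to Eli's code to remove a table from its own dependency list (in a well-designed database that wouldn't happen, but who lives in that magical perfect world?).
My solution: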
def topological_sort(source):
    """perform topo sort on elements.

    :arg source: list of ``(name, set(names of dependencies))`` pairs
    :returns: list of names, with dependencies listed first
    """
    pending = [(name, set(deps)) for name, deps in source]
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(set((name,)), emitted)  # <-- pop self from dep, req Py2.6
            if deps:
                next_pending.append(entry)
            else:
                yield name
                emitted.append(name)  # <-- not required, but preserves original order
                next_emitted.append(name)
        if not next_emitted:
            raise ValueError("cyclic dependency detected: %s %r" % (name, (next_pending,)))
        pending = next_pending
        emitted = next_emitted
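A small usage sketch of the function above, assuming Python 3 and a made-up table list in which `device` lists itself as a dependency (the sample data is illustrative, not the real schema). The function is repeated so the block runs on its own.

```python
def topological_sort(source):
    """Yield names so that every name comes after its dependencies."""
    pending = [(name, set(deps)) for name, deps in source]
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            # drop self-references and anything already emitted
            deps.difference_update(set((name,)), emitted)
            if deps:
                next_pending.append(entry)
            else:
                yield name
                emitted.append(name)
                next_emitted.append(name)
        if not next_emitted:
            raise ValueError("cyclic dependency detected: %s %r" % (name, (next_pending,)))
        pending = next_pending
        emitted = next_emitted


# Hypothetical sample data: 'device' depends on itself.
tables = [
    ("txn", {"device", "business"}),
    ("device", {"business", "device"}),
    ("business", set()),
]

order = list(topological_sort(tables))
print(order)
# prints ['business', 'device', 'txn']
```

Without the self-reference removal, `device` could never be emitted and the function would report a cycle.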
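Answer 0 (score: 16)
What you want is called a topological sort. While it's possible to implement one using the built-in sort(), it's rather awkward, and it's better to implement a topological sort directly in Python.
Why is it going to be awkward? If you study the two algorithms on the wiki page, they both rely on a running set of "marked nodes", a concept that's hard to contort into a form sort() can use, since key=xxx (or even cmp=xxx) works best with stateless comparison functions, particularly because timsort doesn't guarantee the order in which elements will be examined. I'm (pretty) sure that any solution which does use sort() is going to end up redundantly calculating some information for each call to the key/cmp function, in order to get around the statelessness issue.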
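To illustrate the point about statelessness, here is a minimal sketch (not from the answer itself) of one way sorted() can be coaxed into producing a valid order: precompute each item's dependency depth and use that as the key. The helper `dependency_depth` and the sample data are assumptions for illustration. All of the graph work happens before sorted() is ever called, and the sketch does not detect cycles.

```python
def dependency_depth(graph):
    """Map each name to 0 if it has no dependencies, else 1 + its deepest dependency.

    Assumes `graph` (name -> set of dependency names) has no cycles;
    a cycle would cause unbounded recursion.
    """
    depth = {}

    def visit(name):
        if name not in depth:
            deps = graph.get(name, ())
            depth[name] = 1 + max(map(visit, deps)) if deps else 0
        return depth[name]

    for name in graph:
        visit(name)
    return depth


# Hypothetical sample data, loosely based on the question.
graph = {
    "business": {"fleet", "address"},
    "device": {"business", "model"},
    "txn": {"device", "business"},
}
depth = dependency_depth(graph)
# Every dependency has a strictly smaller depth than its dependents,
# so sorting by depth lists dependencies first.
ordered = sorted(depth, key=depth.get)
print(ordered)
```

The key function itself is a plain dictionary lookup; the state that a topological sort needs lives in the precomputed `depth` table.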
The following is the alg I've been using (to sort some JavaScript library dependencies):
Edit: reworked this greatly, based on Winston Ewert's solution.
def topological_sort(source):
    """perform topo sort on elements.

    :arg source: list of ``(name, [list of dependencies])`` pairs
    :returns: list of names, with dependencies listed first
    """
    pending = [(name, set(deps)) for name, deps in source]  # copy deps so we can modify set in-place
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(emitted)  # remove deps we emitted last pass
            if deps:  # still has deps? recheck during next pass
                next_pending.append(entry)
            else:  # no more deps? time to emit
                yield name
                emitted.append(name)  # <-- not required, but helps preserve original ordering
                next_emitted.append(name)  # remember what we emitted for difference_update() in next pass
        if not next_emitted:  # all entries have unmet deps, one of two things is wrong...
            raise ValueError("cyclic or missing dependency detected: %r" % (next_pending,))
        pending = next_pending
        emitted = next_emitted
Sidenote: it is possible to shoehorn a cmp() function into key=xxx, as outlined in this Python bug tracker message.
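For reference, the standard library offers `functools.cmp_to_key` (Python 2.7 and 3.2+) for this kind of adaptation. The sketch below uses a made-up comparison, `by_length_then_alpha`, purely to show the mechanics. A comparison based only on direct dependencies would generally not define a consistent ordering, so this does not by itself solve the dependency-sorting problem.

```python
from functools import cmp_to_key


def by_length_then_alpha(a, b):
    """Old-style comparison: negative if a sorts first, positive if b does, 0 if equal."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)


names = ["device", "txn", "business", "fleet"]
result = sorted(names, key=cmp_to_key(by_length_then_alpha))
print(result)
# prints ['txn', 'fleet', 'device', 'business']
```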
Answer 1 (score: 6)
I did a topological sort something like this:
class TopologicalSortFailure(Exception):
    """Assumed definition: any Exception subclass will do."""

def topological_sort(items):
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if dependencies.issubset(provided):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append((item, dependencies))
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items
I think it's more straightforward than Eli's version; I don't know about efficiency.
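On the efficiency question: the version above rescans every remaining item on each pass, which can take quadratic time on long dependency chains. A minimal sketch of Kahn's algorithm, a different and well-known technique (not Winston's code), visits each item and each dependency edge once. The names `kahn_sort` and `dependents` are illustrative, and the sketch assumes every dependency also appears as an item.

```python
from collections import deque


def kahn_sort(items):
    """Return item names with dependencies first; `items` is a list of (name, deps) pairs."""
    unmet = {name: set(deps) for name, deps in items}  # dependencies not yet emitted
    dependents = {name: [] for name in unmet}          # reverse edges: dep -> items needing it
    for name, deps in unmet.items():
        for dep in deps:
            dependents[dep].append(name)  # a KeyError here means a missing dependency

    ready = deque(name for name, deps in unmet.items() if not deps)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for child in dependents[name]:
            unmet[child].discard(name)
            if not unmet[child]:
                ready.append(child)

    if len(order) != len(unmet):
        raise ValueError("cyclic dependency detected")
    return order


# Hypothetical sample data.
items = [
    ("txn", {"device", "business"}),
    ("device", {"business"}),
    ("business", set()),
]
result = kahn_sort(items)
print(result)
# prints ['business', 'device', 'txn']
```

For small inputs such as a handful of tables, the simpler multi-pass generator is likely fast enough.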
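Answer 2 (score: 5)
Looking past the bad formatting and this strange Set type... (I've kept the dependencies as tuples and delimited the list items correctly...) ... and using the networkx library to keep things convenient...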
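x = [
    ('business', ('fleet', 'address')),
    ('device', ('business', 'model', 'status', 'pack')),
    ('txn', ('device', 'business', 'operator')),
]
import networkx as nx
g = nx.DiGraph()
for key, vals in x:
    for val in vals:
        g.add_edge(key, val)  # edge points from an item to each thing it depends on
# list() copes with networkx versions that return a generator here;
# items come before their dependencies, so reverse the list for dependencies-first order
print(list(nx.topological_sort(g)))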
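If adding a third-party library is not an option, Python 3.9+ ships `graphlib.TopologicalSorter` in the standard library. This is an alternative to the networkx approach, not part of the answer itself. It takes a mapping from each node to its predecessors, so passing each item's dependencies yields them before the item.

```python
from graphlib import TopologicalSorter

x = [
    ('business', ('fleet', 'address')),
    ('device', ('business', 'model', 'status', 'pack')),
    ('txn', ('device', 'business', 'operator')),
]

# Map each name to the names it depends on (its predecessors).
graph = {key: set(vals) for key, vals in x}

# static_order() yields every node after all of its predecessors;
# it raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The exact order among unrelated names may vary, but each name always appears after everything it depends on.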
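Answer 3 (score: 0)
This is Winston's suggestion, with a docstring and one tiny tweak: dependencies.issubset(provided) is reversed into provided.issuperset(dependencies). That change lets you pass the dependencies in each input pair as an arbitrary iterable rather than necessarily a set.
My use case involves a dict whose keys are the item strings, with the value for each key being a list of the item names that key depends on. Once I've established that the dict is non-empty, I can pass its iteritems() to the modified algorithm.
Thanks again to Winston.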
class TopologicalSortFailure(Exception):
    """Assumed definition: any Exception subclass will do."""

def topological_sort(items):
    """
    'items' is an iterable of (item, dependencies) pairs, where 'dependencies'
    is an iterable of values of the same type as 'item'.

    If 'items' is a generator rather than a data structure, it should not be
    empty. Passing an empty generator for 'items' (zero yields before return)
    will cause topological_sort() to raise TopologicalSortFailure.

    An empty iterable (e.g. list, tuple, set, ...) produces no items but
    raises no exception.
    """
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if provided.issuperset(dependencies):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append((item, dependencies))
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items
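A usage sketch for the dict-based case described in this answer, assuming Python 3 (where `dict.items()` replaces `iteritems()`). `TopologicalSortFailure` is declared here as a plain `Exception` subclass, which is an assumption, and the sample `deps` dict is made up.

```python
class TopologicalSortFailure(Exception):
    """Assumed definition; any Exception subclass would work."""


def topological_sort(items):
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            # issuperset() accepts any iterable, so lists work as dependencies
            if provided.issuperset(dependencies):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append((item, dependencies))
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items


# Hypothetical input: each key maps to a list of the names it depends on.
deps = {
    "txn": ["device", "business"],
    "device": ["business"],
    "business": [],
}

if deps:  # guard against the empty case before handing over the items
    result = list(topological_sort(deps.items()))
    print(result)
    # prints ['business', 'device', 'txn']
```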