Sorting a list of lists logically (partially ordered set -> topological sort)

Time: 2018-07-19 18:09:22

Tags: python

Edit: The accepted answer works for sets that satisfy the requirements of a strict partially ordered set, so that a directed acyclic graph can be constructed:

  • Irreflexivity (not a < a): the list does not contain items like ['a','a']
  • Transitivity (if a < b and b < c then a < c): the list does not contain items like ['a','b'],['b','c'],['c','a']
  • Asymmetry (if a < b then not b < a): the list does not contain items like ['a','b'],['b','a']
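
The three requirements above boil down to one check: if every pair of neighbours in a sublist is read as an "earlier -> later" edge, the resulting graph must have no cycle. Here is a rough sketch of such a check using only the standard library. It is not part of the original post, and the names consecutive_pairs and is_consistent are made up for illustration.

```python
from itertools import chain


def consecutive_pairs(sublist):
    """Yield (earlier, later) for each pair of neighbours in a sublist."""
    return zip(sublist, sublist[1:])


def is_consistent(sublists):
    """Return True if the 'comes before' constraints contain no cycle.

    ['a', 'a'] is a self-loop, ['a', 'b'] plus ['b', 'a'] is a two-step
    loop, and ['a', 'b'], ['b', 'c'], ['c', 'a'] is a longer loop; each of
    the three forbidden cases shows up as a cycle in the constraint graph.
    """
    # Hypothetical helper structure: node -> set of nodes that must come later.
    successors = {}
    for before, after in chain.from_iterable(map(consecutive_pairs, sublists)):
        successors.setdefault(before, set()).add(after)
        successors.setdefault(after, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = dict.fromkeys(successors, WHITE)

    def has_cycle_from(node):
        colour[node] = GREY
        for nxt in successors[node]:
            if colour[nxt] == GREY:
                return True  # back edge: nxt is already on the current path
            if colour[nxt] == WHITE and has_cycle_from(nxt):
                return True
        colour[node] = BLACK
        return False

    return not any(colour[n] == WHITE and has_cycle_from(n) for n in successors)
```

The depth-first search is recursive, so this is only meant for small inputs like the ones in this question.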

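Take this list of lists:
[['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']]
and flatten it into a single list, sorted according to each value's neighbors: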

  • the first sublist tells you that b comes before c
  • then a before c
  • b before a
  • and finally d after c

The overall ordering between sublists is consistent, meaning that there won't be sublists like these: ['b','c'],['c','b']. So the result should be: ['b', 'a', 'c', 'd']
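
To make the reading of the sublists concrete, here is a small sketch (not from the original post) that spells out the "comes before" constraints implied by neighbouring items. The function name before_constraints is hypothetical.

```python
def before_constraints(sublists):
    """Collect the distinct (earlier, later) pairs implied by neighbouring items."""
    constraints = []
    for sublist in sublists:
        for earlier, later in zip(sublist, sublist[1:]):
            if (earlier, later) not in constraints:  # keep first occurrence only
                constraints.append((earlier, later))
    return constraints


data = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']]
for earlier, later in before_constraints(data):
    print(f"{earlier} comes before {later}")
# b comes before c
# a comes before c
# b comes before a
# c comes before d
```

Chaining "b before a", "a before c" and "c before d" leaves ['b', 'a', 'c', 'd'] as the only order that satisfies all four constraints.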

After a (long) while I came up with this ugly mess:

[code not preserved in this copy]

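It looks like it does exactly what's expected, but it is far from efficient (or elegant, for that matter).
Is there an algorithm that can sort like this?
Or are there some pythonic tricks that would make it more efficient?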


2 answers:

Answer 0 (score: 4)

You can create a lookup function that determines whether a particular value should be placed before or after another value:

d = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']]
flattened = {i for b in d for i in b}
def _lookup(a, b):
  _loc = [i for i in d if a in i and b in i]
  return True if not _loc else _loc[0].index(a) < _loc[0].index(b)

class T: 
  def __init__(self, _val):
    self.v = _val
  def __lt__(self, _n):
    return _lookup(self.v, _n.v)

final_result = [i.v for i in sorted(map(T, flattened))]

Output:

['b', 'a', 'c', 'd']

With [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ['a', 'e']]:

['b', 'a', 'c', 'e', 'd']

Answer 1 (score: 1)

The existing answers by nosklo and Ajax1234 both fail on the input [[1, 3], [3, 5], [5, 2], [2, 4]]. The attempt in your question fails on the input [[1, 4], [2, 3], [3, 4], [1, 2]].
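
One way to see why a pairwise comparison like the _lookup function above can go wrong on the chain input: two values that never share a sublist are reported as "less than" each other in both directions, so sorted() is not given a consistent ordering. The sketch below restates that rule with the data passed in as a parameter. The name lookup and its signature are my own, not the other answer's.

```python
def lookup(sublists, a, b):
    """Same rule as _lookup above, with the data passed in explicitly."""
    shared = [s for s in sublists if a in s and b in s]
    return True if not shared else shared[0].index(a) < shared[0].index(b)


chain_input = [[1, 3], [3, 5], [5, 2], [2, 4]]

# 1 and 4 never appear in the same sublist, so the rule says "yes" both ways,
# even though 1 < 3 < 5 < 2 < 4 only allows 1 before 4.
print(lookup(chain_input, 1, 4))  # True
print(lookup(chain_input, 4, 1))  # True
```

With a comparison like this, the result of sorted() depends on the order in which elements happen to be compared, so it can be right on some inputs and wrong on others.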

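The correct approach is as BowlingHawk95 described: perform a topological sort on the directed acyclic graph induced by your input lists.

We could implement our own topological sort, but it's safer to let an existing graph library handle it. For example, NetworkX: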

from itertools import chain, tee

import networkx
import networkx.algorithms

# pairwise recipe from the itertools docs.
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def merge_ordering(sublists):
    # Make an iterator of graph edges for the new graph. Some edges may be repeated.
    # That's fine. NetworkX will ignore duplicates.
    edges = chain.from_iterable(map(pairwise, sublists))

    graph = networkx.DiGraph(edges)
    return list(networkx.algorithms.topological_sort(graph))

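This produces correct output for the input in the question, for the [[1, 3], [3, 5], [5, 2], [2, 4]] case the other answers fail on, and for the [[1, 4], [2, 3], [3, 4], [1, 2]] case your attempt fails on: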

>>> merge_ordering([[1, 3], [3, 5], [5, 2], [2, 4]])
[1, 3, 5, 2, 4]
>>> merge_ordering([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
['b', 'a', 'c', 'd']
>>> merge_ordering([[1, 4], [2, 3], [3, 4], [1, 2]])
[1, 2, 3, 4]
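
If pulling in NetworkX is not an option, a hand-rolled version is not much code. Below is a minimal sketch using Kahn's algorithm and only the standard library. It is an illustration of the idea rather than a drop-in replacement, and merge_ordering_kahn is a made-up name.

```python
from collections import deque


def merge_ordering_kahn(sublists):
    """Topologically sort the items of the sublists with Kahn's algorithm."""
    successors = {}  # node -> set of nodes that must come later
    indegree = {}    # node -> number of distinct nodes that must come earlier
    for sublist in sublists:
        for node in sublist:
            successors.setdefault(node, set())
            indegree.setdefault(node, 0)
        for earlier, later in zip(sublist, sublist[1:]):
            if later not in successors[earlier]:  # count duplicate edges once
                successors[earlier].add(later)
                indegree[later] += 1

    # Start with every node that has nothing in front of it.
    ready = deque(node for node, deg in indegree.items() if deg == 0)
    merged = []
    while ready:
        node = ready.popleft()
        merged.append(node)
        for later in successors[node]:
            indegree[later] -= 1
            if indegree[later] == 0:
                ready.append(later)

    # Nodes on a cycle never reach in-degree 0, so they are never emitted.
    if len(merged) != len(indegree):
        raise ValueError('Input contains a cycle.')
    return merged
```

On the three inputs above this gives the same results as merge_ordering. When several orderings are valid, it returns one of them, and which one may differ from what NetworkX picks.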

We can also write a version that raises an error if the input lists don't uniquely determine the flattened form:

def merge_ordering_unique(sublists):
    # Make an iterator of graph edges for the new graph. Some edges may be repeated.
    # That's fine. NetworkX will ignore duplicates.
    edges = chain.from_iterable(map(pairwise, sublists))

    graph = networkx.DiGraph(edges)
    merged = list(networkx.algorithms.topological_sort(graph))

    for a, b in pairwise(merged):
        if not graph.has_edge(a, b):
            raise ValueError('Input has multiple possible topological orderings.')

    return merged

Demo:

>>> merge_ordering_unique([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
['b', 'a', 'c', 'd']
>>> merge_ordering_unique([[1, 3, 4], [1, 2, 4]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 11, in merge_ordering_unique
ValueError: Input has multiple possible topological orderings.
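
To see what "multiple possible orderings" means for that last input, here is a brute-force sketch that lists every ordering consistent with the sublists. It tries all permutations, so it is only usable on tiny inputs, and it assumes the items can be sorted. The name all_orderings is hypothetical.

```python
from itertools import permutations


def all_orderings(sublists):
    """Brute-force every ordering consistent with the sublists (tiny inputs only)."""
    items = sorted({item for sublist in sublists for item in sublist})
    constraints = {pair for sublist in sublists for pair in zip(sublist, sublist[1:])}
    valid = []
    for candidate in permutations(items):
        position = {item: i for i, item in enumerate(candidate)}
        # Keep the candidate only if every "earlier" item really comes first.
        if all(position[a] < position[b] for a, b in constraints):
            valid.append(list(candidate))
    return valid


print(all_orderings([[1, 3, 4], [1, 2, 4]]))
# [[1, 2, 3, 4], [1, 3, 2, 4]]
```

Nothing in the input says whether 2 or 3 comes first, which is why merge_ordering_unique refuses it. For the question's input the same function returns a single ordering.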