Topological sorting using LINQ

Time: 2009-12-30 21:30:03

Tags: .net linq sorting topological-sort partial-ordering

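I have a list of items that have a partial order relation, i.e., the list can be considered a partially ordered set. I want to sort this list in the same way as in this question. As correctly answered there, this is known as topological sorting.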

There is a reasonably simple known algorithm to solve this problem. I want a LINQ-like implementation of it.

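I have already tried the OrderBy extension method, but I am quite sure it cannot do topological sorting. The problem is that the IComparer<TKey> interface cannot represent a partial order. This is because the Compare method can basically return three kinds of values: zero, negative, and positive, meaning are-equal, is-less-than, and is-greater-than, respectively. A working solution would only be possible if there were a way to return are-unrelated.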
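
To make the missing fourth outcome concrete, here is a minimal sketch of a comparison that can also report "unrelated". The divisibility relation, the class name DivisibilityOrder, and the use of a nullable int are assumptions chosen for illustration; nothing like this is provided by the framework, and the sketch assumes positive integers only.

```csharp
using System;

public static class DivisibilityOrder
{
    // Hypothetical partial order on positive integers:
    // x precedes y when x divides y.
    // Returns null when neither number divides the other.
    public static int? Compare(int x, int y)
    {
        if (x == y) return 0;        // are-equal
        if (y % x == 0) return -1;   // x divides y: is-less-than
        if (x % y == 0) return 1;    // y divides x: is-greater-than
        return null;                 // are-unrelated: a plain int cannot express this
    }
}
```

With this sketch, DivisibilityOrder.Compare(2, 6) is -1 and DivisibilityOrder.Compare(4, 6) is null, which is exactly the case IComparer<TKey> has no return value for.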

From my biased point of view, the answer I am looking for might consist of an IPartialOrderComparer<T> interface and an extension method like this:

public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IPartialOrderComparer<TKey> comparer
);

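How would this be implemented? What would the IPartialOrderComparer<T> interface look like? Would you recommend a different approach? I am eager to see it. Maybe there is a nicer way to represent the partial order that I don't know of.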

6 answers:

Answer 0 (score: 15)

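I would suggest using the same IComparer interface, but writing the extension method so that it interprets 0 as not related. In a partial ordering, if elements a and b are equal, their order doesn't matter, and the same holds if they are unrelated. You only have to order them with respect to the elements with which they have a defined relationship.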

Here is an example that does a partial ordering of even and odd integers:

using System;
using System.Collections.Generic;

namespace PartialOrdering
{
    public static class Enumerable
    {
        public static IEnumerable<TSource> PartialOrderBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IComparer<TKey> comparer)
        {
            List<TSource> list = new List<TSource>(source);
            while (list.Count > 0)
            {
                TSource minimum = default(TSource);
                TKey minimumKey = default(TKey);
                foreach (TSource s in list)
                {
                    TKey k = keySelector(s);
                    minimum = s;
                    minimumKey = k;
                    break;
                }
                foreach (TSource s in list)
                {
                    TKey k = keySelector(s);
                    if (comparer.Compare(k, minimumKey) < 0)
                    {
                        minimum = s;
                        minimumKey = k;
                    }
                }
                yield return minimum;
                list.Remove(minimum);
            }
            yield break;
        }

    }
    public class EvenOddPartialOrdering : IComparer<int>
    {
        public int Compare(int a, int b)
        {
            if (a % 2 != b % 2)
                return 0;
            else if (a < b)
                return -1;
            else if (a > b)
                return 1;
            else return 0; //equal
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            IEnumerable<Int32> integers = new List<int> { 8, 4, 5, 7, 10, 3 };
            integers = integers.PartialOrderBy<Int32, Int32>(new Func<Int32, Int32>(delegate(int i) { return i; }), new EvenOddPartialOrdering());
        }
    }
}

Result: 4, 8, 3, 5, 7, 10

Answer 1 (score: 8)

This is my optimized and refurbished version of tehMick's answer.

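One change I made was replacing the real list of values left to yield with a logical list. For this, I keep two arrays of the same size: one holds all the values, and the other holds markers telling whether each value has already been yielded. This way, I avoid the cost of having to resize a real List<Key>.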

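Another change is that I read all the keys only once, at the start of the iteration. For reasons I cannot recall now (maybe it was just my instinct), I don't like the idea of calling the keySelector function several times.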

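The last touches were parameter validation and an extra overload that uses an implicit key comparer. I hope the code is readable enough. Check it out.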

public static IEnumerable<TSource> PartialOrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector)
{
    return PartialOrderBy(source, keySelector, null);
}

public static IEnumerable<TSource> PartialOrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer)
{
    if (source == null) throw new ArgumentNullException("source");
    if (keySelector == null) throw new ArgumentNullException("keySelector");
    if (comparer == null) comparer = (IComparer<TKey>)Comparer<TKey>.Default;

    return PartialOrderByIterator(source, keySelector, comparer);
}

private static IEnumerable<TSource> PartialOrderByIterator<TSource, TKey>(
    IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer)
{
    var values = source.ToArray();
    var keys = values.Select(keySelector).ToArray();
    int count = values.Length;
    var notYieldedIndexes = System.Linq.Enumerable.Range(0, count).ToArray();
    int valuesToGo = count;

    while (valuesToGo > 0)
    {
        //Start with first value not yielded yet
        int minIndex = notYieldedIndexes.First( i => i >= 0);

        //Find minimum value amongst the values not yielded yet
        for (int i = 0; i < count; i++)
        {
            if (notYieldedIndexes[i] >= 0 &&
                comparer.Compare(keys[i], keys[minIndex]) < 0)
            {
                minIndex = i;
            }
        }

        //Yield minimum value and mark it as yielded
        yield return values[minIndex];
        notYieldedIndexes[minIndex] = -1;
        valuesToGo--;
    }
}

Answer 2 (score: 2)

Well, I'm not sure that this way of handling it is the best way, but I could be wrong.

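The typical way to handle topological sorting is to use a graph: on each iteration, remove all nodes that have no inbound connection and, at the same time, remove all connections outbound from those nodes. The removed nodes are the output of that iteration. Repeat until no more nodes can be removed.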
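
The iteration described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name KahnSketch, the method name SortInLayers, and the representation of connections as (From, To) pairs are all hypothetical, and every connection is assumed to refer to nodes that are in the node list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KahnSketch
{
    // nodes: all items; edges: (From, To) pairs meaning "From must come before To".
    // Returns the removed nodes iteration by iteration.
    public static List<List<T>> SortInLayers<T>(
        IEnumerable<T> nodes, IEnumerable<(T From, T To)> edges)
    {
        var remaining = new HashSet<T>(nodes);
        var remainingEdges = edges.ToList();
        var layers = new List<List<T>>();

        while (remaining.Count > 0)
        {
            // Nodes that still have an inbound connection.
            var targets = new HashSet<T>(remainingEdges.Select(e => e.To));

            // Everything else has no inbound connection and can be removed now.
            var layer = remaining.Where(n => !targets.Contains(n)).ToList();
            if (layer.Count == 0)
                throw new InvalidOperationException("The graph contains a cycle.");

            // Remove the nodes and every connection outbound from them.
            var removed = new HashSet<T>(layer);
            foreach (var n in layer) remaining.Remove(n);
            remainingEdges.RemoveAll(e => removed.Contains(e.From));

            layers.Add(layer);
        }
        return layers;
    }
}
```

For the connections a→b, a→c, b→d and c→d, the first iteration outputs a, the second outputs b and c (in no particular order), and the third outputs d.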

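However, in order to get those connections in the first place with your method, you would need:

  1. A method (your comparer) that can say "this one comes before that one" but also "no information for these two"
  2. To iterate over all combinations, creating a connection for every combination for which the comparer returns an ordering.

In other words, the method would probably be defined like this:

    public Int32? Compare(TKey a, TKey b) { ... }

and then return null when it has no conclusive answer for the two keys.

The problem I'm thinking about is the "iterate all combinations" part. Maybe there is a better way to handle this, but I'm not seeing it.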
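
As a rough sketch of what "iterate all combinations" could look like, the following builds the connections from a nullable comparison. The names EdgeBuilder and BuildEdges, the Func<T, T, int?> delegate shape, and the choice to report connections as index pairs are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EdgeBuilder
{
    // compare returns a negative value when a comes before b, a positive value
    // when b comes before a, 0 when they are equal, and null when it has
    // no conclusive answer for the two items.
    public static List<(int From, int To)> BuildEdges<T>(
        IReadOnlyList<T> items, Func<T, T, int?> compare)
    {
        var edges = new List<(int From, int To)>();
        for (int i = 0; i < items.Count; i++)
        {
            for (int j = i + 1; j < items.Count; j++)
            {
                int? c = compare(items[i], items[j]);
                if (c < 0) edges.Add((i, j));       // items[i] before items[j]
                else if (c > 0) edges.Add((j, i));  // items[j] before items[i]
                // null or 0: no connection is created
            }
        }
        return edges;
    }
}
```

Each unordered pair is compared once, so n items need n(n-1)/2 comparisons, which is the quadratic cost the answer is worried about.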

Answer 3 (score: 1)

I believe that Lasse V. Karlsen's answer is on the right track, but I don't like hiding the Compare method (or, for that matter, a separate interface that doesn't extend IComparer<T>).

Instead, I'd rather see something like this:

public interface IPartialOrderComparer<T> : IComparer<T>
{
    int? InstancesAreComparable(T x, T y);
}

This way, you still have an implementation of IComparer<T> that can be used in other places that require IComparer<T>.

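However, it also requires you to indicate how the instances of T relate to each other through the return value, in the following way (similar to IComparable<T>):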

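  • null - the instances are not comparable to each other.
  • 0 - the instances are comparable to each other.
  • > 0 - x is a comparable key, but y is not.
  • < 0 - y is a comparable key, but x is not.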

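Of course, you won't get a partial ordering when passing an implementation of this to anything that expects an IComparer<T> (and it should be noted that Lasse V. Karlsen's answer doesn't solve this either), because what you need can't be represented by a simple Compare method that takes two instances of T and returns an int.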

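To complete the solution, you have to provide custom OrderBy (as well as ThenBy, OrderByDescending and ThenByDescending) extension methods that accept the new comparer parameter (as you have already indicated). The implementation would look something like this: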

public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IPartialOrderComparer<TKey> comparer)
{
    return Enumerable.OrderBy(source, keySelector,
        new PartialOrderComparer<TKey>(comparer));
}

internal class PartialOrderComparer<T> : IComparer<T>
{
    internal PartialOrderComparer(IPartialOrderComparer<T> 
        partialOrderComparer)
    {
        this.partialOrderComparer = partialOrderComparer;
    }

    private readonly IPartialOrderComparer<T> partialOrderComparer;

    public int Compare(T x, T y)
    {
        // See if the items are comparable.
        int? comparable = partialOrderComparer.
            InstancesAreComparable(x, y);

        // If they are not comparable (null), then return
        // 0, they are equal and it doesn't matter
        // what order they are returned in.
        // Remember that this only to determine the
        // values in relation to each other, so it's
        // ok to say they are equal.
        if (comparable == null) return 0;

        // If the value is 0, they are comparable, return
        // the result of that.
        if (comparable.Value == 0) return partialOrderComparer.Compare(x, y);

        // One or the other is uncomparable.
        // Return the negative of the value.
        // If comparable is negative, then y is comparable
        // and x is not.  Therefore, x should be greater than y (think
        // of it in terms of coming later in the list after
        // the ordered elements).
        return -comparable.Value;            
    }
}

Answer 4 (score: 1)

Define an interface for the partial order relation:

interface IPartialComparer<T> {
    int? Compare(T x, T y);
}

Compare should return -1 if x < y, 0 if x = y, 1 if y < x, and null if x and y are not comparable.

Our goal is to return an ordering of the elements in the partial order that respects the enumeration. That is, we seek a sequence e_1, e_2, e_3, ..., e_n of the elements in the partial order such that if i <= j and e_i is comparable to e_j, then e_i <= e_j. I'll do this using a depth-first search.

Class that implements topological sort using depth-first search:

class TopologicalSorter {
    class DepthFirstSearch<TElement, TKey> {
        readonly IEnumerable<TElement> _elements;
        readonly Func<TElement, TKey> _selector;
        readonly IPartialComparer<TKey> _comparer;
        HashSet<TElement> _visited;
        Dictionary<TElement, TKey> _keys;
        List<TElement> _sorted;

        public DepthFirstSearch(
            IEnumerable<TElement> elements,
            Func<TElement, TKey> selector,
            IPartialComparer<TKey> comparer
        ) {
            _elements = elements;
            _selector = selector;
            _comparer = comparer;
            var referenceComparer = new ReferenceEqualityComparer<TElement>();
            _visited = new HashSet<TElement>(referenceComparer);
            _keys = elements.ToDictionary(
                e => e,
                e => _selector(e),
                referenceComparer
            );
            _sorted = new List<TElement>();
        }

        public IEnumerable<TElement> VisitAll() {
            foreach (var element in _elements) {
                Visit(element);
            }
            return _sorted;
        }

        void Visit(TElement element) {
            if (!_visited.Contains(element)) {
                _visited.Add(element);
                var predecessors = _elements.Where(
                    e => _comparer.Compare(_keys[e], _keys[element]) < 0
                );
                foreach (var e in predecessors) {
                    Visit(e);
                }
                _sorted.Add(element);
            }
        }
    }

    public IEnumerable<TElement> ToplogicalSort<TElement, TKey>(
        IEnumerable<TElement> elements,
        Func<TElement, TKey> selector,
        IPartialComparer<TKey> comparer
    ) {
        var search = new DepthFirstSearch<TElement, TKey>(
            elements,
            selector,
            comparer
        );
        return search.VisitAll();
    }
}

Helper class needed to mark nodes as visited while doing the depth-first search:

class ReferenceEqualityComparer<T> : IEqualityComparer<T> {
    public bool Equals(T x, T y) {
        return Object.ReferenceEquals(x, y);
    }

    public int GetHashCode(T obj) {
        return obj.GetHashCode();
    }
}

I make no claim that this is the best implementation of the algorithm, but I believe that it is a correct implementation. Further, I did not return an IOrderedEnumerable as you requested, but that is easy to do once we are at this point.

The algorithm works by doing a depth-first search through the elements, adding an element e to the linear ordering (represented by _sorted in the algorithm) once all the predecessors of e have already been added to the ordering. Hence, for each element e, if we haven't already visited it, visit its predecessors and then add e. Thus, this is the core of the algorithm:

public void Visit(TElement element) {
    // if we haven't already visited the element
    if (!_visited.Contains(element)) {
        // mark it as visited
        _visited.Add(element);
        var predecessors = _elements.Where(
            e => _comparer.Compare(_keys[e], _keys[element]) < 0
        );
        // visit its predecessors
        foreach (var e in predecessors) {
            Visit(e);
        }
        // add it to the ordering
        // at this point we are certain that
        // its predecessors are already in the ordering
        _sorted.Add(element);
    }
}

As an example, consider the partial ordering defined on subsets of {1, 2, 3} where X < Y if X is a subset of Y. I implement this as follows:

public class SetComparer : IPartialComparer<HashSet<int>> {
    public int? Compare(HashSet<int> x, HashSet<int> y) {
        bool xSubsety = x.All(i => y.Contains(i));
        bool ySubsetx = y.All(i => x.Contains(i));
        if (xSubsety) {
            if (ySubsetx) {
                return 0;
            }
            return -1;
        }
        if (ySubsetx) {
            return 1;
        }
        return null;
    }
}

Then define sets as the list of subsets of {1, 2, 3}:

List<HashSet<int>> sets = new List<HashSet<int>>() {
    new HashSet<int>(new List<int>() {}),
    new HashSet<int>(new List<int>() { 1, 2, 3 }),
    new HashSet<int>(new List<int>() { 2 }),
    new HashSet<int>(new List<int>() { 2, 3}),
    new HashSet<int>(new List<int>() { 3 }),
    new HashSet<int>(new List<int>() { 1, 3 }),
    new HashSet<int>(new List<int>() { 1, 2 }),
    new HashSet<int>(new List<int>() { 1 })
};
TopologicalSorter s = new TopologicalSorter();
var sorted = s.ToplogicalSort(sets, set => set, new SetComparer());

This results in the ordering:

{}, {2}, {3}, {2, 3}, {1}, {1, 3}, {1, 2}, {1, 2, 3}

which respects the partial order.

That was a lot of fun. Thanks.
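
The ordering condition stated in this answer (if i <= j and e_i is comparable to e_j, then e_i <= e_j) can be checked mechanically. The following is a small sketch of such a check, not part of the answer's code; the names PartialOrderCheck and RespectsPartialOrder are hypothetical, and the comparison is passed as a Func<T, T, int?> so the block stands on its own.

```csharp
using System;
using System.Collections.Generic;

public static class PartialOrderCheck
{
    // compare follows the convention of this answer:
    // negative, zero, positive, or null for "not comparable".
    // Returns true when no earlier element is strictly greater than a later one.
    public static bool RespectsPartialOrder<T>(
        IReadOnlyList<T> sequence, Func<T, T, int?> compare)
    {
        for (int i = 0; i < sequence.Count; i++)
        {
            for (int j = i + 1; j < sequence.Count; j++)
            {
                // A positive result means e_i > e_j although i < j: a violation.
                // A null result compares as false here, so unrelated pairs pass.
                if (compare(sequence[i], sequence[j]) > 0)
                    return false;
            }
        }
        return true;
    }
}
```

A comparer object with a matching Compare method, such as the SetComparer in this answer, could presumably be passed as a method group, for example RespectsPartialOrder(sorted.ToList(), new SetComparer().Compare).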

Answer 5 (score: 0)

Thanks very much to everybody. Starting from Eric Mickelsen's answer, I came up with my own version, since I prefer to use null values to indicate no relation, as Lasse V. Karlsen said.

public static IEnumerable<TSource> PartialOrderBy<TSource>(
        this IEnumerable<TSource> source,
        IPartialEqualityComparer<TSource> comparer)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (comparer == null) throw new ArgumentNullException("comparer");

        var set = new HashSet<TSource>(source);
        while (set.Count > 0)
        {
            TSource minimum = set.First();

            foreach (TSource s in set)
            {
                // null means s and minimum are unrelated, so skip the pair
                var comparison = comparer.Compare(s, minimum);
                if (!comparison.HasValue) continue;
                if (comparison.Value <= 0)
                {
                    minimum = s;
                }
            }
            yield return minimum;
            set.Remove(minimum);
        }
    }

public static IEnumerable<TSource> PartialOrderBy<TSource>(
       this IEnumerable<TSource> source,
       Func<TSource, TSource, int?> comparer)
    {
        return PartialOrderBy(source, new PartialEqualityComparer<TSource>(comparer));
    }

Then I have the following interface for the comparer

public interface IPartialEqualityComparer<T>
{
    int? Compare(T x, T y);
}

and this helper class

internal class PartialEqualityComparer<TSource> : IPartialEqualityComparer<TSource>
{
    private Func<TSource, TSource, int?> comparer;

    public PartialEqualityComparer(Func<TSource, TSource, int?> comparer)
    {
        this.comparer = comparer;
    }

    public int? Compare(TSource x, TSource y)
    {
        return comparer(x,y);
    }
}

which allows the usage to be tidied up a bit in my tests.
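
The author's test code is not shown here. As a rough idea of the kind of call a delegate-based overload allows, here is a self-contained sketch. It uses a compact stand-in for PartialOrderBy built on a List rather than the HashSet version above, and the class name PartialOrderUsageSketch and the EvenOdd relation (borrowed from the even/odd example in Answer 0) are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartialOrderUsageSketch
{
    // Compact stand-in for a delegate-based PartialOrderBy:
    // repeatedly picks an element that no remaining element precedes.
    public static IEnumerable<T> PartialOrderBy<T>(
        this IEnumerable<T> source, Func<T, T, int?> compare)
    {
        var remaining = source.ToList();
        while (remaining.Count > 0)
        {
            T minimum = remaining[0];
            foreach (T candidate in remaining)
            {
                int? c = compare(candidate, minimum);
                // null means "no relation": the pair is simply skipped
                if (c.HasValue && c.Value < 0) minimum = candidate;
            }
            yield return minimum;
            remaining.Remove(minimum);
        }
    }

    // Hypothetical test relation: even and odd numbers are unrelated to each other.
    public static int? EvenOdd(int a, int b)
    {
        if (a % 2 != b % 2) return null;
        return a.CompareTo(b);
    }
}
```

A call could then be written with a lambda, for example new[] { 8, 4, 5, 7, 10, 3 }.PartialOrderBy((a, b) => PartialOrderUsageSketch.EvenOdd(a, b)), which yields 4, 8, 3, 5, 7, 10 for this stand-in.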