I use this to randomly shuffle a List. Can I do the same with an Enumerable?

Asked: 2012-06-11 09:24:11

Tags: c# list random ienumerable

This is my code for randomly shuffling the first N elements of a List:

int upper = 1;
if (myList.Count > 1)
{
    Random r = new Random();
    upper = Math.Min(maxNumberOfPackets, myList.Count);
    for (int i = 0; i < upper; i++)
    {
        int randInd = r.Next(i, myList.Count);
        var temp = myList[i];
        myList[i] = myList[randInd];
        myList[randInd] = temp;
    }
}
Well, now I have the "need" to use only an Enumerable (for some reason, so I would rather not convert it).

As far as you know, can I do the same thing with an Enumerable? I think the pain point is accessing the element at position X...

I'm curious...

6 answers:

Answer 0: (score: 6)

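For IEnumerable[<T>] sources, you should generally assume they can only be read once, and of course you cannot reliably know the length until you have consumed the sequence. Randomizing it will usually require buffering. You might as well just do: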

var list = source.ToList();

and use your existing list-shuffling code. Of course, if you want to keep the input and output similar, you can return that list as an IEnumerable[<T>] again.
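The buffer-then-shuffle approach described above could look roughly like the following. This is only a sketch: the class name ShuffleSketch, the method name ShuffleFirst, and the rng parameter are hypothetical and not part of the original answer. The loop is the same partial Fisher-Yates pass as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShuffleSketch
{
    // Hypothetical helper: buffers the whole source, then shuffles the
    // first `count` positions with a partial Fisher-Yates pass.
    public static IEnumerable<T> ShuffleFirst<T>(
        this IEnumerable<T> source, int count, Random rng)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (rng == null) throw new ArgumentNullException(nameof(rng));

        List<T> list = source.ToList();            // reads the source exactly once
        int upper = Math.Min(count, list.Count);   // never shuffle past the end
        for (int i = 0; i < upper; i++)
        {
            int j = rng.Next(i, list.Count);       // pick from the not-yet-fixed tail
            T temp = list[i];
            list[i] = list[j];
            list[j] = temp;
        }
        return list;                               // List<T> is already an IEnumerable<T>
    }
}
```

Unlike an iterator method, this version does its work eagerly at the call, so the caller pays the buffering cost immediately rather than on first enumeration.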

If you really want to, you could scramble just the start:

static IEnumerable<T> ScrambleStart<T>(this IEnumerable<T> source,
        int numberToScramble)
{
  using(var iter = source.GetEnumerator()) {
    List<T> list = new List<T>();
    while(iter.MoveNext()) {
        list.Add(iter.Current);
        if(list.Count >= numberToScramble) break; // >= also stops when numberToScramble is 0
    }
    ScrambleList(list); // your existing code
    foreach(var item in list) yield return item;
    while(iter.MoveNext()) {
        yield return iter.Current;
    }
  }
}

Answer 1: (score: 3)

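IEnumerable represents a "forward-only cursor" that may return streamed data (for example, from a database). So being able to order an enumerable (whether randomly or not) means you need to cache it, simply because you need all the values in hand to determine the ordering. That said, you can use the Enumerable.OrderBy operator, but again, it will do the caching for you.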

var r = new Random();

IEnumerable sorted = myList.OrderBy(i => r.Next());

Answer 2: (score: 2)

You should use reservoir sampling.
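As an illustration, a minimal sketch of reservoir sampling (the classic "Algorithm R" variant) might look like this. The names ReservoirSketch, ReservoirSample, k, and rng are assumptions made for the example, not something the answer specifies. It reads the source once and keeps at most k items in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReservoirSketch
{
    // Hypothetical helper: returns up to k items chosen uniformly at random
    // from a source of unknown length, in a single pass.
    public static List<T> ReservoirSample<T>(
        this IEnumerable<T> source, int k, Random rng)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (rng == null) throw new ArgumentNullException(nameof(rng));
        if (k < 0) throw new ArgumentOutOfRangeException(nameof(k));

        var reservoir = new List<T>(k);
        int seen = 0;                        // number of items read so far
        foreach (T item in source)
        {
            seen++;
            if (reservoir.Count < k)
            {
                reservoir.Add(item);         // fill the reservoir first
            }
            else
            {
                int j = rng.Next(seen);      // uniform in [0, seen)
                if (j < k) reservoir[j] = item;  // replace with probability k/seen
            }
        }
        return reservoir;
    }
}
```

Note that this selects a random subset; the order of the items inside the reservoir is not itself a uniform shuffle, so if order matters the (small) result can be shuffled afterwards with the list code from the question.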

Answer 3: (score: 1)

You always need to consume the IEnumerable in some way to randomize it, so you can simply call .ToList() on it and then use your shuffling method.

Consider the fact that an IEnumerable can have an infinite number of items.
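To make the infinite case concrete, here is a small sketch. The Naturals generator and the ShuffledPrefix helper are hypothetical names invented for this example. The point is only that an endless source has to be bounded (here with Take) before anything buffers it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InfiniteSketch
{
    // Hypothetical endless source: 0, 1, 2, ... and it never completes.
    public static IEnumerable<int> Naturals()
    {
        int i = 0;
        while (true) yield return unchecked(i++);
    }

    // Bounds the source with Take(n) before buffering. Calling ToList() or
    // OrderBy(...) directly on Naturals() would keep reading until memory ran out.
    public static List<int> ShuffledPrefix(int n, Random rng)
    {
        return Naturals().Take(n).OrderBy(_ => rng.Next()).ToList();
    }
}
```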

Answer 4: (score: 1)

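The following implementation produces a (pseudo-)random sequence from the original input. Apart from Random itself being pseudo-random, the "pseudo" part comes from guessing the length of the sequence, which will of course be wrong in most cases. For sequences shorter than maxJump, the result will be random. If it turns out that the number of elements reaches half of maxJump or more, maxJump is increased to allow a higher degree of randomness.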

public static IEnumerable<T> Randomize<T>(this IEnumerable<T> source,int maxJump = 1000){
    var cache = new List<T>();
    var r = new Random();
    var enumerator = source.GetEnumerator();
    var totalCount = 1;
    while(enumerator.MoveNext()){
       var next = r.Next(0,maxJump);
       if(next < cache.Count){
          //the element we are searching for is in the cache
          yield return cache[next];
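          cache.RemoveAt(next);
          //re-cache the element just read, otherwise it would be dropped
          cache.Insert(r.Next(0,cache.Count),enumerator.Current);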
       } else {
         next = next - cache.Count;
         do{
          totalCount++;
          if(next == 0){
             //we've found the next element so yield
             yield return enumerator.Current;
          } else {
             //not the element we are looking for so cache
             cache.Insert(r.Next(0,cache.Count),enumerator.Current);
          }
          --next;
        }while(next > 0 && enumerator.MoveNext());
        //if we've reached half of the maxJump length
        //increase the max to allow for higher randomness
        if(2*totalCount >= maxJump){
            maxJump *= 2;
        }
      }
    }
    //Yield the elements in the cache
    //they are already randomized
    foreach(var elem in cache){
          yield return elem;
    }
}

Answer 5: (score: 0)

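This works, but it is very inefficient for large sequences from which you want to randomly select a relatively small number of elements (because it iterates over every element of the source sequence).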

/// <summary>Randomly selects items from a sequence.</summary>
/// <typeparam name="T">The type of the items in the sequence.</typeparam>
/// <param name="sequence">The sequence from which to randomly select items.</param>
/// <param name="count">The number of items to randomly select from the sequence.</param>
/// <param name="sequenceLength">The number of items in the sequence among which to randomly select.</param>
/// <param name="rng">The random number generator to use.</param>
/// <returns>A sequence of randomly selected items.</returns>
/// <remarks>This is an O(N) algorithm (N is the sequence length).</remarks>

public static IEnumerable<T> RandomlySelectedItems<T>(IEnumerable<T> sequence, int count, int sequenceLength, System.Random rng)
{
    if (sequence == null)
    {
        throw new ArgumentNullException("sequence");
    }

    if (count < 0 || count > sequenceLength)
    {
        throw new ArgumentOutOfRangeException("count", count, "count must be between 0 and sequenceLength");
    }

    if (rng == null)
    {
        throw new ArgumentNullException("rng");
    }

    int available = sequenceLength;
    int remaining = count;
    var iterator  = sequence.GetEnumerator();

    for (int current = 0; current < sequenceLength; ++current)
    {
        iterator.MoveNext();

        if (rng.NextDouble() < remaining/(double)available)
        {
            yield return iterator.Current;
            --remaining;
        }

        --available;
    }
}