Selecting rows from an IEnumerable based on a percentage

Asked: 2012-05-08 19:47:55

Tags: c#

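I currently have a Windows service that runs every 5 minutes. The code selects rows from a database for processing. There is a cap (a maximum number of rows that may be selected), so the number of rows selected can be anywhere from 0 to 100.

I want to run some processing on these rows based on a random percentage selection.

  • Task 1: 25%
  • Task 2: 50%
  • Task 3: 100%

For simplicity, assume the service selects 100 rows: 25 randomly chosen rows would then run task 1, 50 randomly chosen rows would run task 2, and all rows would run task 3.

The code I currently have looks like this: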

var rows = repository.GetRows(100);

foreach(var row in rows)
{
    task1.Run(row);
    task2.Run(row);
    task3.Run(row);
}

This runs all three tasks on every row. How can I select only the assigned percentage of rows for each task?

6 answers:

Answer 0 (score: 2)

Perhaps a bit crude...

var rows = repository.GetRows(100);

rows.OrderBy(_ => Guid.NewGuid()).Take(25).ToList().ForEach(m => task1.Run(m));
rows.OrderBy(_ => Guid.NewGuid()).Take(50).ToList().ForEach(m => task2.Run(m));
rows.ToList().ForEach(m => task3.Run(m));

Answer 1 (score: 2)

You can define a Shuffle() extension method that performs a Fisher-Yates-Durstenfeld shuffle (it runs in linear time, rather than the N log N time of OrderBy):

public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> input)
{
    var buffer = input.ToArray();
    //System.Random is OK for "everyday" randomness;
    //you should use RNGCryptoServiceProvider if you need
    //cryptographically-strong randomness
    var rng = new Random();

    //as the loop proceeds, the element to output will be randomly chosen
    //from the elements at index i or above, which will then be swapped with i;
    //the yield return gives us each shuffled value as it is chosen, and
    //allows the shuffling to happen "lazily".
    for (int i = 0; i < buffer.Length; i++)
    {
        int j = rng.Next(i, buffer.Length);
        yield return buffer[j];
        //if we cared about the elements in the buffer this would be a swap,
        //but we don't, so...    
        buffer[j] = buffer[i];
    }
}

//simple extension method to provide List.ForEach()-like functionality
//on any collection or IEnumerable.
public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
{
    foreach(var element in collection) action(element);
}

//Usage - pretty much the same as Raphael's, 
//but now you don't have to convert to a List to use ForEach:
rows.Shuffle().Take(25).ForEach(m => task1.Run(m));
rows.Shuffle().Take(50).ForEach(m => task2.Run(m));
rows.ForEach(m => task3.Run(m));
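
Since the service may return fewer than 100 rows, the hard-coded Take(25) and Take(50) only match the percentages when exactly 100 rows come back. Here is a minimal sketch of one way to derive the count from the percentage instead. TakePercent and SamplingExtensions are hypothetical names, not part of the question's code, and the sketch rounds down, which is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SamplingExtensions
{
    // One shared generator: on .NET Framework, Random instances created in
    // quick succession can receive the same time-based seed.
    private static readonly Random Rng = new Random();

    // Hypothetical helper: returns a uniformly random subset holding
    // floor(count * percent / 100) of the input elements.
    public static List<T> TakePercent<T>(this IEnumerable<T> source, int percent)
    {
        if (percent < 0 || percent > 100)
            throw new ArgumentOutOfRangeException(nameof(percent));

        T[] buffer = source.ToArray();
        int take = buffer.Length * percent / 100; // integer division rounds down

        // Partial Fisher-Yates: only the first `take` positions need shuffling.
        for (int i = 0; i < take; i++)
        {
            int j = Rng.Next(i, buffer.Length);
            T tmp = buffer[i];
            buffer[i] = buffer[j];
            buffer[j] = tmp;
        }

        return buffer.Take(take).ToList();
    }
}
```

Usage would look like rows.TakePercent(25).ForEach(m => task1.Run(m)); and so on. Note that System.Random is not thread-safe, so the shared instance assumes the service processes one batch at a time.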

Answer 2 (score: 1)

You can get a random subset with something like the following:

//assumes Run accepts a sequence of rows
task3.Run(rows);
task1.Run(rows.OrderBy(x => Guid.NewGuid()).Take(25));
task2.Run(rows.OrderBy(x => Guid.NewGuid()).Take(50));

Answer 3 (score: 1)

For this case you can use Knuth's selection sampling method (choosing m items out of n):

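var rows = repository.GetRows(100).ToArray();
//quotas scale with the number of rows actually returned (0-100)
int[] maxTake = new[] { rows.Length / 4, rows.Length / 2, rows.Length };
int remaining = rows.Length;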
Random rand = new Random();

for (int i = 0; i < rows.Length; i++)
{
    var num = rand.Next(remaining);
    if (num < maxTake[0])
    {
        task1.Run(rows[i]);
        maxTake[0]--;
    }
    if (num < maxTake[1])
    {
        task2.Run(rows[i]);
        maxTake[1]--;
    }
    if (num < maxTake[2])
    {
        task3.Run(rows[i]);
        maxTake[2]--;
    }
    remaining--;
}
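
The loop above handles all three quotas at once, which can obscure the core idea: item i is selected with probability (items still needed) / (items still remaining). Below is a minimal single-quota sketch of that idea. SelectionSampling and Sample are hypothetical names used only for illustration, and the caller is assumed to supply the Random instance.

```csharp
using System;
using System.Collections.Generic;

public static class SelectionSampling
{
    // Hypothetical helper: picks exactly `want` items in a single pass,
    // keeping them in their original order.
    public static List<T> Sample<T>(IReadOnlyList<T> items, int want, Random rng)
    {
        if (want < 0 || want > items.Count)
            throw new ArgumentOutOfRangeException(nameof(want));

        var picked = new List<T>(want);
        int needed = want;

        for (int i = 0; i < items.Count && needed > 0; i++)
        {
            int remaining = items.Count - i;

            // rng.Next(remaining) is uniform on [0, remaining), so this test
            // succeeds with probability needed / remaining. When needed equals
            // remaining it always succeeds, so the quota is always met.
            if (rng.Next(remaining) < needed)
            {
                picked.Add(items[i]);
                needed--;
            }
        }

        return picked;
    }
}
```

Calling Sample once per task gives independent subsets, whereas the shared num in the loop above makes the task 1 rows a subset of the task 2 rows. Which behavior is wanted depends on the requirements.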

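Answer 4 (score: 0)

You can use a Random instance to generate a random value between 0.0 and 1.0 for each row.

Roughly 25% of the rows will get a value below 0.25; roughly 50% of the rows will get a value below 0.5.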

var rows = repository.GetRows(100);

Random random = new Random();

task1.Run(rows.Where(_ => random.NextDouble() <= 0.25));
task2.Run(rows.Where(_ => random.NextDouble() <= 0.5));
task3.Run(rows);

If you want to guarantee that you get exactly 25% and 50% of the row collection (rounded down), use:

Random random = new Random();

var rows = repository.GetRows(100);
var rowsRandomized = rows.OrderBy(_ => random.NextDouble());

task1.Run(rowsRandomized.Take((int)(0.25 * rows.Count())));
task2.Run(rowsRandomized.Take((int)(0.5 * rows.Count())));
task3.Run(rowsRandomized);
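
One thing to be aware of: OrderBy is deferred, so rowsRandomized is re-sorted with fresh random keys every time it is enumerated, and each Take above sees a different order. If a single fixed order is preferred (so the 25% subset is the start of the 50% subset), the shuffle can be materialized once. A minimal sketch, where ShuffleOnce and ShuffleHelpers are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShuffleHelpers
{
    // Hypothetical helper: evaluates the random ordering exactly once and
    // stores the result, so later Take calls all see the same order.
    public static List<T> ShuffleOnce<T>(IEnumerable<T> rows, Random random)
    {
        return rows.OrderBy(_ => random.NextDouble()).ToList();
    }
}

public static class ShuffleOnceDemo
{
    public static void Main()
    {
        // Stand-in for the repository rows.
        List<int> rows = Enumerable.Range(0, 100).ToList();
        List<int> fixedOrder = ShuffleHelpers.ShuffleOnce(rows, new Random());

        List<int> quarter = fixedOrder.Take(rows.Count / 4).ToList();
        List<int> half = fixedOrder.Take(rows.Count / 2).ToList();

        // With one materialized order, the 25% subset is a prefix of the 50% subset.
        Console.WriteLine(quarter.SequenceEqual(half.Take(quarter.Count))); // True
    }
}
```

Whether nested or independent subsets are better is a design choice the question does not settle.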

Answer 5 (score: 0)

Get 25 unique random numbers:

Random rand = new Random();

//a HashSet rejects duplicates, and unlike a zero-filled int[]
//it does not treat index 0 as "already taken"
var task1nums = new HashSet<int>();
while (task1nums.Count < 25)
{
    task1nums.Add(rand.Next(100));
}

Get 50 unique random numbers:

//same approach: Add ignores duplicates, so loop until the set holds 50 values
var task2nums = new HashSet<int>();
while (task2nums.Count < 50)
{
    task2nums.Add(rand.Next(100));
}

So now you have 25 random numbers and 50 random numbers:

var rows = repository.GetRows(100);
int counter = 0;
foreach (var row in rows)
{
    if (task1nums.Contains(counter))
        task1.Run(row);
    if (task2nums.Contains(counter))
        task2.Run(row);

    task3.Run(row);

    counter++;
}