Most efficient way to select an item from a list given its probability

Time: 2015-08-21 13:57:36

Tags: c# linq lambda

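I am trying to solve a fairly simple problem: I have a list of items, and the probability of finding an item depends on the item itself (I imagine it is easier to find a shovel in a haystack than a needle).

I want a method that randomly returns one of the items while taking into account how likely each item is to be found.

So the items could be listed like this: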

A - 100
B - 50
C - 10

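The number indicates how easy the item is to find; a higher value means the item is easier to find.

Running the method below 10000 times resulted in these numbers of items found: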

A - 6249  (100 / 160 = 0.625)
B - 3139  (50  / 160 = 0.3125)
C - 612   (10  / 160 = 0.0625)

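This pretty much confirms that the code below works.

So now my question is: given that the list itself can contain thousands of items, how can this be improved? Right now, in the worst case, the method visits every item in the list at least once, i.e. O(n).

Can this be written as a LINQ / lambda statement so that SQL Server handles it, instead of pulling all the items into C#?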

public long GetRandomItem()
{
    var allItems = _db.AllItems
        .Where(x => x.CanBeFound == true)
        .OrderByDescending(x => x.Rarity)
        .Select(x => new
        {
            x.Id,       // id of item
            x.Rarity,   // rarity between 1 and 100
        }).ToList();

    int totalRarity = allItems.Sum(x => x.Rarity);
    var random = new Random(DateTime.Now.Millisecond);

    var randomNumber = random.NextDouble() * totalRarity;

    double totalSoFar = 0;
    long chosenId = -1;
    foreach (var i in allItems)
    {
        totalSoFar += i.Rarity;
        if (totalSoFar > randomNumber)
        {
            chosenId = i.Id;
            break; 
        }
    }

    return chosenId;
}

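-----Edit------

I rewrote the LINQ into a version that makes only two queries against the database and needs no loop. I am not entirely sure this is any better, since it forces SQL to do a lot more joins and data selection.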

public long GetRandomGamePiece()
{
    int totalRarity = _db.GamePieceTemplates.Sum(x => x.Rarity);
    var randomNumber = 1 + Math.Round(_Random.NextDouble() * (totalRarity - 1)); 

    var randomItem = _db.GamePieceTemplates
        .Where(x => x.CanBeFound == true)
        .OrderBy(x => x.Id)
        .Select((x) => new
        {
            x.Id,       // id of item
            x.Rarity,   // rarity between 1 and 100

            // +1 so that it doesn't overlap the previous item's range
            MinRarity = _db.GamePieceTemplates.Where(y => y.Id <= x.Id).Sum(y => y.Rarity) - x.Rarity + 1, 
            MaxRarity = _db.GamePieceTemplates.Where(y => y.Id <= x.Id).Sum(y => y.Rarity)
        })
        .Single(x => x.MinRarity <= randomNumber && x.MaxRarity >= randomNumber);

    return randomItem.Id;
}

This translates into the following T-SQL:

SELECT TOP (2) 
    [Project6].[Rarity] AS [Rarity], 
    [Project6].[Id] AS [Id], 
    [Project6].[C1] AS [C1], 
    [Project6].[C2] AS [C2]
    FROM ( SELECT 
        [Project5].[Id] AS [Id], 
        [Project5].[Rarity] AS [Rarity], 
        ([Project5].[C1] - [Project5].[Rarity]) + 1 AS [C1], 
        [Project5].[C2] AS [C2]
        FROM ( SELECT 
            [Project4].[Id] AS [Id], 
            [Project4].[Rarity] AS [Rarity], 
            [Project4].[C1] AS [C1], 
            (SELECT 
                SUM([Extent5].[Rarity]) AS [A1]
                FROM [dbo].[GamePieceTemplates] AS [Extent5]
                WHERE [Extent5].[Id] <= [Project4].[Id]) AS [C2]
            FROM ( SELECT 
                [Project3].[Id] AS [Id], 
                [Project3].[Rarity] AS [Rarity], 
                (SELECT 
                    SUM([Extent4].[Rarity]) AS [A1]
                    FROM [dbo].[GamePieceTemplates] AS [Extent4]
                    WHERE [Extent4].[Id] <= [Project3].[Id]) AS [C1]
                FROM ( SELECT 
                    [Project2].[Id] AS [Id], 
                    [Project2].[Rarity] AS [Rarity]
                    FROM ( SELECT 
                        [Project1].[Id] AS [Id], 
                        [Project1].[Rarity] AS [Rarity], 
                        [Project1].[C1] AS [C1], 
                        (SELECT 
                            SUM([Extent3].[Rarity]) AS [A1]
                            FROM [dbo].[GamePieceTemplates] AS [Extent3]
                            WHERE [Extent3].[Id] <= [Project1].[Id]) AS [C2]
                        FROM ( SELECT 
                            [Extent1].[Id] AS [Id], 
                            [Extent1].[Rarity] AS [Rarity], 
                            (SELECT 
                                SUM([Extent2].[Rarity]) AS [A1]
                                FROM [dbo].[GamePieceTemplates] AS [Extent2]
                                WHERE [Extent2].[Id] <= [Extent1].[Id]) AS [C1]
                            FROM [dbo].[GamePieceTemplates] AS [Extent1]
                            WHERE 1 = [Extent1].[CanBeFound]
                        )  AS [Project1]
                    )  AS [Project2]
                    WHERE ( CAST( ([Project2].[C1] - [Project2].[Rarity]) + 1 AS float) <= 130) AND ( CAST( [Project2].[C2] AS float) >= 130)
                )  AS [Project3]
            )  AS [Project4]
        )  AS [Project5]
    )  AS [Project6]
    ORDER BY [Project6].[Id] ASC

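3 Answers:

Answer 0 (score: 1)

The way I would do this is a simple calculation based on the total of the options. No loop is needed; the random value itself determines the result.

The pseudocode would be: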

int maxValueA = 100;
int maxValueB = 50;
int maxValueC = 10;
int total = maxValueA + maxValueB + maxValueC;

int x = random integer in [0, total);   // 0 inclusive, total exclusive

// Strict comparisons so that A covers exactly 100 values, B 50 and C 10
if (x < maxValueA) return A;
else if (x < maxValueA + maxValueB) return B;
else return C;

So if you have an ordered list of results, all you really need to do is pick the item in the result set that corresponds to the random number.
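With more than a handful of items, the if/else chain above turns into a walk over running totals. The following is only a minimal sketch of that idea, assuming positive integer weights; the `WeightedPicker` class and its method names are hypothetical and not part of the original answer. Note that it does use a loop, so it is still O(n) per pick.

```csharp
using System;
using System.Collections.Generic;

public static class WeightedPicker
{
    // Returns the index of the weight bucket that contains roll.
    // roll is expected to be in [0, sum of weights).
    public static int IndexForRoll(IReadOnlyList<int> weights, int roll)
    {
        int upperBound = 0;
        for (int i = 0; i < weights.Count; i++)
        {
            upperBound += weights[i];   // exclusive upper bound of bucket i
            if (roll < upperBound)
            {
                return i;
            }
        }
        throw new ArgumentOutOfRangeException(nameof(roll), "roll must be below the total weight");
    }

    // Draws a roll uniformly from 0..total-1 and maps it to a bucket index.
    public static int Pick(IReadOnlyList<int> weights, Random rng)
    {
        int total = 0;
        foreach (int w in weights)
        {
            total += w;
        }
        return IndexForRoll(weights, rng.Next(total));
    }
}
```

Keeping the roll-to-index mapping separate from the random draw makes the boundaries (99 versus 100, 149 versus 150) easy to check by hand.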

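A practical way to use this approach is to populate an array with IDs according to how likely each one is to occur (again, pseudocode):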

int[] IDsList = { A, A, A, A, B, B, C }; // IDs repeated in proportion to their chance of being chosen

x = random int in [0, IDsList.Length);   // upper bound exclusive, so x is always a valid index

return IDsList[x];
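As a rough illustration of the array idea, the table can be built once and reused, which makes each pick O(1) at the cost of one array slot per unit of weight. This is a sketch under assumptions: weights are positive integers, and the `ExpandedIdTable` class and its members are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;

public sealed class ExpandedIdTable
{
    private readonly long[] _ids;

    // weightsById maps an item id to a positive integer weight.
    public ExpandedIdTable(IEnumerable<KeyValuePair<long, int>> weightsById)
    {
        var ids = new List<long>();
        foreach (var pair in weightsById)
        {
            for (int i = 0; i < pair.Value; i++)
            {
                ids.Add(pair.Key);   // one slot per unit of weight
            }
        }
        _ids = ids.ToArray();
    }

    public int SlotCount => _ids.Length;

    public long IdAtSlot(int slot) => _ids[slot];

    // Random.Next(n) returns 0..n-1, so every slot is reachable and none is out of range.
    // The table must not be empty.
    public long Pick(Random rng) => _ids[rng.Next(_ids.Length)];
}
```

With rarities between 1 and 100 and a few thousand items, the table stays in the low hundreds of thousands of entries; with much larger weights, dividing them by a common factor first would keep it small.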

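Answer 1 (score: 1)

If you can add a new column to the data, you can do this in SQL. The new column would hold the running sum of the "likelihoods" so far. Ordered by that column, sample values would look like this: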

Id AccumP
A  100
B  150
C  160

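If you maintain that column, you can find a weighted random item as follows:

  1. Find the last item when ordered by AccumP.
  2. Pick a random number between 0 and that last item's AccumP.
  3. Find the item whose AccumP is greater than the random number but closest to it. That is your weighted random result.
  4. If you index AccumP, this should be fast!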
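The steps above can be tried out in memory before touching the schema. The sketch below uses a sorted array of running totals in place of the AccumP column and a binary search in place of the index seek; the `AccumLookup` class and its method names are hypothetical, and it assumes every weight is positive so that the totals are strictly increasing.

```csharp
using System;

public static class AccumLookup
{
    // accum holds strictly increasing running totals, e.g. { 100, 150, 160 }.
    // Returns the index of the first entry greater than roll,
    // for roll in [0, last running total).
    public static int IndexForRoll(int[] accum, int roll)
    {
        int pos = Array.BinarySearch(accum, roll);
        // Exact hit: roll equals accum[pos], so the first greater entry is pos + 1.
        // Miss: BinarySearch returns the bitwise complement of the first greater index.
        return pos >= 0 ? pos + 1 : ~pos;
    }

    // Steps 1 to 3: take the last running total, roll below it, look up the item.
    public static int Pick(int[] accum, Random rng)
    {
        int total = accum[accum.Length - 1];
        return IndexForRoll(accum, rng.Next(total));
    }
}
```

Each lookup is O(log n) once the running totals exist. The trade-off is that changing any weight means recomputing the totals for every row after it.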

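Answer 2 (score: 0)

Another approach: create a list in which each number is repeated as many times as its own value.

10 appears 10 times, 50 appears 50 times. Then get a random number between 1 and the number of items in the list; that gives you an index, which you use to fetch the list item at that position.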

void Main()
{
    var items = new int [] {100,50,10};
    var dict = new Dictionary<int,int>();
    var test = Enumerable.Range(1,10000);
    foreach (var t in test)
    {
        var result = SelectItem(items);
        if (!dict.ContainsKey(result))
        {
            dict.Add(result,0);
        }
        dict[result]++;
    }

    foreach (var d in dict.Keys)
    {
        Console.WriteLine("{0} - {1}",d,dict[d]);
    }


}

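// One shared generator; creating a new Random on every call could repeat seeds.
private static Random rand = new Random(DateTime.Now.Millisecond);
private int SelectItem(IEnumerable<int> numbers)
{
    // The upper bound of Random.Next is exclusive, so add 1 to make the last slot reachable.
    var num = rand.Next(1, numbers.Sum() + 1);
    // Repeat each number n times: 10 appears 10 times, 50 appears 50 times, and so on.
    var list = numbers.OrderBy(n => n)
        .SelectMany(n => Enumerable.Range(1, n).Select(rr => n)).ToList();
    //list.GroupBy(x=>x).Dump();
    //Console.WriteLine("Rand num = {0}, selected num = {1}",num,ret);
    return list[num - 1];
}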