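I'm trying to solve a fairly simple problem: I have a list of items, and the probability of finding an item depends on the item itself (I imagine a shovel in a haystack is easier to find than a needle).
I want a method that returns one of the items at random, taking into account how likely each item is to be found.
So the items could be listed like this: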
A - 100
B - 50
C - 10
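The number indicates how easy the item is to find; a higher value means it is easier to find.
Running the method below 10,000 times found each item this many times:
A - 6249 (100 / 160 = 0.625)
B - 3139 (50 / 160 = 0.3125)
C - 612 (10 / 160 = 0.0625)
That pretty much shows the code below works.
So my question now is: given that the list itself may contain thousands of items, how can this be improved? Right now, in the worst case, the method visits every item in the list, i.e. O(n).
Can it be written as a LINQ / lambda statement so that SQL Server handles it, instead of pulling all the items up into C#?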
public long GetRandomItem()
{
    var allItems = _db.AllItems
        .Where(x => x.CanBeFound == true)
        .OrderByDescending(x => x.Rarity)
        .Select(x => new
        {
            x.Id,     // id of item
            x.Rarity, // rarity between 1 and 100
        }).ToList();
    int totalRarity = allItems.Sum(x => x.Rarity);
    // Caution: a new Random seeded from the millisecond clock on every call
    // can repeat values when calls land in the same millisecond.
    var random = new Random(DateTime.Now.Millisecond);
    var randomNumber = random.NextDouble() * totalRarity;
    double totalSoFar = 0;
    long chosenId = -1;
    // Walk the running total until it passes the random number.
    foreach (var i in allItems)
    {
        totalSoFar += i.Rarity;
        if (totalSoFar > randomNumber)
        {
            chosenId = i.Id;
            break;
        }
    }
    return chosenId;
}
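----- Edit -----
I rewrote the LINQ into a version that makes only two queries to the database and needs no loop. I'm not entirely sure this is better, since it forces SQL to perform many more joins and data selects.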
public long GetRandomGamePiece()
{
    // Note: this total covers all rows, including those with CanBeFound == false,
    // and so do the running sums below; a drawn number can therefore fall in a
    // range that belongs to a filtered-out row, in which case Single() throws.
    int totalRarity = _db.GamePieceTemplates.Sum(x => x.Rarity);
    // Random whole number in [1, totalRarity], held as a double.
    var randomNumber = 1 + Math.Round(_Random.NextDouble() * (totalRarity - 1));
    var randomItem = _db.GamePieceTemplates
        .Where(x => x.CanBeFound == true)
        .OrderBy(x => x.Id)
        .Select(x => new
        {
            x.Id,     // id of item
            x.Rarity, // rarity between 1 and 100
            // +1 so that it doesn't overlap the previous level
            MinRarity = _db.GamePieceTemplates.Where(y => y.Id <= x.Id).Sum(y => y.Rarity) - x.Rarity + 1,
            MaxRarity = _db.GamePieceTemplates.Where(y => y.Id <= x.Id).Sum(y => y.Rarity)
        })
        .Single(x => x.MinRarity <= randomNumber && x.MaxRarity >= randomNumber);
    return randomItem.Id;
}
This translates to the following T-SQL:
SELECT TOP (2)
    [Project6].[Rarity] AS [Rarity],
    [Project6].[Id] AS [Id],
    [Project6].[C1] AS [C1],
    [Project6].[C2] AS [C2]
FROM ( SELECT
        [Project5].[Id] AS [Id],
        [Project5].[Rarity] AS [Rarity],
        ([Project5].[C1] - [Project5].[Rarity]) + 1 AS [C1],
        [Project5].[C2] AS [C2]
    FROM ( SELECT
            [Project4].[Id] AS [Id],
            [Project4].[Rarity] AS [Rarity],
            [Project4].[C1] AS [C1],
            (SELECT
                SUM([Extent5].[Rarity]) AS [A1]
            FROM [dbo].[GamePieceTemplates] AS [Extent5]
            WHERE [Extent5].[Id] <= [Project4].[Id]) AS [C2]
        FROM ( SELECT
                [Project3].[Id] AS [Id],
                [Project3].[Rarity] AS [Rarity],
                (SELECT
                    SUM([Extent4].[Rarity]) AS [A1]
                FROM [dbo].[GamePieceTemplates] AS [Extent4]
                WHERE [Extent4].[Id] <= [Project3].[Id]) AS [C1]
            FROM ( SELECT
                    [Project2].[Id] AS [Id],
                    [Project2].[Rarity] AS [Rarity]
                FROM ( SELECT
                        [Project1].[Id] AS [Id],
                        [Project1].[Rarity] AS [Rarity],
                        [Project1].[C1] AS [C1],
                        (SELECT
                            SUM([Extent3].[Rarity]) AS [A1]
                        FROM [dbo].[GamePieceTemplates] AS [Extent3]
                        WHERE [Extent3].[Id] <= [Project1].[Id]) AS [C2]
                    FROM ( SELECT
                            [Extent1].[Id] AS [Id],
                            [Extent1].[Rarity] AS [Rarity],
                            (SELECT
                                SUM([Extent2].[Rarity]) AS [A1]
                            FROM [dbo].[GamePieceTemplates] AS [Extent2]
                            WHERE [Extent2].[Id] <= [Extent1].[Id]) AS [C1]
                        FROM [dbo].[GamePieceTemplates] AS [Extent1]
                        WHERE 1 = [Extent1].[CanBeFound]
                    ) AS [Project1]
                ) AS [Project2]
                WHERE ( CAST( ([Project2].[C1] - [Project2].[Rarity]) + 1 AS float) <= 130) AND ( CAST( [Project2].[C2] AS float) >= 130)
            ) AS [Project3]
        ) AS [Project4]
    ) AS [Project5]
) AS [Project6]
ORDER BY [Project6].[Id] ASC
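Answer 0 (score: 1)
The way I would do this is a simple calculation based on the total of all the options. No loop is needed: the random value itself determines the outcome.
The pseudocode would be:
int maxValueA = 100;
int maxValueB = 50;
int maxValueC = 10;
int total = maxValueA + maxValueB + maxValueC;
int x = random integer with 0 <= x < total;
if (x < maxValueA) return A;
else if (x < maxValueA + maxValueB) return B;
else return C;
So if you have an ordered list of results, all you really need to do is pick the item in the result set that corresponds to the random number.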
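For a list with thousands of items, the same if/else chain can be generalized by storing the running totals once and binary-searching them for each draw. The following is only a sketch of that idea, not code from this answer: the class name `WeightedPicker` and its methods are hypothetical, and it assumes every weight is a positive integer (as with the 1 to 100 rarity in the question).

```csharp
using System;

public static class WeightedPicker
{
    // Running totals of the weights, e.g. {100, 50, 10} -> {100, 150, 160}.
    // Assumption: every weight is > 0, so the totals are strictly increasing.
    public static int[] BuildCumulative(int[] weights)
    {
        var cumulative = new int[weights.Length];
        int sum = 0;
        for (int i = 0; i < weights.Length; i++)
        {
            sum += weights[i];
            cumulative[i] = sum;
        }
        return cumulative;
    }

    // Index of the first running total that is strictly greater than x,
    // for 0 <= x < (last running total).
    public static int IndexFor(int[] cumulative, int x)
    {
        int pos = Array.BinarySearch(cumulative, x);
        // Found: x equals cumulative[pos], so x belongs to the next item.
        // Not found: ~pos is the index of the first larger element.
        return pos >= 0 ? pos + 1 : ~pos;
    }

    // Draws x uniformly from [0, total) and maps it to an item index.
    public static int Pick(int[] cumulative, Random random)
    {
        int total = cumulative[cumulative.Length - 1];
        return IndexFor(cumulative, random.Next(total));
    }
}
```

With the weights 100, 50 and 10, the values 0 to 99 map to index 0 (A), 100 to 149 to index 1 (B), and 150 to 159 to index 2 (C). Building the totals is O(n) once; each pick after that is O(log n), provided the totals are cached between calls.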
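A practical way to use this method is to populate an array with IDs according to how likely each one is to occur (again, pseudocode):
int[] IDsList = { A, A, A, A, B, B, C }; // IDs repeated in proportion to their chance of being chosen
x = random integer with 0 <= x < IDsList.Length;
return IDsList[x];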
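Answer 1 (score: 1)
If you can add a new column to the data, you can do this in SQL. The new column would hold the running total of the "likelihood" so far. Sorted by that column, the sample values look like this: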
Id AccumP
A 100
B 150
C 160
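If you maintain that column, you can find a weighted random item like this:
1. Select the last item when sorted by AccumP.
2. Pick a random number between 0 and that item's AccumP.
3. Select the item whose AccumP value is greater than the random number but closest to it. That is your weighted random result.
If you index AccumP, this should be very fast!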
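The three steps above could be sketched in LINQ roughly as follows. This is an illustration under assumptions, not the answerer's code: `ItemRow` and `AccumPLookup` are hypothetical names, and an in-memory list wrapped with `AsQueryable()` stands in for the real table. Against a database provider, the `Where` / `OrderBy` / `First` chain would be expected to become a single indexed lookup, but that depends on the provider and is worth checking in the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape: Id plus the running total column from the answer.
public class ItemRow
{
    public string Id { get; set; } = "";
    public int AccumP { get; set; }
}

public static class AccumPLookup
{
    // Step 3: the row with the smallest AccumP strictly greater than r.
    public static ItemRow Find(IQueryable<ItemRow> rows, int r)
    {
        return rows.Where(x => x.AccumP > r)
                   .OrderBy(x => x.AccumP)
                   .First();
    }

    public static ItemRow PickRandom(IQueryable<ItemRow> rows, Random random)
    {
        int max = rows.Max(x => x.AccumP); // step 1: largest running total
        int r = random.Next(max);          // step 2: 0 <= r < max
        return Find(rows, r);              // step 3
    }
}

public static class AccumPDemo
{
    // Sample data matching the table above.
    public static IQueryable<ItemRow> SampleRows()
    {
        return new List<ItemRow>
        {
            new ItemRow { Id = "A", AccumP = 100 },
            new ItemRow { Id = "B", AccumP = 150 },
            new ItemRow { Id = "C", AccumP = 160 },
        }.AsQueryable();
    }
}
```

For example, `AccumPLookup.Find(AccumPDemo.SampleRows(), 120)` returns row B, because 150 is the smallest AccumP above 120. The trade-off is that the AccumP column has to be recomputed whenever an item's likelihood changes or an item is added or removed.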
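Answer 2 (score: 0)
Another approach: create a list in which each number is repeated as many times as its own value.
10 appears 10 times, 50 appears 50 times, and so on. Then get a random number between 1 and the number of items in the list; that gives you an index, which you use to fetch the list item at that position.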
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    // One shared generator for all draws.
    private static readonly Random rand = new Random();

    static void Main()
    {
        var items = new int[] { 100, 50, 10 };
        var dict = new Dictionary<int, int>();
        // Draw 10000 samples and count how often each item comes up.
        foreach (var t in Enumerable.Range(1, 10000))
        {
            var result = SelectItem(items);
            if (!dict.ContainsKey(result))
            {
                dict.Add(result, 0);
            }
            dict[result]++;
        }
        foreach (var d in dict.Keys)
        {
            Console.WriteLine("{0} - {1}", d, dict[d]);
        }
    }

    private static int SelectItem(IEnumerable<int> numbers)
    {
        // The upper bound of Random.Next is exclusive, so add 1 to make
        // the last slot in the list reachable.
        var num = rand.Next(1, numbers.Sum() + 1);
        // Expand each number n into n copies of itself, e.g. 10 -> ten 10s.
        var list = numbers.OrderBy(n => n)
            .SelectMany(n => Enumerable.Repeat(n, n)).ToList();
        return list[num - 1];
    }
}