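I want to build an algorithm that distributes a total number of entities across categories according to some ideal percentages. Say, for example, the first category should contain 50.3% of all entities, the second 34.3%, and the third 15.4%.
Under ideal conditions everything works fine. I can easily compute the expected number of entities for each category and then use an algorithm similar to this one to keep the sum correct. A small example: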
All categories are empty
We are adding 100
To the first 50
To the second = 34
To the third = 16 (fixed by current algorithm, was 15)
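The rounding step in this example can be sketched in Python. This is only an illustration: it assumes largest-remainder rounding (floor every share, then give the leftover units to the largest fractional parts), which may not be the algorithm the question links to. The function name `distribute` and the choice to express weights in tenths of a percent are hypothetical.

```python
def distribute(total, weights):
    """Split `total` into integer parts proportional to `weights`.

    Largest-remainder rounding: floor every share, then hand the
    leftover units to the categories with the largest fractional parts.
    """
    weight_sum = sum(weights)
    # Integer arithmetic avoids float rounding in the remainders.
    floors = [total * w // weight_sum for w in weights]
    remainders = [total * w % weight_sum for w in weights]
    leftover = total - sum(floors)
    # Indices sorted by remainder, largest first (ties keep input order).
    order = sorted(range(len(weights)), key=lambda i: -remainders[i])
    for i in order[:leftover]:
        floors[i] += 1
    return floors


# Weights in tenths of a percent: 50.3%, 34.3%, 15.4%.
print(distribute(100, [503, 343, 154]))  # [50, 34, 16]
```

The third category gets the extra unit because its exact share (15.4) has the largest fractional part.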
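At some point, however, the categories may already contain entities that were not distributed according to the ideal percentages, and I don't know which algorithm to use in that case. The algorithm should distribute new entities according to the following rules:
Example: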
At start:
First is empty
Second contains 10
Third contains 40
We are adding 50
To the first 38
To the second 12
To the third 0 (too many entities here before adding)
Result:
First = 38; deviation = 12.3%
Second = 22; deviation = 12.3%
Third = 40; deviation = 24.6%
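The deviations listed above can be checked with a few lines of Python. This sketch assumes "deviation" means the absolute difference, in percentage points, between a category's actual share of the total and its ideal share; the function name `deviations` is hypothetical.

```python
def deviations(counts, ideal):
    """Absolute gap, in percentage points, between actual and ideal shares."""
    total = sum(counts)
    return [round(abs(100 * c / total - target), 1)
            for c, target in zip(counts, ideal)]


ideal = [50.3, 34.3, 15.4]   # target percentages from the question
counts = [38, 22, 40]        # category sizes after adding 50 entities
print(deviations(counts, ideal))  # [12.3, 12.3, 24.6]
```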
Don't ask me how I got 38 and 12; I just tried different combinations until it looked right. Any ideas for an algorithm?
Answer 0 (score: 0)
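The following approach may work. Suppose you maintain three lists, one per category, each with a running average (i.e. the current average of the list) and a total element count. These two extra fields are needed so that adding an element stays constant time.
Data structure: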
category_list {
    items : []
    avg : float      # running average of the stored values
    count : int      # number of stored values
}
category_list categories[3];   # one for each category
avg_distribution[3];           # holds the distribution allowed for each category
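As a rough Python rendering of this structure, the sketch below shows why keeping the running average and the count makes each insertion constant time: the mean is updated incrementally rather than recomputed from the list. The class name `CategoryList` and the method name `insert` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryList:
    items: list = field(default_factory=list)
    avg: float = 0.0   # running average of the stored values
    count: int = 0     # number of stored values

    def insert(self, value):
        # Update the mean incrementally instead of re-summing the list.
        self.avg = (self.avg * self.count + value) / (self.count + 1)
        self.count += 1
        self.items.append(value)


cat = CategoryList()
for v in (4, 8, 6):
    cat.insert(v)
print(cat.avg, cat.count)  # 6.0 3
```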
Algorithm:
addItems(int items[]) {
    for item in items:
        # getCategory returns an index, so look up the category it refers to
        category = categories[getCategory(item)]
        insertItem(category, item)
}
# Picks the category to insert into by scanning every category once.
# With a fixed number of categories (3 here) this runs in O(1) per item.
int getCategory(int item) {
    minAvg = INT_MAX
    minCat = 0
    for cat in range(3):
        category = categories[cat]
        newAvg = (category.avg*(category.count)+item.value) / (category.count+1)
        # Keep only the deviation part, not the entire average.
        newAvg = newAvg - avg_distribution[cat]
        if newAvg < minAvg:
            minAvg = newAvg
            minCat = cat
    return minCat
}
# Inserts an item into the given category in O(1).
insertItem(category, item) {
    category.items[category.count] = item.value
    category.avg = (category.avg*(category.count)+item.value) / (category.count+1)
    category.count++
}
# Needed to initially load all existing elements into a given category.
loadAllItems(int items[], int category) {
    target = categories[category]
    for item in items:
        insertItem(target, item)
}
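As a count-based complement to the pseudocode above, here is a minimal Python sketch that works directly on the category sizes from the question. It rests on an assumption: the goal is read as "never remove entities, and keep each category's shortfall against its target as even as possible". Each new entity goes to the category currently furthest below its target share of the final total. The function name `allocate` and the weights in tenths of a percent are hypothetical.

```python
def allocate(current, weights, to_add):
    """Greedily place `to_add` new entities, never removing existing ones.

    Each entity goes to the category that is furthest below its target
    share of the final total; over-full categories receive nothing.
    """
    weight_sum = sum(weights)
    final_total = sum(current) + to_add
    # Deficit = target count - current count, scaled by weight_sum so
    # everything stays in exact integer arithmetic.
    deficit = [final_total * w - c * weight_sum
               for w, c in zip(weights, current)]
    added = [0] * len(current)
    for _ in range(to_add):
        # Ties go to the lowest index, because max returns the first maximum.
        i = max(range(len(deficit)), key=deficit.__getitem__)
        added[i] += 1
        deficit[i] -= weight_sum
    return added


# The question's example: contents 0 / 10 / 40, adding 50.
print(allocate([0, 10, 40], [503, 343, 154], 50))  # [38, 12, 0]
```

On the question's example this reproduces the 38 / 12 / 0 split, and with empty categories and 100 new entities it gives 50 / 34 / 16. The loop costs O(to_add × categories), which is fine for small inputs but is not tuned for large batches.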
Hope that helps!