Question

I need to find the most efficient way to pick a random element from the most popular category.

For example, given these items (category number, then name):

4 Cheese
1 Olive
2 Mushroom
4 Ham
2 Chicken
4 Salad

I would want Cheese, Ham, or Salad. If several categories are tied for the top, I don't care which of them my item comes from.

As input I have an Iterator<Foo>, where Foo implements

interface Foo {
    int getCategory();
}

My current code looks like this:

Foo getItem(Iterator<Foo> it) {
    Map<Integer, List<Foo>> categoryMap = new HashMap<Integer, List<Foo>>();
    while(it.hasNext()) {
        Foo foo = it.next();
        int category = foo.getCategory();

        List<Foo> l = categoryMap.get(category);
        if(l == null) {
            l = new ArrayList<Foo>();
            categoryMap.put(category, l);
        }

        l.add(foo);
    }

    int longest_list_size = 0;
    int longest_category_id = -1;

    Set<Integer> categories = categoryMap.keySet();

    for(Integer c:  categories ) {
        int list_size = categoryMap.get(c).size();
        if(list_size  > longest_list_size) {
           longest_list_size = list_size;
           longest_category_id = c;
        }
    }

    if(longest_list_size == 0)
        return null;

    int r = new Random().nextInt(longest_list_size);
    return categoryMap.get(longest_category_id).get(r); // c is out of scope here; use the winning category id
}

Answer 1

Using two maps would probably be faster:

Foo getItem(Iterator<Foo> it) {
    Map<Integer, Foo> categoryToFoo = new HashMap<Integer, Foo>();
    Map<Integer, Integer> counts = new HashMap<Integer, Integer>();
    int maxCount = 0;
    while(it.hasNext()) {
        Foo foo = it.next();
        int category = foo.getCategory();
        int categoryCount = 1;
        if ( ! categoryToFoo.containsKey( category ) ) {
            categoryToFoo.put( category, foo );
        }
        else {
            categoryCount = counts.get( category ) + 1;
        }
        counts.put( category, categoryCount );
        if ( categoryCount > maxCount ) {
            maxCount = categoryCount;
        }
    }

    List<Foo> possible = new ArrayList<Foo>();
    for ( Map.Entry<Integer, Integer> entry : counts.entrySet() ) {
        if ( entry.getValue() == maxCount ) {
            possible.add( categoryToFoo.get( entry.getKey() ) );
        }
    }

    if ( possible.isEmpty() ) {
        return null; // empty iterator: nextInt(0) would throw
    }
    return possible.get( new Random().nextInt( possible.size() ) );
}

There are plenty of places where you could optimize this further, but you get the idea.

Answer 2

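Here is what I would do:

Create a List<Foo> from it
Sort the list by category
Walk through the list from the beginning and record the start and end index of the longest run of items with the same category
Pick a random element between the start and end index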
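
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name SortAndScan, the Random parameter, and returning null for an empty iterator are choices made here, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

interface Foo {
    int getCategory();
}

// Hypothetical helper class illustrating the sort-then-scan idea.
class SortAndScan {
    // Returns a random element of the largest category, or null for an empty iterator.
    static Foo getItem(Iterator<Foo> it, Random random) {
        // Step 1: copy the iterator into a list.
        List<Foo> list = new ArrayList<>();
        while (it.hasNext()) {
            list.add(it.next());
        }
        if (list.isEmpty()) {
            return null;
        }

        // Step 2: sort by category so that equal categories become adjacent.
        list.sort(Comparator.comparingInt(Foo::getCategory));

        // Step 3: scan once and remember the longest run of equal categories.
        int bestStart = 0;
        int bestLength = 0;
        int runStart = 0;
        for (int i = 1; i <= list.size(); i++) {
            boolean runEnds = i == list.size()
                    || list.get(i).getCategory() != list.get(runStart).getCategory();
            if (runEnds) {
                int runLength = i - runStart;
                if (runLength > bestLength) {
                    bestLength = runLength;
                    bestStart = runStart;
                }
                runStart = i;
            }
        }

        // Step 4: pick uniformly inside the longest run.
        return list.get(bestStart + random.nextInt(bestLength));
    }
}
```

Note that the sort makes this O(N*log(N)), so any gain over the map-based version would come from simpler code and memory layout rather than from asymptotic complexity.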

I think this would be a bit faster with less code, but your solution is also fine.

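If you really care about performance because it may hold millions of elements, then you should not be working with this Iterator in the first place. In that case you should store each category's popularity in one Map and the lists of items of the same category in another Map, but I know nothing about the rest of your code.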
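
One possible reading of that suggestion is a small index that is updated whenever an item is added, so that picking an item later does not require iterating over everything. The class CategoryIndex and its method names below are hypothetical, and the sketch only supports adding items; removals would need extra bookkeeping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

interface Foo {
    int getCategory();
}

// Hypothetical helper: keeps both maps up to date as items arrive.
class CategoryIndex {
    private final Map<Integer, Integer> popularity = new HashMap<>();
    private final Map<Integer, List<Foo>> itemsByCategory = new HashMap<>();
    private final Random random = new Random();
    private Integer topCategory = null; // null until the first item is added

    void add(Foo foo) {
        int category = foo.getCategory();
        // Increment the category's count (starts at 1 if absent).
        int count = popularity.merge(category, 1, Integer::sum);
        itemsByCategory.computeIfAbsent(category, k -> new ArrayList<>()).add(foo);
        // Track the current leader so lookups stay O(1).
        if (topCategory == null || count > popularity.get(topCategory)) {
            topCategory = category;
        }
    }

    // Returns a random item from the most popular category, or null if empty.
    Foo randomFromTopCategory() {
        if (topCategory == null) {
            return null;
        }
        List<Foo> items = itemsByCategory.get(topCategory);
        return items.get(random.nextInt(items.size()));
    }
}
```

The popularity map is redundant with the list sizes here; it is kept only to mirror the two-map layout described above.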

Answer 3

It is really hard (if not impossible) to improve on your approach, at least in terms of complexity. Let's analyze it. What you are doing:

Insertion into the map -> O(N)
Computing the maximum -> O(N)

Total: O(N)

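Other approaches:

Priority queue -> O(N*log(N)) to insert all elements + O(1) to retrieve the head
Sort by key -> O(N*log(N)) + O(1) to retrieve the first element
If you know the range of the vote counts, e.g. [0..K], and K is smaller than N or not much larger, you could do a counting sort in O(K) + O(1) to get the maximum.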
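
As a loose illustration of the counting idea, here is a sketch that replaces the hash map with buckets indexed directly by category id. It rests on an assumption that is not stated in the answer: every category id lies in [0, k). The class name CountingBuckets and the parameters k and random are made up for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

interface Foo {
    int getCategory();
}

// Hypothetical helper: bucket counting, assuming category ids are in [0, k).
class CountingBuckets {
    static Foo getItem(Iterator<Foo> it, int k, Random random) {
        // O(k): one empty bucket per possible category id.
        List<List<Foo>> buckets = new ArrayList<>(k);
        for (int i = 0; i < k; i++) {
            buckets.add(new ArrayList<>());
        }

        // O(N): drop each item into its bucket and track the fullest bucket.
        int best = -1;
        while (it.hasNext()) {
            Foo foo = it.next();
            int category = foo.getCategory();
            List<Foo> bucket = buckets.get(category);
            bucket.add(foo);
            if (best == -1 || bucket.size() > buckets.get(best).size()) {
                best = category;
            }
        }
        if (best == -1) {
            return null; // empty input
        }

        // O(1): pick uniformly from the fullest bucket.
        List<Foo> top = buckets.get(best);
        return top.get(random.nextInt(top.size()));
    }
}
```

The total cost is O(K + N), which only pays off when k is small relative to the number of items; an id outside [0, k) would throw an IndexOutOfBoundsException.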

If you only need to retrieve the maximum once, your approach is good enough, IMO.

Java: random element from the most popular category
