Question

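The precise description is:

There are arrays of length n, each consisting of key-value pairs. Keys and values are all positive integers, and the keys within each array are unique.

There are x such arrays, and they do not all contain the same set of keys.

I need to merge these x arrays by adding together the values that share a key, and then find the top n pairs by value. If the (n+1)th pair has the same value as the nth pair, just ignore it.

If it helps, you can assume each array is sorted by key or by value.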

E.g., n=3 and x=3:

3 arrays of length 3: [(1,2),(2,3),(3,4)],[(2,3),(3,2),(4,1)],[(2,2),(5,2),(6,2)]

Merged: [(1,2),(2,8),(3,6),(4,1),(5,2),(6,2)]

Top 3: [(2,8),(3,6),(5,2)]
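
To make the example concrete, here is a minimal brute-force sketch in Python that reproduces the merged list. It is only a baseline for checking results, not the efficient algorithm being asked about, and the name `merge_by_key` is made up for illustration.

```python
from collections import Counter


def merge_by_key(arrays):
    """Sum the values of equal keys across all arrays (hypothetical helper)."""
    totals = Counter()
    for arr in arrays:
        for key, value in arr:
            totals[key] += value
    # Return (key, total) pairs sorted by key for easy comparison.
    return sorted(totals.items())


arrays = [
    [(1, 2), (2, 3), (3, 4)],
    [(2, 3), (3, 2), (4, 1)],
    [(2, 2), (5, 2), (6, 2)],
]
merged = merge_by_key(arrays)
# Sort by value, largest first, and keep the first three pairs.
top3 = sorted(merged, key=lambda kv: kv[1], reverse=True)[:3]
```

Keys 1, 5 and 6 all tie for third place with a total of 2, so which of them appears in `top3` depends on the tie-breaking order. The problem statement allows any one of them.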

So what is the best algorithm for this?

Answer 1

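Do a k-way merge by key, and store the output in a priority queue, typically another min-heap.

Assume the arrays are sorted by key in ascending order. Create a min-heap and put the first item of each array on it. Then repeatedly remove the item with the lowest key, accumulating the total for items that share a key. Each time an item is removed from the merge heap, add the next item from the array that item came from. The output goes on a min-heap of size K, where K is the number of items wanted in the output.

The output heap compares values rather than keys.

Quick pseudocode example: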

heapNode = struct
{
    array  // reference to the input array
    index  // current position in that array. Starts at 0 and is incremented after each read.
}

mergeHeap = new MinHeap   // ordered by the key at array[index]
outputHeap = new MinHeap  // ordered by value

for each array
    mergeHeap.Add(new heapNode(array, 0))

previousKey = -1
previousValue = 0

while mergeHeap is not empty
{
    node = mergeHeap.RemoveSmallest
    key = node.array[node.index].key
    value = node.array[node.index].value
    // update the array index and put it back on the heap
    node.index = node.index + 1
    if (node.index < node.array.length)
        mergeHeap.Add(node)

    // now accumulate totals
    if (key == previousKey)
    {
        previousValue += value
    }
    else
    {
        // new key. Output previous one
        if (previousKey != -1)
        {
            if (outputHeap.Count < numberToOutput)
                outputHeap.Add(new outputNode(previousKey, previousValue))
            else if (previousValue > outputHeap.Peek().value)
            {
                // the value is more than the lowest value already on the output heap
                outputHeap.RemoveSmallest
                outputHeap.Add(new outputNode(previousKey, previousValue))
            }
        }
        previousKey = key
        previousValue = value
    }
}

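// the last key is still pending here, so apply the same check once more
if (previousKey != -1)
{
    if (outputHeap.Count < numberToOutput)
        outputHeap.Add(new outputNode(previousKey, previousValue))
    else if (previousValue > outputHeap.Peek().value)
    {
        outputHeap.RemoveSmallest
        outputHeap.Add(new outputNode(previousKey, previousValue))
    }
}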

// You now have your highest value items on the output heap.
// Removing them yields the items in ascending order of value:
while outputHeap is not empty
    node = outputHeap.RemoveSmallest
    output(node)

That's the basic idea. All it needs is heaps: one for the merge and one for the output.
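
The pseudocode above can be sketched as runnable Python using the standard `heapq` module. This is one possible rendering, assuming every input array is sorted by key in ascending order. The names `top_n_merged` and `emit` are illustrative and not part of the original answer.

```python
import heapq


def top_n_merged(arrays, n):
    """Return up to n (key, total) pairs with the largest totals.

    Assumes each array is a list of (key, value) pairs sorted by key.
    """
    if n <= 0:
        return []

    # Merge heap entries are (key, array_index, position), ordered by key.
    merge_heap = [(arr[0][0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(merge_heap)

    # Output heap entries are (total, key): a min-heap holding at most n items.
    output_heap = []

    def emit(key, total):
        if len(output_heap) < n:
            heapq.heappush(output_heap, (total, key))
        elif total > output_heap[0][0]:
            # Strictly greater, so a tie with the current minimum is ignored.
            heapq.heapreplace(output_heap, (total, key))

    prev_key, prev_total = None, 0
    while merge_heap:
        key, i, pos = heapq.heappop(merge_heap)
        value = arrays[i][pos][1]
        # Advance within the source array and push its next item, if any.
        if pos + 1 < len(arrays[i]):
            heapq.heappush(merge_heap, (arrays[i][pos + 1][0], i, pos + 1))
        if key == prev_key:
            prev_total += value
        else:
            if prev_key is not None:
                emit(prev_key, prev_total)
            prev_key, prev_total = key, value

    # The last accumulated key has not been emitted yet.
    if prev_key is not None:
        emit(prev_key, prev_total)

    # Report the pairs with the largest total first.
    return [(k, v) for v, k in sorted(output_heap, reverse=True)]


example = [
    [(1, 2), (2, 3), (3, 4)],
    [(2, 3), (3, 2), (4, 1)],
    [(2, 2), (5, 2), (6, 2)],
]
result = top_n_merged(example, 3)
```

On the question's example this returns `(2, 8)` and `(3, 6)` followed by one of the pairs tied at a total of 2. Because ties with the current minimum are skipped, this version keeps the first tied key it encounters, key 1, where the question's sample output lists key 5. Either is valid under the stated rule. With x arrays of length n, the merge heap never holds more than x entries and the output heap never more than n, so the work is roughly O(xn log x) for the merge plus O(xn log n) for the output heap.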
