Finding the top k sums of two sorted arrays

Time: 2011-02-15 06:24:53

Tags: algorithm

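You are given two sorted arrays, of sizes n and m respectively. Your job (should you choose to accept it) is to output the largest k sums of the form a[i]+b[j].

An O(k log k) solution can be found here. There are rumors of an O(k) or O(n) solution. Does one exist?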

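4 answers:

Answer 0 (score: 11)

I found the responses at your link mostly vague and poorly structured. Here is an O(k * log(k)) algorithm. (Earlier revisions of this answer claimed O(k * log(min(m, n))) and then O(k * log(m + n)); both claims were retracted.)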

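Assume the arrays are sorted in decreasing order, with m elements in a and n elements in b. Imagine you computed the m * n matrix of sums as follows:

for i from 0 to m - 1
    for j from 0 to n - 1
        sums[i][j] = a[i] + b[j]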

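In this matrix, values decrease monotonically both downward and to the right. With that in mind, here is an algorithm that performs a graph search through this matrix in order of decreasing sums.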

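q : priority queue (decreasing) := empty priority queue
add (0, 0) to q with priority a[0] + b[0]
while k > 0 and q is not empty:
    k--
    x := pop q
    output x
    (i, j) : tuple of int,int := position of x
    // a has m elements and b has n elements, so check the index that is about to be used
    if i + 1 < m and (i + 1, j) was not added before:
        add (i + 1, j) to q with priority a[i + 1] + b[j]
    if j + 1 < n and (i, j + 1) was not added before:
        add (i, j + 1) to q with priority a[i] + b[j + 1]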

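Analysis:

  1. The loop executes k times.
    1. Each iteration performs one pop operation.
    2. Each iteration performs at most two insert operations.
  2. The maximum size of the priority queue is O(k), because every iteration removes one entry and adds at most two.
  3. The priority queue can be implemented with a binary heap, giving log(size) pop and insert.
  4. Therefore this algorithm is O(k * log(k)).
  5. Note that the general priority queue abstract data type needs to be modified to ignore duplicate entries. Alternatively, you could maintain a separate set structure, check for membership in the set before adding to the queue, and remove from the set after popping from the queue. (When sums can tie, a cell removed on pop can be added again by a neighbour that is popped later, so leaving entries in the set is the safer choice.) Neither of these ideas worsens the time or space complexity.

    If there is any interest, I can write this up in Java.

    Edit: fixed the complexity. There is an algorithm with the complexity I originally described, but it is slightly different from this one. You would have to be careful to avoid adding certain nodes. My simple solution adds many nodes to the queue prematurely.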
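The pseudocode above could be written in Java roughly as follows. This is only a sketch under a few assumptions: the class and method names (`TopKSums`, `topKSums`) are made up for illustration, both arrays are sorted in descending order, and a `HashSet` of encoded cell positions plays the role of the "separate set structure" from note 5 (entries are never removed from it, which keeps tied sums from being queued twice).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

class TopKSums {
    // Returns up to k largest sums a[i] + b[j]; a and b must be sorted in descending order.
    static List<Integer> topKSums(int[] a, int[] b, int k) {
        List<Integer> result = new ArrayList<>();
        if (a.length == 0 || b.length == 0) {
            return result;
        }
        // Heap entries are {sum, i, j}; the comparator puts the largest sum first.
        PriorityQueue<int[]> queue =
                new PriorityQueue<>((x, y) -> Integer.compare(y[0], x[0]));
        // Each cell (i, j) is encoded as i * b.length + j so it is queued at most once.
        Set<Long> seen = new HashSet<>();
        queue.add(new int[] {a[0] + b[0], 0, 0});
        seen.add(0L);
        while (result.size() < k && !queue.isEmpty()) {
            int[] top = queue.poll();
            int i = top[1];
            int j = top[2];
            result.add(top[0]);
            // Set.add returns false when the cell was already queued.
            if (i + 1 < a.length && seen.add((long) (i + 1) * b.length + j)) {
                queue.add(new int[] {a[i + 1] + b[j], i + 1, j});
            }
            if (j + 1 < b.length && seen.add((long) i * b.length + (j + 1))) {
                queue.add(new int[] {a[i] + b[j + 1], i, j + 1});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [11, 10, 7, 7]
        System.out.println(topKSums(new int[] {5, 4, 1}, new int[] {6, 2}, 4));
    }
}
```

Each iteration pops one entry and pushes at most two, so the heap never holds more than k + 1 entries, which matches the O(k * log(k)) bound in the analysis.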

Answer 1 (score: 1)

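import java.util.PriorityQueue;

public class MaxSum {

    // One candidate cell (aIdx, bIdx) of the sum matrix together with its sum.
    private static class FrontierElem implements Comparable<FrontierElem> {
        int value;
        int aIdx;
        int bIdx;

        public FrontierElem(int value, int aIdx, int bIdx) {
            this.value = value;
            this.aIdx = aIdx;
            this.bIdx = bIdx;
        }

        @Override
        public int compareTo(FrontierElem o) {
            // Reversed order so that PriorityQueue behaves as a max-heap.
            // Integer.compare avoids the overflow that o.value - value can hit.
            return Integer.compare(o.value, value);
        }

    }

    // Expects a and b to be non-empty and sorted in descending order.
    public static void findMaxSum( int [] a, int [] b, int k ) {
        // frontierA[i] records the b index last queued for a index i (null if none);
        // frontierB[j] records the a index last queued for b index j (null if none).
        Integer [] frontierA = new Integer[ a.length ];
        Integer [] frontierB = new Integer[ b.length ];
        PriorityQueue<FrontierElem> q = new PriorityQueue<>();
        frontierA[0] = frontierB[0]=0;
        q.add( new FrontierElem( a[0]+b[0], 0, 0));
        // Stop early if k is larger than the number of sums that can be produced.
        while( k > 0 && !q.isEmpty() ) {
            FrontierElem f = q.poll();
            // Prints the sum followed by the current queue size.
            System.out.println( f.value+"    "+q.size() );
            k--;
            frontierA[ f.aIdx ] = frontierB[ f.bIdx ] = null;
            int fRight = f.aIdx+1;
            int fDown = f.bIdx+1;
            if( fRight < a.length && frontierA[ fRight ] == null ) {
                q.add( new FrontierElem( a[fRight]+b[f.bIdx], fRight, f.bIdx));
                frontierA[ fRight ] = f.bIdx;
                frontierB[ f.bIdx ] = fRight;
            }
            if( fDown < b.length && frontierB[ fDown ] == null ) {
                q.add( new FrontierElem( a[f.aIdx]+b[fDown], f.aIdx, fDown));
                frontierA[ f.aIdx ] = fDown;
                frontierB[ fDown ] = f.aIdx;
            }
        }
    }
}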

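The idea is similar to the other solution, with one additional observation: as you add elements from the matrix to the result set, the next element can only come from a position where the current set is concave. I call these frontier elements, track their positions in two arrays, and keep their values in a priority queue. This helps keep the queue small, although I have yet to work out by how much. It seems to be about sqrt(k), but I am not entirely sure of that.

(Of course the frontierA/B arrays could be plain boolean arrays, but this way they fully define my result set. That is not used in this example, but it may be useful elsewhere.)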

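Answer 2 (score: 0)

Since the precondition is that the arrays are sorted, consider the following for N = 5:

A[] = {1,2,3,4,5}

B[] = {496,497,498,499,500}

Now, since we know that the sum of the elements at index N-1 of A and B is the highest, just insert it into a heap together with the indexes of the A and B elements (why the indexes? We will see shortly).

H.insert(A[N-1] + B[N-1], N-1, N-1);

Now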

 count = 0;
 while(!H.empty() && count < K) { // while the heap is not empty and fewer than K sums were output

     (sum, i, j) = H.pop(); // this gives you the next sum you are looking for
     count++;

     // The indexes we got at the time of the pop are used to select the next candidates.
     // Both arrays are ascending and we start from the last indexes, so from (i, j)
     // the next smaller sums are A[i]+B[j-1] and A[i-1]+B[j].
     // Insert each one that is in range and has not been inserted in the heap yet.
     if(j > 0 && !Hash[i,j-1]){ // not inserted
        Hash[i,j-1] = true;
        H.insert(A[i]+B[j-1], i, j-1);
     }
     if(i > 0 && !Hash[i-1,j]){ // not inserted
        Hash[i-1,j] = true;
        H.insert(A[i-1]+B[j], i-1, j);
     }
 }

 Popping K times this way gives you the K largest sums required.
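For arrays sorted in ascending order, as in the A and B example of this answer, the walk starts at the last indexes and moves toward index 0. A minimal Java sketch of that variant follows; the names `AscendingTopSums` and `largestSums` are hypothetical, and a `HashSet` of index pairs stands in for the `Hash` lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

class AscendingTopSums {
    // Returns up to k largest sums a[i] + b[j] for arrays sorted in ascending order.
    static List<Integer> largestSums(int[] a, int[] b, int k) {
        List<Integer> out = new ArrayList<>();
        if (a.length == 0 || b.length == 0) {
            return out;
        }
        // Heap entries are {sum, i, j}, largest sum first.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(y[0], x[0]));
        // Index pairs that were already inserted into the heap.
        Set<List<Integer>> inserted = new HashSet<>();
        int lastA = a.length - 1;
        int lastB = b.length - 1;
        heap.add(new int[] {a[lastA] + b[lastB], lastA, lastB});
        inserted.add(Arrays.asList(lastA, lastB));
        while (out.size() < k && !heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            int i = top[1];
            int j = top[2];
            // The two neighbours with the next smaller sums: (i, j-1) and (i-1, j).
            if (j > 0 && inserted.add(Arrays.asList(i, j - 1))) {
                heap.add(new int[] {a[i] + b[j - 1], i, j - 1});
            }
            if (i > 0 && inserted.add(Arrays.asList(i - 1, j))) {
                heap.add(new int[] {a[i - 1] + b[j], i - 1, j});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        int[] b = {496, 497, 498, 499, 500};
        // Prints [505, 504, 504, 503]
        System.out.println(largestSums(a, b, 4));
    }
}
```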

Hope this helps.

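Answer 3 (score: 0)

Many thanks to @rlibby and @xuhdev for the original idea for solving this kind of problem. I once had a similar coding exercise in an interview: given K arrays sorted in descending order, find the N largest sums formed from K elements, meaning that exactly one element must be picked from each sorted array to build each sum.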

Example: List findHighestSums(int[][] lists, int n) {}

[5,4,3,2,1]
[4,1]
[5,0,0]
[6,4,2]
[1]

and a value of 5 for n, your procedure should return a List of size 5:

[21,20,19,19,18]
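The expected output above can be cross-checked with a brute-force enumeration. The sketch below is not the approach used in this answer; it simply builds every combination (one element per list), sorts the sums in descending order, and keeps the first n. The names `BruteForceHighestSums` and `highestSums` are made up for illustration, and the approach is only practical for small inputs because the number of combinations is the product of the list lengths.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BruteForceHighestSums {
    // Enumerates every sum that takes exactly one element from each list,
    // then returns the n largest in descending order.
    static List<Integer> highestSums(int[][] lists, int n) {
        List<Integer> sums = new ArrayList<>();
        sums.add(0);
        for (int[] list : lists) {
            List<Integer> next = new ArrayList<>();
            for (int partial : sums) {
                for (int value : list) {
                    next.add(partial + value);
                }
            }
            sums = next;
        }
        sums.sort(Collections.reverseOrder());
        return new ArrayList<>(sums.subList(0, Math.min(n, sums.size())));
    }

    public static void main(String[] args) {
        int[][] lists = {
            {5, 4, 3, 2, 1},
            {4, 1},
            {5, 0, 0},
            {6, 4, 2},
            {1}
        };
        // Prints [21, 20, 19, 19, 18]
        System.out.println(highestSums(lists, 5));
    }
}
```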

Below is my code; please take a careful look at the block comments :D

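// Both members below live inside an enclosing class and need these imports from java.util:
// ArrayList, Arrays, HashSet, List, PriorityQueue, Set
private class Pair implements Comparable<Pair>{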
    String state;

    int sum;

    public Pair(String state, int sum) {
        this.state = state;
        this.sum = sum;
    }

    @Override
    public int compareTo(Pair o) {
        // Max heap; Integer.compare avoids the overflow that o.sum - this.sum can hit
        return Integer.compare(o.sum, this.sum);
    }
}

List<Integer> findHighestSums(int[][] lists, int n) {

    int numOfLists = lists.length;
    int totalCharacterInState = 0;

    /*
     * The state of a combination is represented as a String of list indexes
     * The number of characters for each list is Math.ceil(Math.log10(lists[i].length)), and at least 1
     * For example:
     *      If list1 contains from 11 to 100 elements
     *      Then the state for list1 requires 2 characters
     */
    int[] positionStartingCharacterOfListState = new int[numOfLists + 1];
    positionStartingCharacterOfListState[0] = 0;

    // the reason to set less or equal here is to get the position starting character of the last list
    for(int i = 1; i <= numOfLists; i++) {  
        int previousListNumOfCharacters = 1;
        if(lists[i-1].length > 10) {
            previousListNumOfCharacters = (int)Math.ceil(Math.log10(lists[i-1].length));
        }
        positionStartingCharacterOfListState[i] = positionStartingCharacterOfListState[i-1] + previousListNumOfCharacters;
        totalCharacterInState += previousListNumOfCharacters;
    }

    // Check the state <---> make sure that combination of a sum is new
    Set<String> states = new HashSet<>();
    List<Integer> result = new ArrayList<>();
    StringBuilder sb = new StringBuilder();

    // This is a max heap contain <State, largestSum>
    PriorityQueue<Pair> pq = new PriorityQueue<>();

    char[] stateChars = new char[totalCharacterInState];
    Arrays.fill(stateChars, '0');
    sb.append(stateChars);
    String firstState = sb.toString();
    states.add(firstState);

    int firstLargestSum = 0;
    for(int i = 0; i < numOfLists; i++) firstLargestSum += lists[i][0];

    // Imagine this is the initial state in a graph
    pq.add(new Pair(firstState, firstLargestSum));

    while(n > 0) {
        // In case n is larger than the number of combinations of all list entries 
        if(pq.isEmpty()) break;
        Pair top = pq.poll();
        String currentState = top.state;
        int currentSum = top.sum;

        /*
         * Loop over all lists and generate new states that differ from the current state in exactly 1 position
         * For example: the initial state (Stage 0) 0 0 0 0 0
         * So the next states (Stage 1) should be:
         *  1 0 0 0 0
         *  0 1 0 0 0 (choose the element at index 1 from the 2nd array)
         *  0 0 1 0 0 (choose the element at index 1 from the 3rd array)
         *  0 0 0 1 0
         *  0 0 0 0 1
         * But don't forget to check whether the index in any list has exceeded that list's length
         */
        for(int i = 0; i < numOfLists; i++) {
            int indexInList = Integer.parseInt(
                    currentState.substring(positionStartingCharacterOfListState[i], positionStartingCharacterOfListState[i+1]));
            if( indexInList < lists[i].length - 1) {
                int numberOfCharacters = positionStartingCharacterOfListState[i+1] - positionStartingCharacterOfListState[i];
                sb = new StringBuilder(currentState.substring(0, positionStartingCharacterOfListState[i]));
                sb.append(String.format("%0" + numberOfCharacters + "d", indexInList + 1));
                sb.append(currentState.substring(positionStartingCharacterOfListState[i+1]));
                String newState = sb.toString();
                if(!states.contains(newState)) {

                    // The newSum is always <= currentSum
                    int newSum = currentSum - lists[i][indexInList] + lists[i][indexInList+1];

                    states.add(newState);
                    // Using priority queue, we can immediately retrieve the largest Sum at Stage k and track all other unused states.
                    // From that Stage k largest Sum's state, then we can generate new states
                    // Those sums composed by recently generated states don't guarantee to be larger than those sums composed by old unused states.
                    pq.add(new Pair(newState, newSum));
                }

            }
        }
        result.add(currentSum);
        n--;
    }
    return result;
}

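Let me explain how I arrived at the complexity of this solution:

  1. The while loop in my answer executes N times; consider the max heap (priority queue).
  2. The poll operation runs once per iteration with complexity O(log(sumOfListLength)), taking the maximum number of Pair elements in the heap to be sumOfListLength.
  3. The insert operation runs at most K times per iteration, and each insertion costs log(sumOfListLength). Under that assumption about the heap size, the heap operations cost O(N * K * log(sumOfListLength)) in total; my original estimate of O(N * log(sumOfListLength)) left out the factor K contributed by the insertions.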