像Max-heap和Min-heap一样,我想实现一个中值堆来跟踪给定整数集的中位数。 API应具有以下三个功能:
insert(int) // should take O(logN)
int median() // will be the topmost element of the heap. O(1)
int delmedian() // should take O(logN)
我想使用数组(a)实现来实现堆,其中数组索引k的子节点存储在数组索引2 * k和2 * k + 1中。为方便起见,数组开始填充索引1中的元素。 这是我到目前为止: 中值堆将具有两个整数以跟踪到目前为止插入的整数的数量。当前中位数(gcm)和<当前中位数(lcm)。
if abs(gcm-lcm) >= 2 and gcm > lcm we need to swap a[1] with one of its children.
The child chosen should be greater than a[1]. If both are greater,
choose the smaller of two.
同样适用于其他情况。我无法想出如何沉入和游泳元素的算法。我认为应该考虑这个数字与中位数的接近程度,如下所示:
private void swim(int k) {
while (k > 1 && absless(k, k/2)) {
exch(k, k/2);
k = k/2;
}
}
我无法想出整个解决方案。
答案 0 :(得分:114)
你需要两个堆:一个最小堆和一个最大堆。每个堆包含大约一半的数据。最小堆中的每个元素都大于或等于中位数,并且max-heap中的每个元素都小于或等于中位数。
当最小堆包含的元素多于最大堆时,中位数位于最小堆的顶部。当max-heap包含比min-heap多一个元素时,中位数位于max-heap的顶部。
当两个堆都包含相同数量的元素时,元素总数是偶数。 在这种情况下,你必须根据你的中位数的定义选择:a)两个中间元素的平均值; b)两者中的较大者; c)较小的; d)随意选择两者中的任何一个......
每次插入时,将新元素与堆顶部的元素进行比较,以确定将其插入的位置。如果新元素大于当前中位数,则它将进入最小堆。如果它小于当前中位数,则转到最大堆。然后你可能需要重新平衡。如果堆的大小相差多个元素,则从堆中提取最多/最大值,并将其插入另一个堆中。
为了构造元素列表的中值堆,我们应该首先使用线性时间算法并找到中值。一旦知道了中位数,我们就可以根据中值简单地将元素添加到Min-heap和Max-heap。不需要平衡堆,因为中位数会将元素的输入列表分成相等的一半。
如果提取元素,则可能需要通过将一个元素从一个堆移动到另一个堆来补偿大小更改。这样,您可以确保两个堆在任何时候都具有相同的大小或仅由一个元素区分。
答案 1 :(得分:7)
这是一个MedianHeap的java实现,是在上面comocomocomocomo的解释的帮助下开发的。
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Scanner;
/**
*
* @author BatmanLost
*/
public class MedianHeap {
//stores all the numbers less than the current median in a maxheap, i.e median is the maximum, at the root
private PriorityQueue<Integer> maxheap;
//stores all the numbers greater than the current median in a minheap, i.e median is the minimum, at the root
private PriorityQueue<Integer> minheap;
//comparators for PriorityQueue
private static final maxHeapComparator myMaxHeapComparator = new maxHeapComparator();
private static final minHeapComparator myMinHeapComparator = new minHeapComparator();
/**
* Comparator for the minHeap, smallest number has the highest priority, natural ordering
*/
private static class minHeapComparator implements Comparator<Integer>{
@Override
public int compare(Integer i, Integer j) {
return i>j ? 1 : i==j ? 0 : -1 ;
}
}
/**
* Comparator for the maxHeap, largest number has the highest priority
*/
private static class maxHeapComparator implements Comparator<Integer>{
// opposite to minHeapComparator, invert the return values
@Override
public int compare(Integer i, Integer j) {
return i>j ? -1 : i==j ? 0 : 1 ;
}
}
/**
* Constructor for a MedianHeap, to dynamically generate median.
*/
public MedianHeap(){
// initialize maxheap and minheap with appropriate comparators
maxheap = new PriorityQueue<Integer>(11,myMaxHeapComparator);
minheap = new PriorityQueue<Integer>(11,myMinHeapComparator);
}
/**
* Returns empty if no median i.e, no input
* @return
*/
private boolean isEmpty(){
return maxheap.size() == 0 && minheap.size() == 0 ;
}
/**
* Inserts into MedianHeap to update the median accordingly
* @param n
*/
public void insert(int n){
// initialize if empty
if(isEmpty()){ minheap.add(n);}
else{
//add to the appropriate heap
// if n is less than or equal to current median, add to maxheap
if(Double.compare(n, median()) <= 0){maxheap.add(n);}
// if n is greater than current median, add to min heap
else{minheap.add(n);}
}
// fix the chaos, if any imbalance occurs in the heap sizes
//i.e, absolute difference of sizes is greater than one.
fixChaos();
}
/**
* Re-balances the heap sizes
*/
private void fixChaos(){
//if sizes of heaps differ by 2, then it's a chaos, since median must be the middle element
if( Math.abs( maxheap.size() - minheap.size()) > 1){
//check which one is the culprit and take action by kicking out the root from culprit into victim
if(maxheap.size() > minheap.size()){
minheap.add(maxheap.poll());
}
else{ maxheap.add(minheap.poll());}
}
}
/**
* returns the median of the numbers encountered so far
* @return
*/
public double median(){
//if total size(no. of elements entered) is even, then median iss the average of the 2 middle elements
//i.e, average of the root's of the heaps.
if( maxheap.size() == minheap.size()) {
return ((double)maxheap.peek() + (double)minheap.peek())/2 ;
}
//else median is middle element, i.e, root of the heap with one element more
else if (maxheap.size() > minheap.size()){ return (double)maxheap.peek();}
else{ return (double)minheap.peek();}
}
/**
* String representation of the numbers and median
* @return
*/
public String toString(){
StringBuilder sb = new StringBuilder();
sb.append("\n Median for the numbers : " );
for(int i: maxheap){sb.append(" "+i); }
for(int i: minheap){sb.append(" "+i); }
sb.append(" is " + median()+"\n");
return sb.toString();
}
/**
* Adds all the array elements and returns the median.
* @param array
* @return
*/
public double addArray(int[] array){
for(int i=0; i<array.length ;i++){
insert(array[i]);
}
return median();
}
/**
* Just a test
* @param N
*/
public void test(int N){
int[] array = InputGenerator.randomArray(N);
System.out.println("Input array: \n"+Arrays.toString(array));
addArray(array);
System.out.println("Computed Median is :" + median());
Arrays.sort(array);
System.out.println("Sorted array: \n"+Arrays.toString(array));
if(N%2==0){ System.out.println("Calculated Median is :" + (array[N/2] + array[(N/2)-1])/2.0);}
else{System.out.println("Calculated Median is :" + array[N/2] +"\n");}
}
/**
* Another testing utility
*/
public void printInternal(){
System.out.println("Less than median, max heap:" + maxheap);
System.out.println("Greater than median, min heap:" + minheap);
}
//Inner class to generate input for basic testing
private static class InputGenerator {
public static int[] orderedArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = i;
}
return array;
}
public static int[] randomArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = (int)(Math.random()*N*N);
}
return array;
}
public static int readInt(String s){
System.out.println(s);
Scanner sc = new Scanner(System.in);
return sc.nextInt();
}
}
public static void main(String[] args){
System.out.println("You got to stop the program MANUALLY!!");
while(true){
MedianHeap testObj = new MedianHeap();
testObj.test(InputGenerator.readInt("Enter size of the array:"));
System.out.println(testObj);
}
}
}
答案 2 :(得分:3)
这里我的代码基于comocomocomocomo提供的答案:
import java.util.PriorityQueue;
public class Median {
private PriorityQueue<Integer> minHeap =
new PriorityQueue<Integer>();
private PriorityQueue<Integer> maxHeap =
new PriorityQueue<Integer>((o1,o2)-> o2-o1);
public float median() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
if (minSize == 0 && maxSize == 0) {
return 0;
}
if (minSize > maxSize) {
return minHeap.peek();
}if (minSize < maxSize) {
return maxHeap.peek();
}
return (minHeap.peek()+maxHeap.peek())/2F;
}
public void insert(int element) {
float median = median();
if (element > median) {
minHeap.offer(element);
} else {
maxHeap.offer(element);
}
balanceHeap();
}
private void balanceHeap() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
int tmp = 0;
if (minSize > maxSize + 1) {
tmp = minHeap.poll();
maxHeap.offer(tmp);
}
if (maxSize > minSize + 1) {
tmp = maxHeap.poll();
minHeap.offer(tmp);
}
}
}
答案 3 :(得分:1)
这是一个Scala实现,遵循上面的comocomocomocomo的想法。
class MedianHeap(val capacity:Int) {
private val minHeap = new PriorityQueue[Int](capacity / 2)
private val maxHeap = new PriorityQueue[Int](capacity / 2, new Comparator[Int] {
override def compare(o1: Int, o2: Int): Int = Integer.compare(o2, o1)
})
def add(x: Int): Unit = {
if (x > median) {
minHeap.add(x)
} else {
maxHeap.add(x)
}
// Re-balance the heaps.
if (minHeap.size - maxHeap.size > 1) {
maxHeap.add(minHeap.poll())
}
if (maxHeap.size - minHeap.size > 1) {
minHeap.add(maxHeap.poll)
}
}
def median: Double = {
if (minHeap.isEmpty && maxHeap.isEmpty)
return Int.MinValue
if (minHeap.size == maxHeap.size) {
return (minHeap.peek+ maxHeap.peek) / 2.0
}
if (minHeap.size > maxHeap.size) {
return minHeap.peek()
}
maxHeap.peek
}
}
答案 4 :(得分:0)
另一种不使用最大堆和最小堆的方法是立即使用中值堆。
在最大堆中,父级大于子级。 我们可以使用一种新的堆,其中父级位于子级的“中间”-左子级小于父级,右子级大于父级。所有偶数项都是左子代,所有奇数项都是右子代。
可以在最大堆中执行相同的游泳和下沉操作,也可以在此中值堆中执行-稍加修改。在最大堆的典型游泳操作中,插入的项会游动直到小于父项,在此,在中间堆中,它将游动直到小于父项(如果是奇数项) )或大于父项(如果它是偶数项)。
这是我对中位数堆的实现。为了简单起见,我使用了一个整数数组。
package priorityQueues;
导入edu.princeton.cs.algs4.StdOut;
公共类MedianInsertDelete {
private Integer[] a;
private int N;
public MedianInsertDelete(int capacity){
// accounts for '0' not being used
this.a = new Integer[capacity+1];
this.N = 0;
}
public void insert(int k){
a[++N] = k;
swim(N);
}
public int delMedian(){
int median = findMedian();
exch(1, N--);
sink(1);
a[N+1] = null;
return median;
}
public int findMedian(){
return a[1];
}
//条目游动,因此其左孩子变小而右孩子变大 私人虚空游泳(int k){
while(even(k) && k>1 && less(k/2,k)){
exch(k, k/2);
if ((N > k) && less (k+1, k/2)) exch(k+1, k/2);
k = k/2;
}
while(!even(k) && (k>1 && !less(k/2,k))){
exch(k, k/2);
if (!less (k-1, k/2)) exch(k-1, k/2);
k = k/2;
}
}
//如果左边的孩子较大或右边的孩子较小,则条目下沉 私人虚空沉(int k){
while(2*k <= N){
int j = 2*k;
if (j < N && less (j, k)) j++;
if (less(k,j)) break;
exch(k, j);
k = j;
}
}
private boolean even(int i){
if ((i%2) == 0) return true;
else return false;
}
private void exch(int i, int j){
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}
private boolean less(int i, int j){
if (a[i] <= a[j]) return true;
else return false;
}
public static void main(String[] args) {
MedianInsertDelete medianInsertDelete = new MedianInsertDelete(10);
for(int i = 1; i <=10; i++){
medianInsertDelete.insert(i);
}
StdOut.println("The median is: " + medianInsertDelete.findMedian());
medianInsertDelete.delMedian();
StdOut.println("Original median deleted. The new median is " + medianInsertDelete.findMedian());
}
}