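Sum of medians (faster solution)

Time: 2016-06-05 09:58:26

Tags: python algorithm optimization binary-search-tree

We are given a number N, the length of the list that follows (1 <= N <= 10^5).

Then comes a list of N numbers (1 <= num <= 10^9).

The task is to find the median at each iteration from 1 to N (at the i-th iteration we find the median of the subarray lst[:i]; for an even length this is the lower of the two middle values), and then to find the sum of all N medians.

Examples

Input: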

10

5 10 8 1 7 3 9 6 2 4

Output:

59 (5 + 5 + 8 + 5 + 7 + 5 + 7 + 6 + 6 + 5)

Input 2:

5

5 3 1 2 4

Output 2:

16 (5 + 3 + 3 + 2 + 3)
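
For checking small cases, the definition can be pinned down with a brute-force reference. This is only a sketch, not a solution that meets the limits: each `bisect.insort` call shifts list elements, so the worst case is quadratic. The function name `sum_of_medians_naive` is made up for illustration, and it assumes the lower median (index `(len - 1) // 2` of the sorted prefix) for even lengths, which is what both examples imply.

```python
import bisect

def sum_of_medians_naive(nums):
    """Sum the lower medians of every prefix nums[:i], i = 1..len(nums)."""
    prefix = []  # sorted copy of the elements seen so far
    total = 0
    for x in nums:
        bisect.insort(prefix, x)                 # keep the prefix sorted
        total += prefix[(len(prefix) - 1) // 2]  # lower median of the prefix
    return total

print(sum_of_medians_naive([5, 10, 8, 1, 7, 3, 9, 6, 2, 4]))  # 59
print(sum_of_medians_naive([5, 3, 1, 2, 4]))                  # 16
```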

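The question "Approach for better solution - Sum of medians" suggests using binary search trees, and that is what I did.

But it is not fast enough to handle these constraints within the 2-second time limit. Is there a faster solution?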

class BinarySearchTree:
    def __init__(self, value):
        self.left = None
        self.right = None
        self.value = value

    def insert(self, value):
        if self.value:
            if value < self.value:
                if self.left is None:
                    self.left = BinarySearchTree(value)
                else:
                    self.left.insert(value)
            else:  # equal values go right, so duplicates are not silently dropped
                if self.right is None:
                    self.right = BinarySearchTree(value)
                else:
                    self.right.insert(value)
        else:
            self.value = value

    def output_subtree(self):
        if self.left:
            self.left.output_subtree()
        sub_tree.append(self.value)
        if self.right:
            self.right.output_subtree()


N = int(input())
vertices = list(map(int, input().split()))
medians = 0

tree = BinarySearchTree(vertices[0])
medians += vertices[0]

for i in range(1, N):
    sub_tree = []
    tree.insert(vertices[i])
    tree.output_subtree()
    if (i+1) % 2 == 0:
        medians += sub_tree[len(sub_tree)//2-1]
    else:
        medians += sub_tree[len(sub_tree)//2]

print(medians)

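2 answers:

Answer 0 (score: 1)

You can use a two-heap approach.

Create two arrays of length about N/2 each.

The first holds a min binary heap, the second a max binary heap. The min-heap stores the large values, the max-heap the small ones.

At each iteration, add the next element of the given list to one of the heaps, keeping their sizes equal (or differing by one when the count is odd).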

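The first element goes into the max-heap. For every later element:

If the current element is greater than the current median:
   add the current element to the min-heap
   if the min-heap is now larger than the max-heap, remove the top of the min-heap and insert it into the max-heap

Otherwise (the current element is less than or equal to the current median):
   insert the current element into the max-heap
   if the max-heap is now more than one element larger than the min-heap, move the top of the max-heap to the min-heap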

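After each step, the top element of the max-heap is the median.

This algorithm is O(N log N), but heaps work faster than search trees because of their smaller hidden constants, and they need no memory reallocation.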

     min heap         max heap
5    -               (5)
10   10              (5)
8    10              (8) 5
1    8 10            (5) 1
7    8 10            (7) 5 1
3    7 8 10          (5) 3 1 
9    8 9 10          (7) 5 3 1 
6    7 8 9 10        (6) 5 3 1
...
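
A minimal sketch of the two-heap idea using Python's standard `heapq` module. `heapq` only provides a min-heap, so the max-heap here stores negated values; the names `sum_of_medians`, `small` and `large` are illustrative and not part of the answer. The sketch pushes the new element first and rebalances afterwards, so the top of `small` is always the lower median.

```python
import heapq

def sum_of_medians(nums):
    """Sum of the lower medians of all prefixes, using two heaps."""
    small = []  # max-heap of the smaller half, stored as negated values
    large = []  # min-heap of the larger half
    total = 0
    for x in nums:
        if small and x > -small[0]:
            # x belongs to the larger half
            heapq.heappush(large, x)
            if len(large) > len(small):
                heapq.heappush(small, -heapq.heappop(large))
        else:
            # x belongs to the smaller half (or is the first element)
            heapq.heappush(small, -x)
            if len(small) > len(large) + 1:
                heapq.heappush(large, -heapq.heappop(small))
        total += -small[0]  # top of the max-heap is the current median
    return total

print(sum_of_medians([5, 10, 8, 1, 7, 3, 9, 6, 2, 4]))  # 59
print(sum_of_medians([5, 3, 1, 2, 4]))                  # 16
```

Each element costs at most two pushes and one pop, each O(log N), so the whole pass is O(N log N).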

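Answer 1 (score: 0)

Thanks to @MBo, I implemented a solution to this problem using a MinHeap and a MaxHeap.

The MinHeap keeps its minimum at the top, and no child is smaller than its parent. Conversely, the MaxHeap contains all the small elements, with the largest of them at the root.

This structure lets us easily update the median at each iteration.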

class MaxHeap:
    def __init__(self):
        self.heapList = [0]
        self.currentSize = 0

    def percUp(self,i):
        # sift the element at index i up while it is larger than its parent
        while i // 2 > 0:
            if self.heapList[i] > self.heapList[i // 2]:
                self.heapList[i // 2], self.heapList[i] = self.heapList[i], self.heapList[i // 2]
            i = i // 2

    def insert(self,k):
      self.heapList.append(k)
      self.currentSize = self.currentSize + 1
      self.percUp(self.currentSize)

    def percDown(self,i):
      while (i * 2) <= self.currentSize:
          mc = self.maxChild(i)
          if self.heapList[i] < self.heapList[mc]:
              tmp = self.heapList[i]
              self.heapList[i] = self.heapList[mc]
              self.heapList[mc] = tmp
          i = mc

    def maxChild(self,i):
      if i * 2 + 1 > self.currentSize:
          return i * 2
      else:
          if self.heapList[i*2] > self.heapList[i*2+1]:
              return i * 2
          else:
              return i * 2 + 1

    def delMax(self):
      retval = self.heapList[1]
      self.heapList[1] = self.heapList[self.currentSize]
      self.currentSize = self.currentSize - 1
      self.heapList.pop()
      self.percDown(1)
      return retval

    def buildHeap(self,alist):
      i = len(alist) // 2
      self.currentSize = len(alist)
      self.heapList = [0] + alist[:]
      while (i > 0):
          self.percDown(i)
          i = i - 1


class MinHeap:
    def __init__(self):
        self.heapList = [0]
        self.currentSize = 0

    def percUp(self,i):
        # sift the element at index i up while it is smaller than its parent
        while i // 2 > 0:
            if self.heapList[i] < self.heapList[i // 2]:
                self.heapList[i // 2], self.heapList[i] = self.heapList[i], self.heapList[i // 2]
            i = i // 2

    def insert(self,k):
      self.heapList.append(k)
      self.currentSize = self.currentSize + 1
      self.percUp(self.currentSize)

    def percDown(self,i):
      while (i * 2) <= self.currentSize:
          mc = self.minChild(i)
          if self.heapList[i] > self.heapList[mc]:
              tmp = self.heapList[i]
              self.heapList[i] = self.heapList[mc]
              self.heapList[mc] = tmp
          i = mc

    def minChild(self,i):
      if i * 2 + 1 > self.currentSize:
          return i * 2
      else:
          if self.heapList[i*2] < self.heapList[i*2+1]:
              return i * 2
          else:
              return i * 2 + 1

    def delMin(self):
      retval = self.heapList[1]
      self.heapList[1] = self.heapList[self.currentSize]
      self.currentSize = self.currentSize - 1
      self.heapList.pop()
      self.percDown(1)
      return retval

    def buildHeap(self,alist):
      i = len(alist) // 2
      self.currentSize = len(alist)
      self.heapList = [0] + alist[:]
      while (i > 0):
          self.percDown(i)
          i = i - 1

N = int(input())
lst = list(map(int, input().split()))
medians = 0

# minimal value's at the top; any child is bigger than its parent
min_heap = MinHeap()
# conversely
max_heap = MaxHeap()

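# initial first values for each heap
if N == 1:
    # a single element is its own median; lst[1] does not exist
    max_heap.insert(lst[0])
    medians += lst[0]
elif lst[0] > lst[1]:
    min_heap.insert(lst[0])
    max_heap.insert(lst[1])
    medians += lst[0]+lst[1]
else:
    min_heap.insert(lst[1])
    max_heap.insert(lst[0])
    medians += 2*lst[0]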

# then apply the same procedure to the rest of the list
for i in range(2, N):
    if lst[i] < max_heap.heapList[1]:
        max_heap.insert(lst[i])
    else:
        min_heap.insert(lst[i])
    # if the sizes differ by more than one, rebalance by moving
    # the root of the bigger heap into the other one
    if min_heap.currentSize-max_heap.currentSize > 1:
        max_heap.insert(min_heap.delMin())
    elif max_heap.currentSize-min_heap.currentSize > 1:
        min_heap.insert(max_heap.delMax())
    # if the length is even we take len/2-th element; odd ==> (len+1)/2
    if max_heap.currentSize >= min_heap.currentSize:
        medians += max_heap.heapList[1]
    else:
        medians += min_heap.heapList[1]

print(medians)