Remove a contiguous subarray to leave the minimum average

Time: 2015-09-05 07:40:42

Tags: algorithm language-agnostic

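This problem appeared in one of the ICPC regional contests.

Given n numbers, you must remove the numbers between positions i and j so that the average of the remaining numbers is as small as possible. You cannot remove the first or the last number.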

2 <= n <= 10^5

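We discussed it and I still cannot understand it. Somehow the problem can be converted into finding the contiguous subarray with the maximum sum, and then solved with binary search in O(n log n).

I could not grasp that solution during the discussion, and even after a lot of thought I still cannot understand it.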

In case this is unclear, here is a link to the original problem: http://programmingteam.cc.gatech.edu/contest/Mercer14/problems/6.pdf
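The reduction mentioned above can be sketched as follows. Fix a candidate average x and subtract it from every number, b[k] = a[k] - x. The kept numbers average at most x exactly when the kept b values sum to at most 0, that is, when sum(b) minus the sum of the removed block is <= 0. So for a fixed x it is enough to check whether the maximum-sum contiguous block among the interior elements is at least sum(b), which takes O(n) with Kadane's algorithm; a binary search over x then narrows down the smallest feasible value. This is only a minimal sketch, not the contest's reference solution: the name min_remaining_average, the fixed iteration count, and the assumption that at least one interior element must be removed (so n >= 3) are mine.

```python
def min_remaining_average(a, iterations=100):
    """Approximate the smallest average left after deleting one non-empty
    contiguous block that touches neither the first nor the last element.

    Assumption (not from the problem statement): at least one interior
    element must be removed, so len(a) >= 3 is required.
    """
    n = len(a)
    if n < 3:
        raise ValueError("need at least one interior element to remove")

    def feasible(x):
        # With b[k] = a[k] - x, the remaining average is <= x
        # iff sum(b) - (sum of removed b values) <= 0.
        total = sum(v - x for v in a)
        # Kadane's algorithm restricted to the interior a[1 .. n-2].
        best = cur = a[1] - x
        for v in a[2:n - 1]:
            cur = max(cur + (v - x), v - x)
            best = max(best, cur)
        return total - best <= 0

    # Any remaining average lies between min(a) and max(a).
    lo, hi = float(min(a)), float(max(a))
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid   # an average of mid (or lower) is achievable
        else:
            lo = mid   # mid is too optimistic
    return hi
```

For example, min_remaining_average([5, 5, 1, 7, 8, 2]) comes out at approximately 3.25, which corresponds to removing 7 and 8. Each feasibility check is O(n), so the total cost is O(n) times the number of bisection steps.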

2 Answers:

Answer 0: (score: 0)

Here is an approach that I think could work:

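  • Compute the left partial (prefix) averages for all elements, updating the average incrementally. This can be done in O(N): a_L(i) = (a_L(i-1)*(i-1) + a(i)) / i, where a(i) is the i-th element.

  • Do the same for the right partial (suffix) averages: a_R(i) = (a_R(i+1)*(N-i) + a(i)) / (N-i+1)

  • Find the minimum over both lists.

  • If the minimum is in the left partial averages (a_L), look for the minimum to its right in a_R; if the minimum is found in a_R, do the reverse.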

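Every part takes O(N), so this results in an O(N) algorithm. It sounds a bit too simple, though, and I may be missing something.

Edit: The original answer stopped in the middle of both lists, which is not sufficient.

Actually, if the minima overlap, I believe there is no interval that can be cut. Here is a small Python implementation of the algorithm: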

grades = [5, 5, 1, 7, 8, 2]

N = len(grades)
glob_avg = float(sum(grades))/float(N)
print('total average: {0}'.format(glob_avg))

avg_L = grades[:]
avg_R = grades[:]

minL = 0
minR = N-1

for i in range(1,N):
  avg_L[i] = float(avg_L[i-1]*i + grades[i])/float(i+1)
  if avg_L[i] <= avg_L[minL]:
    minL = i
  avg_R[N-i-1] = float(avg_R[N-i]*i + grades[N-i-1])/float(i+1)
  if avg_R[N-i-1] <= avg_R[minR]:
    minR = N-i-1

opti_avg = glob_avg
if minL < minR:
  first = minL+1
  last = minR
  opti_avg = (avg_L[first-1]*first + avg_R[last]*(N-last)) / float(N + first - last)
  print('')
  print('Interval to cut: {0} - {1}'.format(first,last))
  for pre in grades[:first]:
    print('{0}'.format(pre))
  for cut in grades[first:last]:
    print('X {0} X'.format(cut))
  for post in grades[last:]:
    print('{0}'.format(post))

else:
  print('NO interval found that would reduce the avg!')

print('')
print('--------------------------------------')
print('minimal avg: {0:0.3f}'.format(opti_avg))
print('--------------------------------------')
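One way to probe the "I may be missing something" caveat is to compare the approach against exhaustive search on small inputs. The sketch below is not part of the original answer: prefix_suffix_heuristic is my condensed restatement of the script above, and exhaustive_min_average is a hypothetical brute-force reference that tries every allowed cut. On [4, 6, 1, 100, 0] the two disagree (2.75 versus 2.0, the latter from keeping only 4 and 0), which suggests that choosing the prefix and suffix minima independently is not always enough.

```python
def prefix_suffix_heuristic(grades):
    """Condensed restatement of the script above: keep the prefix and the
    suffix with the smallest running averages, cut what lies in between."""
    n = len(grades)
    avg_l = [float(g) for g in grades]
    avg_r = [float(g) for g in grades]
    min_l, min_r = 0, n - 1
    for i in range(1, n):
        avg_l[i] = (avg_l[i - 1] * i + grades[i]) / (i + 1)
        if avg_l[i] <= avg_l[min_l]:
            min_l = i
        avg_r[n - i - 1] = (avg_r[n - i] * i + grades[n - i - 1]) / (i + 1)
        if avg_r[n - i - 1] <= avg_r[min_r]:
            min_r = n - i - 1
    if min_l < min_r:
        first, last = min_l + 1, min_r
        kept = grades[:first] + grades[last:]
        return sum(kept) / len(kept)
    return sum(grades) / n  # no interval found, keep everything


def exhaustive_min_average(grades):
    """Hypothetical reference: try every cut grades[first:last] that keeps
    the first and last element; also allow cutting nothing."""
    n = len(grades)
    best = sum(grades) / n
    for first in range(1, n - 1):
        for last in range(first + 1, n):
            kept = grades[:first] + grades[last:]
            best = min(best, sum(kept) / len(kept))
    return best


sample = [4, 6, 1, 100, 0]
print(prefix_suffix_heuristic(sample))  # 2.75
print(exhaustive_min_average(sample))   # 2.0
```

On the answer's own data, [5, 5, 1, 7, 8, 2], both functions agree on 3.25.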

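Answer 1: (score: 0)

I would try examining each value above the global minimum, starting with the largest.

As long as the average of the selection stays above the global average, you can keep adding the neighbor to its left or right (whichever is larger).

Keep track of the minimum average of the remaining items.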

For each item >= global average
   While average(selected) > global average
      If average(unselected items) < best so far
         Best so far = selected range
      End
      Add to selection the larger of the left and right neighbors
   End while
End for

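The minimum over the unselected items can only be reached by finding a sequence that is above the average.

Any item that has already been considered as part of a selection can be discounted.

Implemented in Python: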

lst = [ -1, -1,1,-90,1,3,-1,-1,1,2,3,1,2,3,4,1, -1,-1];

First solution: really just an exhaustive test, which lets me verify correctness.

lbound = 0
ubound = len( lst)

print( ubound );

# from http://math.stackexchange.com/questions/106700/incremental-averageing

def Average( lst, lwr, upr, runAvg = 0, runCnt = 0 ):
    cnt = runCnt;
    avg = runAvg;
    for i in range( lwr, upr ):
        cnt = cnt + 1
        avg = float(avg) + (float(lst[i]) - avg)/cnt
    return (avg, cnt )

bestpos_l = 0
bestpos_u = 0
bestpos_avg = 0
best_cnt = 0
######################################################
# exhaustive search over O(N^2) (i, j) pairs - always works
# (each pair recomputes the average in O(N), so O(N^3) overall)
for i in range( 1, len( lst ) - 1 ):
    for j in range( i+1, len(lst ) ):
        tpl = Average( lst, 0, i ) # get lower end
        res = Average( lst, j, len(lst), tpl[0], tpl[1] )
        if (best_cnt == 0 or
              (best_cnt < res[1] and res[0] == bestpos_avg ) or
               res[0] < bestpos_avg ):
            bestpos_l = i
            bestpos_u = j
            bestpos_avg = res[0]
            best_cnt = res[1]
            print( "better", i,j, res[0], res[1] )

print( "solution 1", bestpos_l, bestpos_u, bestpos_avg, best_cnt )

This produced a valid answer, but I had not appreciated that, with the current data set, it does not really want the right-hand side.

########################################################
# O(N)
#
# Try and minimize left/right sides.
#
# This doesn't work - it knows -90 is really good, but can't decide if to
# ignore -90 from the left, or the right, so does neither.
# 
lower = []
upper = []
lower_avg = 0
best_lower = lst[0]
lower_i = 0
best_upper = lst[-1]
upper_avg = 0
upper_i = len(lst) -1
cnt = 0

length = len(lst)
for i in range( 0, length ):
    cnt = cnt + 1
    lower_avg = float( lower_avg) + ( float(lst[i]) - lower_avg)/cnt
    upper_avg = float( upper_avg) + ( float(lst[-(i+1)]) - upper_avg)/cnt
    upper.append( upper_avg )
    lower.append( lower_avg )
    if lower_avg <= best_lower:
        best_lower = lower_avg
        lower_i = i
    if upper_avg <= best_upper:
        best_upper = upper_avg
        upper_i = (len(lst) - (i+1))
if( lower_i + 1 > upper_i ):
    sol2 = Average( lst,0, len(lst ))
else:
    sol_tmp = Average( lst,0, lower_i+1 )
    sol2 = Average( lst, upper_i, len(lst),sol_tmp[0],sol_tmp[1] )

print( "solution 2", lower_i + 1, upper_i, sol2[0],sol2[1] )

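The third solution is what I was trying to explain. My implementation is limited because:

  1. I could not find a good way to pick the starting points. I would like to start from the largest elements, since removing them is most likely to lower the average, but I could not find a good way to locate them.
  2. I am unsure about the stability of maintaining a running average. I considered removing items from the average by undoing each number's effect, but I am not sure how that would affect precision.
  3. I am fairly sure that any interval that has already been checked cannot contain a starting item. That would limit further work, but I am not sure how best to implement it (keeping the O(xx) to a minimum).

Solution 3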

    #################################
    ## can we remove first / last? if so, this needs adjusting
    
    def ChooseNext( lst, lwr, upr ):
        # Returns -1 to extend the selection left, 1 to extend it right,
        # and 0 when neither side can grow.
        if lwr > 1 and upr < len(lst) -2:
            # both sides available.
            if lst[lwr-1] > lst[upr]:
                return -1
            else:
                return 1
        elif lwr > 1:
            return -1
        elif upr < len(lst) -2:
            return 1
        return 0
    
    
    # Maximize average of data removed.
    glbl_average = Average( lst, 0, len(lst) )
    found = False
    min_pos = 0
    max_pos = 0
    best_average = glbl_average[0]
    
    for i in range(1, len(lst ) - 1):
        # ignore stuff below average.
        if lst[i]> glbl_average[0] or (found == False ):
            lwr = i
            upr = i+1
            cnt = 1 # number for average
            avg = lst[i]
            tmp = Average( lst, 0, lwr)
            lcl = Average( lst, upr, len(lst ), tmp[0], tmp[1] )
            if found == False or lcl[0] < best_average:
               best_average = lcl[0]
               min_pos = lwr
               max_pos = upr
               found = True
    
            # extend from interval (lwr,upr]
            choice = ChooseNext( lst, lwr, upr )
            while( choice != 0 ):
                if( choice == -1):
                    new_lwr = lwr -1
                    new_upr = upr
                else:
                    new_lwr = lwr
                    new_upr = upr + 1
                tmp = Average( lst, 0, new_lwr )
                lcl_best = Average( lst, new_upr, len(lst), tmp[0], tmp[1] )
                if( lcl_best[0] > glbl_average[0]):
                    choice = 0
                else:
                    lwr = new_lwr
                    upr = new_upr
                    if lcl_best[0] < best_average:
                       min_pos = lwr
                       max_pos = upr
                       best_average = lcl_best[0]
                    choice = ChooseNext( lst, lwr, upr )
    

    print( "solution 3", min_pos, max_pos, best_average )
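On the precision concern about running averages raised in the list above: one option, which is not part of the original answer, is to avoid running averages altogether and keep integer prefix sums, so the sum and count of the kept items for any cut are available in O(1) and averages can be compared exactly. The sketch below assumes integer inputs; the name brute_force_min_average and the use of fractions.Fraction are my choices for illustration. It is still an exhaustive O(N^2) search, intended as a reference for checking faster attempts rather than as a solution for n up to 10^5.

```python
from fractions import Fraction
from itertools import accumulate


def brute_force_min_average(a):
    """Try every cut a[i:j] with 1 <= i < j <= len(a) - 1 and return
    (best_average, i, j), or None if no interior element exists.

    Assumes the items are integers so that all arithmetic stays exact.
    """
    n = len(a)
    # prefix[k] = a[0] + ... + a[k-1]
    prefix = [0] + list(accumulate(a))
    best = None
    for i in range(1, n - 1):        # first removed index
        for j in range(i + 1, n):    # one past the last removed index
            kept_sum = prefix[i] + (prefix[n] - prefix[j])
            kept_cnt = i + (n - j)
            avg = Fraction(kept_sum, kept_cnt)
            if best is None or avg < best[0]:
                best = (avg, i, j)
    return best


print(brute_force_min_average([5, 5, 1, 7, 8, 2]))
# (Fraction(13, 4), 3, 5): removing indices 3 and 4 leaves an average of 3.25
```

Because every average is an exact fraction, ties and near-ties between candidate cuts are resolved without any floating-point error.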