Stop computing the running mean as soon as a condition fails?

Time: 2013-08-21 04:25:19

Tags: python

Consider the following code:

sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789, 1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]
avg = []
final = []

def runningMean(seq, n=0, total=0):  # called recursively
    if not seq:
        return []
    total = total + int(seq[-1])
    return runningMean(seq[:-1], n=n+1, total=total) + [total/float(n+1)]

def main():
    # compute the running mean, starting from the last element of the list, i.e. 1710375
    avg = runningMean(sub, n=0, total=0)
    print avg
    for i in range(len(sub)):
        if int(sub[i]) > float(avg[i] * 0.9):  # check the condition
            final.append(sub[i])
    print final


if __name__ == '__main__':
    main()

The output consists of the running means, followed by the sublist of elements that pass the condition:

  [1282960.6216216215, 1297286.75, 1312372.4571428571, 1328319.6764705882, 1345230.0909090908, 1363181.3125, 1382289.2580645161, 1402634.7, 1409742.7931034483, 1417241.142857143, 1425232.111111111, 1433651.3846153845, 1442738.76, 1452397.5, 1462798.0869565217, 1474143.2727272727, 1486568.142857143, 1492803.2, 1499691.7368421052, 1507344.111111111, 1515724.0, 1525005.25, 1535471.9333333333, 1547401.642857143, 1561126.2307692308, 1577136.75, 1595934.1818181819, 1618484.2, 1646032.3333333333, 1680349.875, 1710198.857142857, 1710330.6666666667, 1710344.0, 1710353.0, 1710363.3333333333, 1710370.0, 1710375.0]

  [1361867, 1361921, 1361949, 1364886, 1367224, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]

What I need is for the averaging to stop as soon as this condition fails:

(sub[i] > float(avg[i] * 0.9))

That is, the result should be:

  [1680349.875, 1710198.857142857, 1710330.6666666667, 1710344.0, 1710353.0, 1710363.3333333333, 1710370.0, 1710375.0]
  [1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]

It would be helpful if someone could suggest a solution in Python.

3 answers:

Answer 0 (score: 1)

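I suggest reimplementing the average calculation as a generator. A generator computes only as much as is needed to produce the next value when it is iterated. If you stop iterating early, the rest of the computation is never done.

It is also much easier to design code that iterates forwards rather than backwards. If you need to go backwards, use the reversed function to get a reverse iterator, or call the reverse method on the list.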
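As a small illustration of the two options just mentioned (written for Python 3, with a made-up list named `data`), `reversed` returns a lazy iterator and leaves the list untouched, whereas `list.reverse` reorders the list in place and returns `None`:

```python
# Hypothetical sample data, used only for this illustration.
data = [1, 2, 3, 4]

# reversed() returns an iterator; the original list is not modified.
backwards = list(reversed(data))
print(backwards)  # [4, 3, 2, 1]
print(data)       # [1, 2, 3, 4]

# list.reverse() reverses the list in place and returns None.
returned = data.reverse()
print(returned)   # None
print(data)       # [4, 3, 2, 1]
```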

Here is a generator that computes a cumulative average (in the forward direction, not backwards):

def runningMean(iterable):
    """A generator, yielding a cumulative average of its input."""
    num = 0
    denom = 0
    for x in iterable:
        num += x
        denom += 1
        # float() forces true division, so Python 2 does not truncate integer input
        yield num / float(denom)
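To make the laziness concrete, here is a minimal Python 3 sketch. The `counting` helper, the `consumed` list, and the early-exit threshold are hypothetical, introduced only for this demonstration; they record how many input items the generator actually pulled before iteration stopped.

```python
def running_mean(iterable):
    """Yield the cumulative average of the items seen so far."""
    total = 0
    count = 0
    for x in iterable:
        total += x
        count += 1
        yield total / count  # true division in Python 3


consumed = []  # hypothetical log of the items pulled from the input


def counting(iterable):
    """Pass items through unchanged, recording each one as it is requested."""
    for x in iterable:
        consumed.append(x)
        yield x


means = []
for mean in running_mean(counting([10, 20, 30, 40, 50])):
    means.append(mean)
    if mean >= 15:  # arbitrary early-exit condition for the demo
        break

print(means)     # [10.0, 15.0]
print(consumed)  # [10, 20]
```

Only the first two inputs are ever read; the remaining three are never touched because the loop stopped asking for values.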

To get the reverse cumulative average you want, use it on a reversed iterator over the original data:

>>> sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,      1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]
>>> list(runningMean(reversed(sub)))
[1710375.0, 1710370.0, 1710363.3333333333, 1710353.0, 1710344.0, 1710330.6666666667, 1710198.857142857, 1680349.875, 1646032.3333333333, 1618484.2, 1595934.1818181819, 1577136.75, 1561126.2307692308, 1547401.642857143, 1535471.9333333333, 1525005.25, 1515724.0, 1507344.111111111, 1499691.7368421052, 1492803.2, 1486568.142857143, 1474143.2727272727, 1462798.0869565217, 1452397.5, 1442738.76, 1433651.3846153845, 1425232.111111111, 1417241.142857143, 1409742.7931034483, 1402634.7, 1382289.2580645161, 1363181.3125, 1345230.0909090908, 1328319.6764705882, 1312372.4571428571, 1297286.75, 1282960.6216216215]

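You can undo the reversal with the list.reverse() method if you want to see the averages in the same order as the original input, but if you want to stop the calculation early, I think you need to keep working backwards a little longer.

To stop when you reach a value that is more than 10% below the cumulative average, you can use itertools.takewhile: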

import itertools

results = list(itertools.takewhile(lambda x: x[0] > 0.9 * x[1],
                                   itertools.izip(reversed(sub),
                                                  runningMean(reversed(sub)))))

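In Python 3, use the regular built-in zip instead of itertools.izip.

This gives you a list of (value, average) pairs that satisfy the condition, starting from the end of the list and stopping just before the first value that fails the test. Here is how you can display them: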

results.reverse()  # put them back in regular order
for value, average in results:
    print value, average

Output:

1709408 1710198.857142857
1710264 1710330.6666666667
1710308 1710344.0
1710322 1710353.0
1710350 1710363.3333333333
1710365 1710370.0
1710375 1710375.0
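For readers on Python 3, the same pipeline might look like the sketch below. It uses only the standard library; the shortened `sub` list (the last nine values of the original) and the snake_case name `running_mean` are choices made for this illustration, not part of the original answer.

```python
import itertools


def running_mean(iterable):
    """Yield the cumulative average of the items seen so far."""
    total = 0
    count = 0
    for x in iterable:
        total += x
        count += 1
        yield total / count  # true division in Python 3


# Last nine values of the original list, to keep the example short.
sub = [1371492, 1471407, 1709408, 1710264, 1710308,
       1710322, 1710350, 1710365, 1710375]

# Pair each value (walking backwards) with the running mean up to that value.
pairs = zip(reversed(sub), running_mean(reversed(sub)))

# Keep pairs only while the value stays above 90% of the running mean.
results = list(itertools.takewhile(lambda p: p[0] > 0.9 * p[1], pairs))
results.reverse()  # back to the original left-to-right order

for value, average in results:
    print(value, average)

values = [value for value, _ in results]
```

Note that `takewhile` drops the first failing pair, so the average at which the test failed (1680349.875 in the question's expected output) is not included in `results`.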

Answer 1 (score: 0)

sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,      1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]

def runningMean(seq, n=0, total=0): #function called recursively
    if not seq:
        return []
    total = total + int(seq[-1])
    if int(seq[-1]) < total/float(n+1) * 0.9:  # Check your condition to see if it's time to stop averaging.
        return []
    return runningMean(seq[:-1], n=n+1, total=total) + [total/float(n+1)]

avg = runningMean(sub, n = 0, total = 0)

print avg
print sub[-len(avg):]
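The recursive version copies the list on every call (`seq[:-1]`) and uses one stack frame per element, so very long inputs can run into Python's recursion limit. An iterative variant with the same stopping rule is sketched below for Python 3. The function name `running_mean_until_fail` and the `factor` parameter are hypothetical, and the sample list is the last nine values of the original.

```python
def running_mean_until_fail(seq, factor=0.9):
    """Walk seq from the end, collecting running means until a value
    falls below factor * (running mean that includes that value)."""
    means = []
    total = 0
    for count, value in enumerate(reversed(seq), start=1):
        total += value
        mean = total / count
        if value < factor * mean:
            break  # condition failed: stop averaging
        means.append(mean)
    means.reverse()  # restore the original left-to-right order
    return means


sub = [1371492, 1471407, 1709408, 1710264, 1710308,
       1710322, 1710350, 1710365, 1710375]
avg = running_mean_until_fail(sub)
# Slice from an explicit start index so that an empty avg yields an empty list.
kept = sub[len(sub) - len(avg):]
print(avg)
print(kept)
```

The explicit start index matters for one edge case: with an empty `avg`, the slice `sub[-len(avg):]` becomes `sub[-0:]`, which is the whole list rather than an empty one.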

Answer 2 (score: 0)

To get the expected running average, I ran:

sub.reverse()
avg = runningMean(sub,n = 0,total = 0) #function call to obtain running mean starting from last element in the list i,e 1710375
print avg

The comparison part that follows is unclear. Can you describe the algorithm in words?