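How can I iterate over a collection of objects to find their averages in the most efficient way? This uses only one loop (apart from the loops inside NumPy), but I would like to know whether there is a better way. At the moment I do this: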
import numpy

scores = []
ratings = []
negative_scores = []
positive_scores = []
for t in text_collection:
    scores.append(t.score)
    ratings.append(t.rating)
    if t.score < 0:
        negative_scores.append(t.score)
    elif t.score > 0:
        positive_scores.append(t.score)
print("average score:", numpy.mean(scores))
print("average rating:", numpy.mean(ratings))
print("average negative score:", numpy.mean(negative_scores))
print("average positive score:", numpy.mean(positive_scores))
Is there a better way?
Answer 0 (score: 5)
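import numpy as np
# Build one (n, 2) array, then transpose so each attribute becomes its own row.
scores, ratings = np.array([(t.score, t.rating) for t in text_collection]).T
print('average score:', np.mean(scores))
print('average rating:', np.mean(ratings))
# Boolean masks select the positive and negative scores without another loop.
print('average positive score:', np.mean(scores[scores > 0]))
print('average negative score:', np.mean(scores[scores < 0]))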
Edit:
To check whether there actually are any negative scores, you can do this:
if np.count_nonzero(scores < 0):
    print('average negative score:', np.mean(scores[scores < 0]))
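For anyone who wants to run the masking approach end to end, here is a minimal sketch. The `Text` namedtuple, the sample values, and the `masked_mean` helper are made up for illustration and are not part of the original answer; the helper returns `None` when the mask selects nothing, which is one possible way to handle the empty case.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for the objects in text_collection.
Text = namedtuple("Text", ["score", "rating"])

text_collection = [Text(-2.0, 1.0), Text(4.0, 5.0), Text(0.0, 3.0), Text(-1.0, 2.0)]

# One pass over the objects, then transpose so each attribute is its own array.
scores, ratings = np.array([(t.score, t.rating) for t in text_collection]).T


def masked_mean(values, mask):
    """Mean of values[mask], or None when the mask selects nothing."""
    selected = values[mask]
    return float(selected.mean()) if selected.size else None


avg_score = float(scores.mean())
avg_rating = float(ratings.mean())
avg_neg = masked_mean(scores, scores < 0)
avg_pos = masked_mean(scores, scores > 0)
```

Checking `selected.size` before calling `mean()` avoids the `nan` (and the runtime warning) that NumPy produces for an empty array.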
Answer 1 (score: 1)
Do you mind looping once for each value you want to pull out of the collection? It is slightly less efficient, but clearer:
avg_score = numpy.mean([t.score for t in text_collection])
avg_rating = numpy.mean([t.rating for t in text_collection])
# The filtered means average t.score (not t.rating), matching the variable names.
avg_neg_score = numpy.mean([t.score for t in text_collection if t.score < 0])
avg_pos_score = numpy.mean([t.score for t in text_collection if t.score > 0])
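If the four comprehensions feel repetitive, they can be folded into one small helper. This is only a sketch: `attr_mean`, the `Text` namedtuple, and the sample data are hypothetical, and it uses `statistics.fmean` from the standard library (Python 3.8+) in place of `numpy.mean`.

```python
from collections import namedtuple
from statistics import fmean  # standard library, Python 3.8+

# Hypothetical stand-in for the objects in text_collection.
Text = namedtuple("Text", ["score", "rating"])


def attr_mean(items, attr, keep=lambda item: True):
    """Mean of getattr(item, attr) over the items for which keep(item) is true."""
    values = [getattr(item, attr) for item in items if keep(item)]
    if not values:
        raise ValueError("no items matched, so the mean is undefined")
    return fmean(values)


text_collection = [Text(-2.0, 1.0), Text(4.0, 5.0), Text(0.0, 3.0), Text(-1.0, 2.0)]

avg_score = attr_mean(text_collection, "score")
avg_rating = attr_mean(text_collection, "rating")
avg_neg_score = attr_mean(text_collection, "score", lambda t: t.score < 0)
avg_pos_score = attr_mean(text_collection, "score", lambda t: t.score > 0)
```

It still loops once per average, so the trade-off is the same as above: a little less efficient, but each line says exactly what it computes.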
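Answer 2 (score: 0)
If you have NumPy, I think it is your best option. It does exactly what you want, and its function names document what you are doing.
If you want a pure-Python solution: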
def mean(seq):
    count = 0
    total = 0.0  # named total so the built-in sum() is not shadowed
    for x in seq:
        total += x
        count += 1
    if count == 0:
        raise ValueError("cannot take mean of zero-length sequence")
    return total / count
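I wrote this to work on any sequence, including things like generator expressions that compute their values on the fly. That is why it runs through the sequence only once and keeps its own counter, so that it knows how many items there were. If you are sure you only ever want the mean of a list: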
def list_mean(lst):
    if len(lst) == 0:
        raise ValueError("cannot take mean of zero-length list")
    return float(sum(lst)) / len(lst)
If you call it on an iterator or a generator expression, len() will not work and you will get a TypeError exception.
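To make the difference concrete, here is a small sketch that feeds a generator expression to the one-pass version and then shows `len()` rejecting the same kind of object. The sample generator is made up for illustration.

```python
def mean(seq):
    """One-pass mean that works for any iterable, including generators."""
    count = 0
    total = 0.0
    for x in seq:
        total += x
        count += 1
    if count == 0:
        raise ValueError("cannot take mean of zero-length sequence")
    return total / count


# A generator expression yielding 1, 4, 9, 16; it has no length.
squares = (n * n for n in range(1, 5))
gen_mean = mean(squares)

# len() is not defined for generators, so a len()-based mean cannot accept one.
try:
    len(n * n for n in range(1, 5))
    len_failed = False
except TypeError:
    len_failed = True
```

Note that a generator is consumed by the first pass, so it cannot be reused for a second average without being recreated.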
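Answer 3 (score: 0)
You can get avg_score from avg_neg_score and avg_pos_score with a simple calculation (this assumes no score is exactly 0, since such scores are in neither list):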
nneg = len(negative_scores)
npos = len(positive_scores)
avg_score = (avg_neg_score * nneg + avg_pos_score * npos) / (nneg + npos)
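A quick numeric check of this recombination, with made-up numbers. It also shows the one case where the shortcut stops matching the true mean: a score of exactly 0 belongs to neither list, so it is missing from both the numerator and the count.

```python
import math

# Illustrative values only.
negative_scores = [-2.0, -1.0]
positive_scores = [4.0, 2.0, 6.0]

nneg = len(negative_scores)
npos = len(positive_scores)
avg_neg_score = sum(negative_scores) / nneg
avg_pos_score = sum(positive_scores) / npos

# Weighted recombination of the two group means.
recombined = (avg_neg_score * nneg + avg_pos_score * npos) / (nneg + npos)
direct = sum(negative_scores + positive_scores) / (nneg + npos)
matches_without_zeros = math.isclose(recombined, direct)

# Add one score of exactly 0: the recombined value no longer equals
# the mean over all scores, because the count is now one too small.
all_scores = negative_scores + positive_scores + [0.0]
direct_with_zero = sum(all_scores) / len(all_scores)
matches_with_zero = math.isclose(recombined, direct_with_zero)
```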
Edit: if you are building the arrays by iterating over text_collection anyway, this will be more efficient (assuming you only want the means):
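n = len(text_collection)
(npos, sumpos) = (0, 0.0)  # float sums avoid integer division on Python 2
(nneg, sumneg) = (0, 0.0)
sumrating = 0.0
for t in text_collection:
    sumrating += t.rating
    if t.score < 0:
        sumneg += t.score
        nneg += 1
    elif t.score > 0:  # a score of exactly 0 is neither negative nor positive
        sumpos += t.score
        npos += 1
avg_score = (sumneg + sumpos) / n  # zero scores add nothing to the sum
avg_neg_score = sumneg / nneg  # ZeroDivisionError if there are no negative scores
avg_pos_score = sumpos / npos  # ZeroDivisionError if there are no positive scores
avg_rating = sumrating / n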
Edit 2: fixed avg_neg_rating to avg_neg_score ...
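The running-sum loop above can also be wrapped in a function that guards the empty cases. This is a sketch under my own assumptions: `summarize`, the returned dictionary keys, the `Text` namedtuple, and the choice to return `None` for an empty group are all illustrative, not part of the original code.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects in text_collection.
Text = namedtuple("Text", ["score", "rating"])


def summarize(text_collection):
    """Return the four averages from a single pass over text_collection."""
    n = nneg = npos = 0
    sumrating = sumneg = sumpos = 0.0
    for t in text_collection:
        n += 1  # counting here lets the function accept generators too
        sumrating += t.rating
        if t.score < 0:
            sumneg += t.score
            nneg += 1
        elif t.score > 0:
            sumpos += t.score
            npos += 1
    if n == 0:
        raise ValueError("cannot summarize an empty collection")
    return {
        "avg_score": (sumneg + sumpos) / n,
        "avg_rating": sumrating / n,
        # None marks a group with no members instead of dividing by zero.
        "avg_neg_score": sumneg / nneg if nneg else None,
        "avg_pos_score": sumpos / npos if npos else None,
    }


result = summarize([Text(-2.0, 1.0), Text(4.0, 5.0), Text(0.0, 3.0), Text(-1.0, 2.0)])
only_positive = summarize(Text(s, 1.0) for s in (1.0, 3.0))
```

Because no intermediate lists are built, memory use stays constant regardless of how large the collection is.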