Computing an average with MapReduce

Time: 2019-04-11 16:06:20

Tags: python mapreduce

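I am practicing writing some simple MapReduce code in Python using mapper, combiner, and reducer methods. The input data are (StudentID, grade) tuples, and the output of the MapReduce job should be the students' average grade.

I would like to know whether I am using the combiner and reducer methods correctly. Are there any obvious mistakes?

Any feedback is much appreciated.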

# MapReduce code that computes an average
# Input: (studentID, grade)
# Output: compute module average

# Emit key,value pairs
def mapper(self, _, tuple):
    elements = tuple.split("")
    studentGrade = elements[1]
    yield (studentGrade, 1)

# Combiner should do some aggregate computation
# Say key is 100, and values = 1,1,1,1
def combiner(self, grade, values):
    count = 0
    sum = 0
    for i in values:
        count = count + i
        sum = sum + grade
        # At the end of the for loop sum=400 and count=4
    yield (sum, count)

# Combine aggregate and compute average
def reducer(self, grade, value):
    count = 0
    sum = 0
    for i in value:
        count = count + i
        sum = sum + grade
        average = sum / count
        yield ("average", average)
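
For comparison, here is a minimal sketch of one common way to make an average safe to use with a combiner. An average of averages is generally not the overall average, so each stage passes along partial (sum, count) pairs under an unchanged key, and only the reducer divides, once, after its loop has finished. The sketch is plain Python and does not use any MapReduce framework. The comma-separated "studentID,grade" input format, keying by student ID, and the helpers `group_by_key` and `run_job` (stand-ins for the framework's shuffle phase and job driver) are all assumptions made for illustration.

```python
from collections import defaultdict


def mapper(line):
    # Assumed input format: "studentID,grade" (comma-separated).
    student_id, grade = line.split(",")
    # Emit a partial (sum, count) pair so later stages can merge it.
    yield student_id.strip(), (float(grade), 1)


def combiner(student_id, values):
    # Merge partial (sum, count) pairs. The output has the same key and the
    # same value shape as the mapper output, so the result stays correct
    # whether the combiner runs zero, one, or several times.
    total, count = 0.0, 0
    for partial_sum, partial_count in values:
        total += partial_sum
        count += partial_count
    yield student_id, (total, count)


def reducer(student_id, values):
    # Merge the remaining partials, then divide exactly once after the loop.
    total, count = 0.0, 0
    for partial_sum, partial_count in values:
        total += partial_sum
        count += partial_count
    yield student_id, total / count


def group_by_key(pairs):
    # Hypothetical stand-in for the framework's shuffle/sort phase.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def run_job(splits):
    # Hypothetical driver: each split plays the role of one mapper task
    # whose output is combined locally before the shuffle.
    combined = []
    for split in splits:
        mapped = [pair for line in split for pair in mapper(line)]
        for key, values in group_by_key(mapped).items():
            combined.extend(combiner(key, values))
    results = {}
    for key, values in group_by_key(combined).items():
        for out_key, average in reducer(key, values):
            results[out_key] = average
    return results


if __name__ == "__main__":
    splits = [["s1,80", "s1,90", "s2,70"], ["s1,100", "s2,50"]]
    print(run_job(splits))  # {'s1': 90.0, 's2': 60.0}
```

To get a single module-wide average instead of one average per student, the mapper could emit one constant key (for example "average") in place of the student ID; the combiner and reducer would stay the same.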

0 answers:

No answers