我正在练习使用mapper,combiner和reducer方法在Python中编写一些简单的MapReduce代码。输入数据为(StudentID,grade)的元组,而MapReduce Job的输出应为学生的平均成绩。
我想知道我对合并器和缩减器方法的使用是否正确。是否有任何明显的错误?
任何反馈都非常感谢。
#MapReduce Code that computes average
#input: (studentID, grade)
#Output: compute module average
#Emit key,value pairs
def mapper(self, _, tuple):
elements = tuple.split("")
studentGrade = elements[1]
yield(studentGrade,1)
#Combiner should do some aggregate computation
#Say key is 100, and values = 1,1,1,1
def combiner(self,grade,values):
count=0
sum=0
for i in values:
count = count + i
sum = sum + grade
#At the end of the for loop sum=400 and count=4
yield(sum,count)
#Combine aggregate and compute average
def reducer(self,grade,value):
count=0
sum=0
for i in value:
count=count+i
sum=sum+grade
average = sum/count
yield("average",average)