What does reduce() do in MRJob when there is no mapper()?

Asked: 2015-04-26 07:36:13

Tags: python-2.7 hadoop mrjob

I am new to Python and am trying to build a recommender system by following the instructions at http://www.yekeren.com/blog/archives/1005. What confuses me is:

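    # Methods of the MRJob subclass; they need "import json" and "import math" at module level.
    def reducer3_init(self):
        # Load the item popularity table: one "key<TAB>value" line per movie.
        self.pop = {}

        with open(self.options.item_pop, "r") as f:
            for line in f:
                movieid_jstr, pop_jstr = line.strip().split("\t")
                # Both fields are JSON-encoded strings; decode them to Python values.
                movieid = json.loads(movieid_jstr)
                pop = json.loads(pop_jstr)

                self.pop[movieid] = pop

    def reducer3(self, key, values):
        # Divide the summed values by the geometric mean of the two items' popularity.
        yield key, sum(values) / math.sqrt(self.pop[key[0]] * self.pop[key[1]])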

reducer3 has no corresponding mapper, so how does it get executed? And what does json.loads() do? Thanks a lot!

1 Answer:

Answer 0: (score: 0)

The documentation says:

  

class mrjob.step.MRStep(**kwargs)

Used by MRJob.steps. See Multi-step jobs for sample usage.
     

Accepts the following keyword arguments. Parameters:

mapper – function with same function signature as mapper(), or None for an identity mapper.
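To make "identity mapper" concrete, here is a minimal pure-Python sketch of one map/shuffle/reduce step. It is not mrjob itself: `run_step`, `identity_mapper`, the `pop` table, and the sample pairs are all hypothetical stand-ins. In the real job the equivalent step would presumably be declared as something like `MRStep(reducer_init=self.reducer3_init, reducer=self.reducer3)`, with no `mapper` argument.

```python
import math
from collections import defaultdict


def identity_mapper(key, value):
    # What a step effectively does when no mapper is supplied:
    # every (key, value) pair is passed through unchanged.
    yield key, value


def run_step(pairs, reducer, mapper=None):
    # Hypothetical stand-in for one step: map, group by key, then reduce.
    mapper = mapper or identity_mapper
    groups = defaultdict(list)
    for key, value in pairs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


# Hypothetical popularity table, standing in for self.pop in the question.
pop = {"a": 4, "b": 9}


def reducer3(key, values):
    # Same formula as in the question, using the module-level table.
    yield key, sum(values) / math.sqrt(pop[key[0]] * pop[key[1]])


# Pretend output of the previous step. Tuples are used as keys here only
# so that they can be grouped in a dict in this simulation.
previous_step_output = [(("a", "b"), 1), (("a", "b"), 2), (("a", "b"), 3)]

result = run_step(previous_step_output, reducer3)
print(result)
```

The point of the sketch is that the reducer never needs its own mapper: whatever the previous step emitted is grouped by key and handed to it directly.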

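This is a common idiom: the map step defaults to the identity function and the reduce step defaults to flattening. With no mapper given, reducer3 simply receives the key/value pairs emitted by the previous step, unchanged and grouped by key.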
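As for the second part of the question, `json.loads()` parses a JSON-encoded string into the corresponding Python value. A small sketch of what `reducer3_init` does with each line, assuming the popularity file holds one JSON-encoded key and one JSON-encoded value per line separated by a tab (the sample lines and the `parse_line` helper are made up for illustration):

```python
import json


def parse_line(line):
    # Split "key<TAB>value", then decode each JSON-encoded field.
    key_jstr, value_jstr = line.strip().split("\t")
    return json.loads(key_jstr), json.loads(value_jstr)


# Hypothetical lines, in the shape the question's code expects.
sample_lines = ['"101"\t12\n', '"205"\t3\n']

# Build the lookup table the same way reducer3_init fills self.pop.
pop = dict(parse_line(line) for line in sample_lines)
print(pop)
```

Here the JSON text `"101"` decodes to a Python string and `12` to an integer, which is why the question's code decodes both fields before storing them in `self.pop`.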