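How do I get counter results in AppEngine MapReduce?

Time: 2014-05-21 05:09:46

Tags: python google-app-engine mapreduce

I am using the Google mapreduce library to process my data. While the data is being processed, counters can be incremented from the mapper function, but I don't know how to read the counter results in the finalized method.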

from mapreduce import base_handler
from mapreduce import mapreduce_pipeline
from mapreduce import operation


def mapper(obj):
    yield obj
    # Count every entity that passes through the mapper.
    yield operation.counters.Increment("process-obj")


class Test(base_handler.PipelineBase):
    """A pipeline to ingest log as CSV in Google Storage
    """

    def run(self, setting_id):
        filepath = yield mapreduce_pipeline.MapperPipeline(
            "test",
            "mapper",
            "mapreduce.input_readers.DatastoreInputReader",
            output_writer_spec="mapreduce.output_writers.FileOutputWriter",
            params={},
            shards=10
        )

    def finalized(self):
        # how to read the counter process-obj
        # how to get the setting_id
        pass

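1 Answer:

Answer 0 (score: 2)

Named outputs may be what you are looking for. You can find more details here.

Here is the code using a named output to retrieve the counters, including the one you defined: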

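import logging

from mapreduce import base_handler
from mapreduce import mapreduce_pipeline
from mapreduce import operation


def mapper(obj):
    yield obj
    # Count every entity that passes through the mapper.
    yield operation.counters.Increment("process-obj")


class Test(base_handler.PipelineBase):
    """A pipeline to ingest log as CSV in Google Storage
    """

    # Declare a named output so the counters are available in finalized().
    output_names = ['counters']

    def run(self, setting_id):
        results = yield mapreduce_pipeline.MapperPipeline(
            "test",
            "mapper",
            "mapreduce.input_readers.DatastoreInputReader",
            output_writer_spec="mapreduce.output_writers.FileOutputWriter",
            params={},
            shards=10
        )
        # Hand the mapper job's counters to a child pipeline that fills
        # the 'counters' named output.
        yield MapreduceResult(results.counters)

    def finalized(self):
        # self.outputs.counters is a slot; .value holds the filled result.
        logging.info('Counters here: %s', self.outputs.counters.value)


class MapreduceResult(base_handler.PipelineBase):

    # This child fills a 'counters' output, so it declares that name too.
    output_names = ['counters']

    def run(self, counters):
        self.fill(self.outputs.counters, counters)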
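
To pull out just the "process-obj" counter from that output, something like the sketch below may help. It assumes the filled counters output is a plain dict mapping counter names to integer values. The helper name get_counter and the sample values are hypothetical and only there for illustration, so check what your version of the library actually returns.

```python
def get_counter(counters, name, default=0):
    """Return one counter value from a counters mapping.

    Assumes `counters` is a dict of counter name -> integer value.
    Falls back to `default` if the mapping is empty or the name is absent.
    """
    if not counters:
        return default
    return counters.get(name, default)


# Hypothetical sample data standing in for the real counters output.
sample_counters = {
    "process-obj": 1250,
    "mapper-calls": 1250,
}

processed = get_counter(sample_counters, "process-obj")
print("processed objects: %d" % processed)
```

Inside finalized you would then call something along the lines of get_counter(self.outputs.counters.value, "process-obj"). As for setting_id, the Pipeline base class keeps the constructor arguments on self.args and self.kwargs as far as I know, so self.args[0] should hold it when it was passed positionally, but verify that against the Pipeline API version you are running.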