Question

Just as the title says. I was reading Yet Another Language Geek: Continuation-Passing Style, and I started wondering whether MapReduce can be categorized as a form of Continuation-Passing Style, also known as CPS.

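I am also wondering how CPS could make use of more than one computer to perform a complex computation. Maybe CPS makes it easier to work with the Actor model.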

Answer 1

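I wouldn't say so. MapReduce does execute user-defined functions, but these are better described as "callbacks". I think CPS is a very abstract concept that is usually used to model the behavior of better-known concepts such as functions, coroutines, callbacks, and loops. It is generally not used directly.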

Then again, I may be confusing CPS with continuations themselves. I'm not an expert on either one.

Answer 2

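I would say they are opposites. MapReduce obviously lends itself to distribution, since Map can work on subtasks independently. With CPS, you write a recursive function in which each call waits for a smaller case to return.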

I think CPS is one of the programming techniques that Guy Steele describes as things we need to outgrow and forget, in his talk The Future of Parallel: What's a Programmer to do?

Answer 3

Both CPS and MapReduce make use of higher-order functions. This means that both involve functions that take functions as arguments.

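In the case of CPS, a function takes an extra argument (called a continuation), itself a function, that says what to do with the result. Typically (but not always) the continuation is used once. It is a function that specifies how the whole of the rest of the computation should continue. This also makes CPS a serial kind of thing. Typically you have one thread of execution, and the continuation specifies how it is going to continue.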
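As a minimal sketch of what this looks like in practice (the names `factorial_cps` and `identity` are made up for illustration and do not come from the answer), here is a factorial written in CPS in Python. Each call builds a new continuation describing the rest of the computation, and that continuation is invoked exactly once:

```python
def factorial_cps(n, k):
    """Compute n! and hand the result to the continuation k."""
    if n == 0:
        # Base case: nothing left to compute, so continue with 1.
        return k(1)
    # The new continuation says what to do with (n - 1)!:
    # multiply it by n, then carry on with the original continuation k.
    return factorial_cps(n - 1, lambda result: k(n * result))


def identity(x):
    """The final continuation: just return the result."""
    return x


print(factorial_cps(5, identity))  # 120
```

Note that there is a single chain of control here: every step hands off to the next one, which is the serial character described above.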

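In the case of MapReduce, you provide function arguments that are used multiple times. These argument functions don't really represent the whole of the rest of the computation; they are just small building blocks that are used over and over again. The "over and over" part can often be distributed over multiple machines, which makes it a parallel kind of thing.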
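For contrast, here is a rough single-process sketch of the MapReduce shape. The helper `map_reduce` and the word-count functions are hypothetical names chosen for this example, and a real framework would spread the loop over many machines. The point is only that `map_fn` and `reduce_fn` are called many times as building blocks, and neither one represents "the rest of the computation":

```python
from collections import defaultdict
from functools import reduce


def map_reduce(records, map_fn, reduce_fn):
    """Apply map_fn to every record, group values by key, fold each group."""
    groups = defaultdict(list)
    for record in records:                 # map_fn is called once per record
        for key, value in map_fn(record):
            groups[key].append(value)
    # reduce_fn is called repeatedly within each group of values
    return {key: reduce(reduce_fn, values) for key, values in groups.items()}


def count_words(line):
    """Map step: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]


def add(a, b):
    """Reduce step: combine two partial counts."""
    return a + b


word_counts = map_reduce(["a b a", "b c"], count_words, add)
print(word_counts)  # {'a': 2, 'b': 2, 'c': 1}
```

Because each call to `count_words` is independent of the others, the calls could in principle run on different machines, which is what makes this style parallel-friendly.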

So you are right to see a commonality. But one is not really an example of the other.

Answer 4

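Map-reduce is an implementation. The coding interface that lets you use that implementation could use continuations; it is really a question of how the framework and job control are abstracted. Consider declarative interfaces for Hadoop such as Pig, or declarative languages in general such as SQL; the machinery below the interface can be implemented in many ways.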

For example, here is an abstracted Python map-reduce implementation:

def mapper(input_tuples):
    "Return a generator of items with qualifying keys, keyed by item.key"
    # we are seeing a partition of input_tuples
    return ((item.key, item) for (key, item) in input_tuples if key > 1)

def reducer(input_tuples):
    "Return a generator of items with qualifying keys"
    # we are seeing a partition of input_tuples
    return (item for (key, item) in input_tuples if key != 'foo')

def run_mapreduce(input_tuples):
    # partition() and sort() are deliberately left undefined here:
    # they stand for the framework's machinery
    # partitioning is magically run across boxes
    mapper_inputs = partition(input_tuples)
    # each mapper is magically run on a separate box
    mapper_outputs = (mapper(mapper_input) for mapper_input in mapper_inputs)
    # partitioning and sorting is magically run across boxes
    reducer_inputs = partition(
        sort(output for output in mapper_outputs))
    # each reducer is magically run on a separate box
    reducer_outputs = (reducer(reducer_input) for reducer_input in reducer_inputs)
    return reducer_outputs

Here is the same implementation using coroutines, with even more of the magical abstraction hidden away:

def mapper_reducer(input_tuples):
    # we are seeing a partition of input_tuples
    # yield mapper output to caller, get reducer input
    reducer_input = yield (
        (item.key, item) for (key, item) in input_tuples if key > 1)
    # we are seeing a partition of reducer_input tuples again, but the
    # caller of this continuation has partitioned and sorted
    # yield reducer output to caller
    yield (item for (key, item) in reducer_input if key != 'foo')
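To make the caller's side of that coroutine concrete, here is a small runnable sketch. It is an assumption-laden simplification, not the answer's framework: the data are plain `(key, value)` tuples, the map step filters on the value rather than on the key so that the types stay consistent, the hypothetical `run` function plays the role of the framework, and `sorted` stands in for the distributed partition-and-sort step.

```python
def mapper_reducer(input_tuples):
    # First yield: hand the mapper output to the caller, and receive
    # the partitioned and sorted reducer input back via send().
    reducer_input = yield [(key, value) for (key, value) in input_tuples
                           if value > 1]
    # Second yield: hand the reducer output to the caller.
    yield [value for (key, value) in reducer_input if key != 'foo']


def run(input_tuples):
    """Hypothetical driver standing in for the framework."""
    coroutine = mapper_reducer(input_tuples)
    mapped = next(coroutine)         # run the coroutine up to the first yield
    shuffled = sorted(mapped)        # stand-in for partition + sort
    return coroutine.send(shuffled)  # resume it; returns the second yield


result = run([('foo', 5), ('bar', 1), ('baz', 3), ('bar', 2)])
print(result)  # [2, 3]
```

The generator is suspended between the two yields, so the driver effectively holds "the rest of the job" as a resumable object, which is where the resemblance to a continuation comes from.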

Is MapReduce a form of Continuation-Passing Style (CPS)?
