Question

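I am testing a simple MapReduce application, and I am trying to understand what happens when I iterate over the input values of a reduce call.

Here is the piece of code that behaves strangely.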

public void reduce(Text key, Iterable<E> values, Context context)
    throws IOException, InterruptedException{

    // Named statesIter so that it matches the uses below
    Iterator<E> statesIter = values.iterator();
    E first = statesIter.next();

    while(statesIter.hasNext()){
        E state = statesIter.next();

        System.out.println(first.toString());
        // some other stuff
    }
    // some other stuff
}

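Nothing unusual so far, except that each println call actually prints a different string. In other words, the object referenced by first changes every time next() is called.

Why does this happen?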

Answer 1

This is a bit counterintuitive, but it is actually documented in the API docs: Hadoop reuses the key/value objects, so if you want to keep them, you should clone them.
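
To make the effect concrete, here is a minimal sketch in plain Java that imitates the reuse without depending on Hadoop. The names State, ReusingIterable and ReuseDemo are hypothetical stand-ins invented for this illustration, not Hadoop classes. The iterator hands back the same mutable object on every next() call and only overwrites its contents, which is the behavior described above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical mutable value type standing in for a Hadoop value class.
class State {
    private String name;

    State(String name) { this.name = name; }

    // Copy constructor: takes a snapshot of another State's current contents.
    State(State other) { this.name = other.name; }

    void set(String name) { this.name = name; }

    @Override
    public String toString() { return name; }
}

// Hypothetical Iterable that returns the SAME State instance from every
// next() call, overwriting its contents each time (an assumption made to
// imitate the object reuse, not Hadoop's actual implementation).
class ReusingIterable implements Iterable<State> {
    private final List<String> raw;
    private final State shared = new State("");

    ReusingIterable(List<String> raw) { this.raw = raw; }

    @Override
    public Iterator<State> iterator() {
        final Iterator<String> it = raw.iterator();
        return new Iterator<State>() {
            @Override
            public boolean hasNext() { return it.hasNext(); }

            @Override
            public State next() {
                shared.set(it.next()); // mutate in place
                return shared;         // always the same reference
            }
        };
    }
}

class ReuseDemo {
    // Keeps the reference returned by the first next() call.
    // That object is overwritten by every later next() call.
    static String firstWithoutCopy(Iterable<State> values) {
        Iterator<State> statesIter = values.iterator();
        State first = statesIter.next();
        while (statesIter.hasNext()) {
            statesIter.next();
        }
        return first.toString();
    }

    // Copies the first value before advancing, so it stays stable.
    static String firstWithCopy(Iterable<State> values) {
        Iterator<State> statesIter = values.iterator();
        State first = new State(statesIter.next());
        while (statesIter.hasNext()) {
            statesIter.next();
        }
        return first.toString();
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(firstWithoutCopy(new ReusingIterable(raw))); // prints "c"
        System.out.println(firstWithCopy(new ReusingIterable(raw)));    // prints "a"
    }
}
```

In a real reducer the copy would be made with whatever the value type supports. For a Text value that could be new Text(value), and for a general Writable something like WritableUtils.clone(value, context.getConfiguration()) should work, though which one fits depends on the type E in the question.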
