MongoDB reduce phase not working as expected

Date: 2014-03-12 13:49:38

Tags: java mongodb mapreduce

I followed a Java tutorial on mapReduce programming with MongoDB and ended up with the following code:

package mapReduceExample;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MapReduceCommand;
import com.mongodb.MapReduceOutput;
import com.mongodb.Mongo;

public class MapReduceExampleMain {

    /**
     * @param args
     */
    public static void main(String[] args) {

        Mongo mongo;

        try {
            mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("library");

            DBCollection books = db.getCollection("books");

            BasicDBObject book = new BasicDBObject();
            book.put("name", "Understanding JAVA");
            book.put("pages", 100);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding JSON");
            book.put("pages", 200);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding XML");
            book.put("pages", 300);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Web Services");
            book.put("pages", 400);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Axis2");
            book.put("pages", 150);
            books.insert(book);

            String map = "function()"
                    + "{ "
                        + "var category; "
                        + "if ( this.pages > 100 ) category = 'Big Books'; "
                        + "else category = 'Small Books'; "
                        + "emit(category, {name: this.name});"
                    + "}";

            String reduce = "function(key, values)"
                    + "{"
                        + "return {books: values.length};"
                    + "} ";

            MapReduceCommand cmd = new MapReduceCommand(books, map, reduce,
                    null, MapReduceCommand.OutputType.INLINE, null);

            MapReduceOutput out = books.mapReduce(cmd);

            for (DBObject o : out.results()) {
                System.out.println(o.toString());
            }

            // clean up
            db.dropDatabase();

        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }



    }
}

This is a very simple reduce phase, but it does not do what I want :(

The output is:

{ "_id" : "Big Books" , "value" : { "books" : 4.0}}
{ "_id" : "Small Books" , "value" : { "name" : "Understanding JAVA"}}

I expected this:

{ "_id" : "Big Books" , "value" : { "books" : 4.0}}
{ "_id" : "Small Books" , "value" : { "books" : 1.0}}

Why doesn't the reduce phase return values.length in the case of the small books?

Regards, Andre

2 answers:

Answer 0: (score: 1)

Because reduce is never run for a key that has only one emitted value. Move that logic into a finalize function or something similar.
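
A commonly used alternative to a finalize function is to have the map function emit a value that already has the shape of the final result, so a key with only one emitted value still looks right even though reduce is skipped. The sketch below is one possible way to do that, not necessarily what the answerer had in mind. The class name CountingMapReduce and the constant names are hypothetical; the two strings are meant to be passed to the MapReduceCommand constructor in place of the question's map and reduce.

```java
public class CountingMapReduce {

    // Map: emit a partial count of 1 per book instead of the book's name.
    public static final String MAP = "function() {"
            + " var category = this.pages > 100 ? 'Big Books' : 'Small Books';"
            + " emit(category, {books: 1});"
            + " }";

    // Reduce: sum the partial counts. The result has the same shape as the
    // emitted values, so it stays correct if reduce is skipped or re-run.
    public static final String REDUCE = "function(key, values) {"
            + " var total = 0;"
            + " for (var i = 0; i < values.length; i++) { total += values[i].books; }"
            + " return {books: total};"
            + " }";

    public static void main(String[] args) {
        // Print the JavaScript so it can be inspected or pasted into the mongo shell.
        System.out.println(MAP);
        System.out.println(REDUCE);
    }
}
```

With these functions, the single value emitted for 'Small Books' is already a count of 1, so it no longer matters that reduce is not called for that key.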

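Answer 1: (score: 1)

This comes down to a basic understanding of how mapReduce works.

Let's introduce the concepts of mapReduce:

  • mapper - the stage that emits the data to be fed into the reduce stage. It requires a key and a value in order to emit. You can emit multiple times in a mapper if you want, but the requirement stays the same.

  • reducer - a reducer is called when there is more than one value for a given key, to process the list of values that have been emitted for that key.

That said, since the mapper emitted only one value for that key, the reducer was not called.

You can clean this up in finalize, but passing a single emitted value straight through from the mapper is the behavior by design.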
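
To make the rule above concrete, here is a small plain-Java simulation of it. It does not use MongoDB at all: the class ReduceSkipDemo and its methods are hypothetical stand-ins that only mimic the grouping and the "reduce runs only for keys with more than one value" behavior described in this answer, with emitted values modeled as simple strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReduceSkipDemo {

    // Mirrors the question's map function: classify a book by page count.
    public static String category(int pages) {
        return pages > 100 ? "Big Books" : "Small Books";
    }

    // Mirrors the question's reduce function: report how many values arrived.
    public static String reduce(String key, List<String> values) {
        return "books=" + values.size();
    }

    public static Map<String, String> mapReduce(String[] names, int[] pages) {
        // Map phase: group the emitted values by key.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            grouped.computeIfAbsent(category(pages[i]), k -> new ArrayList<>())
                   .add("name=" + names[i]);
        }
        // Reduce phase: invoked only for keys with more than one value;
        // a single emitted value is passed through unchanged.
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            List<String> values = e.getValue();
            result.put(e.getKey(),
                    values.size() > 1 ? reduce(e.getKey(), values) : values.get(0));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"Understanding JAVA", "Understanding JSON", "Understanding XML",
                "Understanding Web Services", "Understanding Axis2"};
        int[] pages = {100, 200, 300, 400, 150};
        System.out.println(mapReduce(names, pages));
    }
}
```

Run with the question's five books, the 'Big Books' key goes through reduce and becomes a count, while 'Small Books' keeps the raw value emitted by the mapper, which matches the output shown in the question.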