Question

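I am trying to run a MongoDB aggregation query from Java, but the result exceeds the 16MB buffer size. Is there a way to adjust the buffer size, or is there any other workaround? I do not have the option of creating a collection on the Mongo server, and I do not have any Mongo utilities such as mongo.exe or mongoExport.exe on my client system.

Here is a small part of the code: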

if (!datasetObject?.isFlat && jsonFor != 'collection-grid') {
    // mongoPipeline = new AggregateArgs(Pipeline = pipeline, AllowDiskUse = true, OutputMode = AggregateOutputMode.Cursor)
    output = dataSetCollection.aggregate(pipeline)
} else {
    output = dataSetCollection.aggregate(project)
}

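I have 100K records with 30 fields each. When I query 5 fields for all 100K records, I get the result successfully. But when I query all fields for the 100K records, it throws an error.

The problem is that when I try to retrieve all documents in the collection, including all of their fields, the result exceeds the 16MB size limit.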
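A rough back-of-the-envelope check shows why 5 fields can fit while 30 fields do not. This is only a sketch: the class name, the helper methods, and the average of 20 bytes per field are assumptions made for illustration, not measurements from the actual collection.

```java
public class InlineResultEstimate {
    // BSON document size limit: 16 MB.
    static final long MAX_BSON_BYTES = 16L * 1024 * 1024;

    // Rough size of an inline result: documents * fields * assumed average bytes per field.
    static long estimateBytes(long documents, int fieldsPerDocument, int avgBytesPerField) {
        return documents * fieldsPerDocument * avgBytesPerField;
    }

    // True if the estimated inline result stays within the 16 MB limit.
    static boolean fitsInline(long documents, int fieldsPerDocument, int avgBytesPerField) {
        return estimateBytes(documents, fieldsPerDocument, avgBytesPerField) <= MAX_BSON_BYTES;
    }

    public static void main(String[] args) {
        int avgBytesPerField = 20; // assumption, not measured

        // 100,000 * 5 * 20 = 10,000,000 bytes, below 16,777,216
        System.out.println(fitsInline(100_000, 5, avgBytesPerField));  // prints "true"

        // 100,000 * 30 * 20 = 60,000,000 bytes, above 16,777,216
        System.out.println(fitsInline(100_000, 30, avgBytesPerField)); // prints "false"
    }
}
```

Under these assumed sizes, the whole result set has to fit into a single 16MB reply only when the aggregation returns its results inline; that is the case the error message describes.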

Actual error:

com.mongodb.CommandFailureException: { "serverUsed" : "127.0.0.1:27017" , "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)" , "code" : 16389 , "ok" : 0.0

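How can I resolve this issue?

I am using MongoDB 3.0.6.

Note: GridFS does not fit my requirements, because I need to retrieve all the documents in one request, not a single document.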

Answer 1

When running the aggregation, you can tell Mongo to return a cursor. With the new API in the 3.0 Java driver, it looks like this:

// Assuming MongoCollection
dataSetCollection.aggregate(pipeline).useCursor(true)

You may also need to tell it to use disk space on the server instead of doing everything in memory:

// Assuming MongoCollection
dataSetCollection.aggregate(pipeline).useCursor(true).allowDiskUse(true)

If you are using an older driver (or the old API in the new driver), those two options look like this:

// Assuming DBCollection; with these options aggregate() returns a Cursor
dataSetCollection.aggregate(pipeline, AggregationOptions.builder()
        .allowDiskUse(true)
        .outputMode(AggregationOptions.OutputMode.CURSOR)
        .build())

Answer 2

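There are two ways to solve this problem.

1) Use $out to create a new collection and write the results to it. This is not a good idea, because the process is time-consuming and complicated to implement.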

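import com.mongodb.AggregationOutput;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

import java.net.UnknownHostException;

public class JavaAggregation {
    public static void main(String args[]) throws UnknownHostException {

        MongoClient mongo = new MongoClient();
        DB db = mongo.getDB("databaseName");

        DBCollection coll = db.getCollection("dataset");

        /*
            MONGO SHELL :
            db.dataset.aggregate([
                { "$match": { isFlat : true } },
                { "$out": "datasetTemp" }
            ])
        */

        DBObject match = new BasicDBObject("$match", new BasicDBObject("isFlat", true));
        DBObject out = new BasicDBObject("$out", "datasetTemp");

        // Runs the pipeline; $out writes the matching documents into datasetTemp
        AggregationOutput output = coll.aggregate(match, out);

        // Read the results back from the temporary collection with a regular query
        DBCollection tempColl = db.getCollection("datasetTemp");
        DBCursor cursor = tempColl.find();

        try {
            while (cursor.hasNext()) {
                System.out.println(cursor.next());
            }
        } finally {
            cursor.close();
        }
    }
}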

2) Using allowDiskUse(true) is very simple and not time-consuming.

// Groovy (note the use of def), as in a Grails application
public class JavaAggregation {
    public static void main(String args[]) throws UnknownHostException {

        MongoClient mongo = new MongoClient();
        DB db = mongo.getDB("databaseName");

        DBCollection coll = db.getCollection("dataset");

        /*
            MONGO SHELL :
            db.dataset.aggregate(
                [ { "$match": { isFlat : true } } ],
                { allowDiskUse: true, cursor: { batchSize: 100 } }
            )
        */

        DBObject match = new BasicDBObject("$match", new BasicDBObject("isFlat", true));
        List<DBObject> flatPipeline = Arrays.asList(match)

        // Cursor output mode returns the results in batches instead of one inline document
        AggregationOptions aggregationOptions = AggregationOptions.builder()
                .batchSize(100)
                .outputMode(AggregationOptions.OutputMode.CURSOR)
                .allowDiskUse(true)
                .build();
        def cursor = coll.aggregate(flatPipeline, aggregationOptions)
        try {
            while (cursor.hasNext()) {
                System.out.println(cursor.next());
            }
        } finally {
            cursor.close();
        }
    }
}
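To see why cursor output mode with batchSize(100) avoids the limit, it may help to picture the results arriving in several small replies rather than one large document. The following driver-free sketch only imitates that behaviour with an in-memory list; the class BatchSketch and the method toBatches are hypothetical names for illustration and are not part of the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Split a result list into batches of at most batchSize items,
    // imitating how a cursor hands results over in several replies.
    static <T> List<List<T>> toBatches(List<T> results, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < results.size(); start += batchSize) {
            int end = Math.min(start + batchSize, results.size());
            batches.add(new ArrayList<>(results.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // 250 stand-in "documents"
        List<Integer> results = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            results.add(i);
        }

        List<List<Integer>> batches = toBatches(results, 100);
        System.out.println(batches.size());        // prints "3"
        System.out.println(batches.get(2).size()); // prints "50"
    }
}
```

Because each reply carries only one batch, no single reply has to hold the entire result set, so the 16MB cap applies per document rather than to the whole result.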

For more details, see here and here.

Java / Grails - MongoDB aggregation 16MB buffer size limit
