Spring Data MongoDB and allowDiskUse

Date: 2014-11-10 23:27:40

Tags: java spring mongodb

I have a query like this:

db.tqaP.aggregate([
    {$match: {$and: [
        {"eventUTCDate": {$gte: '01-10-2014'}},
        {"eventUTCDate": {$lt: '31-10-2014'}},
        {"mpTransactionId": {$exists: true}},
        {testMode: false},
        {eventID: {$in: [
            230, // ContentDiscoveredEvent
            204, // ContentSLAStartEvent
            211, // ContentProcessedEndEvent
            255, // ContentValidationStatusEvent
            256, // ContentErrorEvent
            231, // ContentAnalyzedEvent
            240, // ContentTranscodeStartEvent
            241, // ContentTranscodeEndEvent
            252  // AbortJobEvent
            //205, 207
        ]}}
    ]}},
    {$project: {
        _id: 0,
        event: {
            eventID                 : "$eventID",
            eventUTCDate            : "$eventUTCDate",
            processState            : "$processState",
            jobInstanceId           : "$jobInstanceId",
            mpTransactionId         : "$mpTransactionId",
            eventUID                : "$eventUID",
            contextJobInstanceId    : "$context.jobInstanceId",
            contextValidationStatus : "$context.validationStatus",
            metaUpdateOnly          : "$metaUpdateOnly",
            errorCode               : "$errorCode",
            transcodingProfileName  : "$transcodingProfileName",
            contextAssetId          : "$context.assetId"
        }
    }},
    // Creating the hash map <mpTransactionId, listOfAssociatedEvents>
    {$group: {
        "_id"           : "$event.mpTransactionId",
        "chainOfEvents" : {$addToSet: "$event"}
    }},
    // Sorting by chainOfEvents.eventUTCDate
    {$unwind: "$chainOfEvents"},
    {$sort: {"chainOfEvents.eventUTCDate": 1}},
    {$group: {
        _id: "$_id",
        chainOfEvents: {$push: "$chainOfEvents"}
    }}
])

It runs over more than 1.2 million records and dies. The error message is:

assert: command failed: {
        "errmsg" : "exception: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.",
        "code" : 16819,
        "ok" : 0
} : aggregate failed

I got around this in the shell by adding the following between the final closing square bracket and the closing parenthesis:

,{allowDiskUse: true}
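In the shell, that second argument to aggregate() is an options document, separate from the pipeline array. As a rough picture of what this means, the sketch below builds the equivalent command as plain Java maps, assuming the MongoDB 2.6 form of the aggregate command. No MongoDB driver is involved, the class and helper names are hypothetical, and the pipeline is abbreviated to two of the stages from the query above. What it tries to show is that allowDiskUse sits next to the pipeline as a command-level option and is not a pipeline stage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; it only models the command shape with plain maps.
class AggregateCommandShape {

    // Builds a map with the same top-level keys as the aggregate command:
    // the collection name, the pipeline array, and the allowDiskUse option.
    static Map<String, Object> buildCommand(String collection,
                                            List<Map<String, Object>> pipeline,
                                            boolean allowDiskUse) {
        Map<String, Object> command = new LinkedHashMap<>();
        command.put("aggregate", collection);
        command.put("pipeline", pipeline);
        command.put("allowDiskUse", allowDiskUse); // sibling of "pipeline", not a stage
        return command;
    }

    // Wraps a single operator and its argument as one pipeline stage.
    static Map<String, Object> stage(String operator, Object argument) {
        Map<String, Object> stage = new LinkedHashMap<>();
        stage.put(operator, argument);
        return stage;
    }

    public static void main(String[] args) {
        Map<String, Object> sortSpec = new LinkedHashMap<>();
        sortSpec.put("chainOfEvents.eventUTCDate", 1);

        // Abbreviated pipeline: only the $unwind and $sort stages from the query.
        List<Map<String, Object>> pipeline = new ArrayList<>();
        pipeline.add(stage("$unwind", "$chainOfEvents"));
        pipeline.add(stage("$sort", sortSpec));

        System.out.println(buildCommand("tqaP", pipeline, true));
    }
}
```

When debugging a driver or framework call, comparing the command it actually sends against this shape is one way to check whether the option was applied at all.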

Now I am trying to express the same thing with Spring Data for MongoDB. My Java code looks like this:

MatchOperation match = Aggregation.match(new Criteria()
        .andOperator(
                Criteria.where("eventUTCDate").gte(startDateAsString),
                Criteria.where("eventUTCDate").lt(endDateAsString))
        .and("mpTransactionId").exists(true)
        .and("testMode").is(false)
        .and("eventID").in(230, 204, 211, 255, 256, 231, 240, 241, 252));

ProjectionOperation projection = Aggregation.project().and("event")
        .nested(bind("eventID", "eventID")
                .and("eventUTCDate", "eventUTCDate")
                .and("processState", "processState")
                .and("jobInstanceId", "jobInstanceId")
                .and("mpTransactionId", "mpTransactionId")
                .and("eventUID", "eventUID")
                .and("contextJobInstanceId", "context.jobInstanceId")
                .and("contextValidationStatus", "context.validationStatus")
                .and("metaUpdateOnly", "metaUpdateOnly")
                .and("errorCode", "errorCode")
                .and("transcodingProfileName", "transcodingProfileName")
                .and("contextAssetId", "context.assetId"));

GroupOperation group = Aggregation.group("event.mpTransactionId").addToSet("event").as("chainOfEvents");

UnwindOperation unwind = Aggregation.unwind("chainOfEvents");

SortOperation sort = Aggregation.sort(Sort.Direction.ASC, "chainOfEvents.eventUTCDate");

GroupOperation groupAgain = Aggregation.group("_id").push("chainOfEvents").as("eventsList");

Aggregation agg = newAggregation(Event.class, match, projection, group, unwind, sort, groupAgain)
        .withOptions(Aggregation.newAggregationOptions().allowDiskUse(true).build());
AggregationResults<EventsChain> results = mongoOps.aggregate(agg, "tqaP", EventsChain.class);

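But I get back an empty result set. The query works on smaller data sets. The only thing I added was

.withOptions(Aggregation.newAggregationOptions().allowDiskUse(true).build());

to cope with the size of the data. Can anyone tell me whether I am using it incorrectly?

I am using MongoDB 2.6.4 and Spring-Data-MongoDB version 1.6.1-RELEASE.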
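A side point that may be worth checking, independent of allowDiskUse: the $match stage compares eventUTCDate against strings such as '01-10-2014'. If the field is stored as dd-MM-yyyy strings (an assumption, since the post does not show the stored format), $gte and $lt would compare those values character by character and not chronologically, which could match far more documents than intended. The sketch below illustrates the difference using only the JDK. The class, the helper names, and the sample value are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration class; it does not talk to MongoDB.
class DateStringRange {

    private static final DateTimeFormatter DD_MM_YYYY = DateTimeFormatter.ofPattern("dd-MM-uuuu");

    // Mirrors {$gte: start, $lt: end} applied to string values:
    // a plain lexicographic comparison of the characters.
    static boolean inRangeAsString(String value, String start, String end) {
        return value.compareTo(start) >= 0 && value.compareTo(end) < 0;
    }

    // Same bounds, but parsed into real dates before comparing.
    static boolean inRangeAsDate(String value, String start, String end) {
        LocalDate v = LocalDate.parse(value, DD_MM_YYYY);
        LocalDate s = LocalDate.parse(start, DD_MM_YYYY);
        LocalDate e = LocalDate.parse(end, DD_MM_YYYY);
        return !v.isBefore(s) && v.isBefore(e);
    }

    public static void main(String[] args) {
        String start = "01-10-2014";
        String end = "31-10-2014";
        String sample = "15-03-2012"; // hypothetical value, not taken from the original data

        System.out.println(inRangeAsString(sample, start, end)); // prints true
        System.out.println(inRangeAsDate(sample, start, end));   // prints false
    }
}
```

If the stored values turn out to be real BSON dates, the bounds passed to gte and lt would likely need to be date values as well, not strings.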

1 Answer:

Answer 0 (score: 0)

Here is a working solution with version 2.1.8, using the MongoTemplate helper class.

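// m1, p1 and g1 are AggregationOperation instances built elsewhere (not shown in this answer).
AggregationOptions options = AggregationOptions.builder().allowDiskUse(true).build();
List<AggregationOperation> aggs = Arrays.asList(m1, p1, g1);
Aggregation aggregation = Aggregation.newAggregation(aggs).withOptions(options);
mongoTemplate.aggregate(aggregation, inputCollectionName, Document.class);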