My goal is to have my map-reduce jobs always run on the secondary nodes of the shards in a MongoDB cluster. To achieve this, I set the readPreference to secondary and set the MapReduce command's out parameter to inline. This works as expected on a non-sharded replica set: the job runs on the secondary. On a sharded cluster, however, the job runs on the primary.
Can someone explain why this happens, or point me to any relevant documentation? I could not find anything about this in the relevant documentation.
// Map and reduce functions, passed to mapReduce() as JavaScript source strings.
public static final String mapfunction = "function() { emit(this.custid, this.txnval); }";
public static final String reducefunction = "function(key, values) { return Array.sum(values); }";
...
private void mapReduce() {
    ...
    // No output collection is specified, so the driver runs the job with out: { inline: 1 }.
    MapReduceIterable<Document> iterable = collection.mapReduce(mapfunction, reducefunction);
    ...
}
...
// Client-level read preference: all reads should be routed to secondaries.
Builder options = MongoClientOptions.builder().readPreference(ReadPreference.secondary());
MongoClientURI uri = new MongoClientURI(MONGO_END_POINT, options);
MongoClient client = new MongoClient(uri);
...
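For reference, the same intent can also be expressed at the collection level rather than the client level. The snippet below is only a minimal, self-contained sketch of that variant; the connection string, database name (test) and collection name (txns) are assumptions based on the logs below, and I am not claiming it changes the routing behavior on a sharded cluster.

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.ReadPreference;
import com.mongodb.client.MapReduceIterable;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class MapReduceReadPrefSketch {
    public static void main(String[] args) {
        // Assumption: the endpoint points at the mongos (sharded) or the replica set.
        MongoClient client = new MongoClient(new MongoClientURI("mongodb://host:27017/test"));

        // Collection-level read preference instead of client-level.
        MongoCollection<Document> txns = client.getDatabase("test")
                .getCollection("txns")
                .withReadPreference(ReadPreference.secondary());

        // Same map/reduce job as above; no output collection, so results come back inline.
        MapReduceIterable<Document> result = txns.mapReduce(
                "function() { emit(this.custid, this.txnval); }",
                "function(key, values) { return Array.sum(values); }");

        for (Document d : result) {
            System.out.println(d.toJson());
        }
        client.close();
    }
}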
Log from the secondary when this is run against the (non-sharded) replica set:
2016-11-23T15:05:26.735+0000 I COMMAND [conn671] command test.txns command: mapReduce { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true } planSummary: COUNT keyUpdates:0 writeConflicts:0 numYields:7 reslen:4331 locks:{ Global: { acquireCount: { r: 44 } }, Database: { acquireCount: { r: 3, R: 19 } }, Collection: { acquireCount: { r: 3 } } } protocol:op_query 124ms
Sharded collection:
mongos> db.txns.getShardDistribution()
Shard Shard-0 at Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017
data : 498KiB docs : 9474 chunks : 3
estimated data per chunk : 166KiB
estimated docs per chunk : 3158
Shard Shard-1 at Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017
data : 80KiB docs : 1526 chunks : 3
estimated data per chunk : 26KiB
estimated docs per chunk : 508
Totals
data : 579KiB docs : 11000 chunks : 6
Shard Shard-0 contains 86.12% data, 86.12% docs in cluster, avg obj size on shard : 53B
Shard Shard-1 contains 13.87% data, 13.87% docs in cluster, avg obj size on shard : 53B
Log from the primary of Shard-0:
2016-11-24T08:46:30.828+0000 I COMMAND [conn357] command test.$cmd command: mapreduce.shardedfinish { mapreduce.shardedfinish: { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true, $queryOptions: { $readPreference: { mode: "secondary" } } }, inputDB: "test", shardedOutputCollection: "tmp.mrs.txns_1479977190_0", shards: { Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 123, timing: { mapTime: 51, emitLoop: 116, reduceTime: 9, mode: "mixed", total: 123 }, counts: { input: 9474, emit: 9474, reduce: 909, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } }, Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 71, timing: { mapTime: 8, emitLoop: 63, reduceTime: 4, mode: "mixed", total: 71 }, counts: { input: 1526, emit: 1526, reduce: 197, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } } }, shardCounts: { Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017: { input: 9474, emit: 9474, reduce: 909, output: 101 }, Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017: { input: 1526, emit: 1526, reduce: 197, output: 101 } }, counts: { emit: 11000, input: 11000, output: 202, reduce: 1106 } } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4368 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 115ms
2016-11-24T08:46:30.830+0000 I COMMAND [conn46] CMD: drop test.tmp.mrs.txns_1479977190_0
Any pointers on the expected behavior would be very helpful. Thanks.
Answer 0 (score: 1)
Since I did not get an answer here, I filed a JIRA bug with MongoDB and found that, as of now, it is not possible to run map-reduce jobs on the secondaries of a sharded MongoDB cluster. Here is the bug report.
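Not something the bug report prescribes, but as a possible workaround sketch: the same per-customer sum can be computed with the aggregation framework, which, for a read-only pipeline (no $out stage), is generally able to honor a secondary read preference on a sharded cluster. The field names and connection details below are assumptions carried over from the question.

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.ReadPreference;
import com.mongodb.client.AggregateIterable;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.util.Arrays;

public class AggregationAlternativeSketch {
    public static void main(String[] args) {
        // Assumption: connect through the mongos of the sharded cluster.
        MongoClient client = new MongoClient(new MongoClientURI("mongodb://mongos-host:27017/test"));

        MongoCollection<Document> txns = client.getDatabase("test")
                .getCollection("txns")
                .withReadPreference(ReadPreference.secondary());

        // Equivalent of the map-reduce job: sum txnval per custid.
        AggregateIterable<Document> totals = txns.aggregate(Arrays.asList(
                new Document("$group",
                        new Document("_id", "$custid")
                                .append("total", new Document("$sum", "$txnval")))));

        for (Document d : totals) {
            System.out.println(d.toJson());
        }
        client.close();
    }
}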