I need to pass all the collections of my MongoDB database as input to a Hadoop MR job. There is an approach that allows multiple inputs:
MultiCollectionSplitBuilder mcsb = new MultiCollectionSplitBuilder();
mcsb.add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
(MongoURI)null, // authuri
true, // notimeout
(DBObject)null, // fields
(DBObject)null, // sort
(DBObject)null, // query
false, // range query
MultiMongoCollectionSplitter.class)
.add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
(MongoURI)null, // authuri
true, // notimeout
(DBObject)null, // fields
(DBObject)null, // sort
new BasicDBObject("_id", new BasicDBObject("$gt", new Date(883440000000L))),
false, // range query
MultiMongoCollectionSplitter.class);
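But my database has 10 collections, and the approach above only takes 2 collections as arguments. All I need is to process every collection individually in the mapper method. My reducer is the same for all of them.
Any help is appreciated.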
Answer 0 (score: 0)
You can keep adding to the MultiCollectionSplitBuilder, with one add(...) call per collection:
MultiCollectionSplitBuilder mcsb = new MultiCollectionSplitBuilder();
mcsb
.add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
(MongoURI) null, // authuri
true, // notimeout
(DBObject) null, // fields
(DBObject) null, // sort
(DBObject) null, // query
false, // range query
MultiMongoCollectionSplitter.class
)
.add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
(MongoURI) null, // authuri
true, // notimeout
(DBObject) null, // fields
(DBObject) null, // sort
new BasicDBObject("_id", new BasicDBObject("$gt", new Date(883440000000L))),
false, // range query
MultiMongoCollectionSplitter.class
)
.add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
(MongoURI) null, // authuri
true, // notimeout
(DBObject) null, // fields
(DBObject) null, // sort
new BasicDBObject("_id", new BasicDBObject("$gt", new Date(883440000000L))),
false, // range query
MultiMongoCollectionSplitter.class
)
;
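Because each add(...) call returns the builder (which is what makes the chaining above possible), the calls can also be issued from a loop over collection names instead of being written out ten times. Below is a minimal sketch using only the JDK. The helper `buildInputUris` and the collection names are hypothetical, and the commented-out `mcsb.add(...)` call simply mirrors the arguments shown above rather than a verified signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectionInputUris {

    // Builds one "mongodb://host:port/db.collection" URI string per collection name.
    static List<String> buildInputUris(String server, String database, List<String> collections) {
        List<String> uris = new ArrayList<>();
        for (String collection : collections) {
            uris.add(server + "/" + database + "." + collection);
        }
        return uris;
    }

    public static void main(String[] args) {
        // Hypothetical collection names; replace them with the ten real ones.
        List<String> collections = Arrays.asList("collection_a", "collection_b", "collection_c");
        for (String uri : buildInputUris("mongodb://localhost:27017", "mongo_hadoop", collections)) {
            // With mongo-hadoop on the classpath, each URI would be handed to the builder
            // with the same arguments as in the answer above (assumed, not verified here):
            // mcsb.add(new MongoURI(uri), (MongoURI) null, true, (DBObject) null,
            //          (DBObject) null, (DBObject) null, false, MultiMongoCollectionSplitter.class);
            System.out.println(uri);
        }
    }
}
```

The loop keeps the job setup in one place, so adding an eleventh collection only means adding a name to the list.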