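In the code below, I am trying to pass the Mongo URI and database to ReadConfig through an options map, but it fails with an error saying the uri or database cannot be found.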
```java
public JavaMongoRDD<Document> getRDDFromDS(DataSourceInfo ds, String collectionName) {
    String mongoDBURI = "mongodb://"
            + PropertiesFileEncryptorUtil.decryptData(ds.getDbUsername()) + ":"
            + PropertiesFileEncryptorUtil.decryptData(ds.getDbPassword()) + "@"
            + ds.getHostName() + ":" + ds.getPort();
    Map<String, String> readOverrides = new HashMap<String, String>();
    readOverrides.put("uri", mongoDBURI);
    readOverrides.put("database", ds.getDbName());
    readOverrides.put("collection", collectionName);
    readOverrides.put("partitioner", mongoDBInputPartitioner);
    readOverrides.put("partitionKey", mongoDBPartitionKey);
    readOverrides.put("partitionSizeMB", mongoDBInputPartitionSize);
    ReadConfig readConf = ReadConfig.create(jsc).withOptions(readOverrides);
    JavaMongoRDD<Document> readRdd = MongoSpark.load(jsc, readConf);
    return readRdd;
}
```
What is the correct way to pass the uri and database? Thanks in advance.
Answer 0 (score: 0)
You can pass the configuration parameters to Spark through a SparkConf object:
val conf = new SparkConf()
  .setAppName("YourAppName")
  .setMaster("local[2]")
  .set("spark.executor.memory", "1g")
  .set("spark.app.id", "YourSparkId")
  .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/yourdatabase.yourInputcollection?readPreference=primaryPreferred")
  .set("spark.mongodb.output.uri", "mongodb://127.0.0.1/yourdatabase.yourOutputcollection")
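Since the question builds its URI from decrypted credentials in Java, here is a minimal sketch of assembling a URI in the same `mongodb://host/database.collection` shape used above. `MongoUriBuilder` and its methods are hypothetical names, not part of the connector, and the percent-encoding step is my own addition on the assumption that the username or password may contain characters such as `@` or `:`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper (not part of the connector) for assembling a MongoDB connection URI. */
final class MongoUriBuilder {

    private MongoUriBuilder() {}

    /** Percent-encodes a credential so characters such as '@' or ':' cannot break the URI. */
    static String encode(String value) {
        try {
            // URLEncoder targets HTML forms and writes spaces as '+'; switch those to %20.
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds mongodb://user:password@host:port/database.collection */
    static String inputUri(String user, String password, String host, int port,
                           String database, String collection) {
        return "mongodb://" + encode(user) + ":" + encode(password) + "@"
                + host + ":" + port + "/" + database + "." + collection;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(inputUri("app", "p@ss:word", "127.0.0.1", 27017,
                "yourdatabase", "yourInputcollection"));
    }
}
```

The resulting string could then be set as `spark.mongodb.input.uri` on the SparkConf, so the database and collection travel with the URI.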
After that, pass the configuration to the Spark context:
val sc = new SparkContext(conf)
val readConfig = ReadConfig(sc)
Then you can read the values from Mongo like this:
val rdd = sc.loadFromMongoDB( readConfig = readConfig )
and save them like this:
rdd.map( someMapFunction ).saveToMongoDB()
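For the Java code in the question, a likely cause of the error is that `ReadConfig.create(jsc)` resolves the uri and database from the Spark configuration before `withOptions` is applied, so it fails if `spark.mongodb.input.uri` is not set there. The sketch below only collects the settings into a map, using the unprefixed keys from the question. `ReadOptions` is a hypothetical helper, and the connector calls in the comments are an assumption to check against your connector version, not something the snippet executes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that gathers the read settings from the question into one map. */
final class ReadOptions {

    private ReadOptions() {}

    /** Returns the unprefixed option keys used in the question's readOverrides map. */
    static Map<String, String> of(String uri, String database, String collection) {
        Map<String, String> options = new HashMap<>();
        options.put("uri", uri);
        options.put("database", database);
        options.put("collection", collection);
        return options;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        Map<String, String> options =
                of("mongodb://127.0.0.1:27017", "yourdatabase", "yourInputcollection");

        // Assumption: with the connector on the classpath, the map could be passed
        // directly instead of going through the Spark context first, e.g.
        //   ReadConfig readConf = ReadConfig.create(options);
        //   JavaMongoRDD<Document> rdd = MongoSpark.load(jsc, readConf);
        System.out.println(options);
    }
}
```

Alternatively, keeping `ReadConfig.create(jsc).withOptions(...)` should work once `spark.mongodb.input.uri` is set on the SparkConf as shown earlier in this answer.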
I hope my answer helps.