Question

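I have enabled authorization at the database level (so a username and password are required to read the database, not to log in to mongodb), and I am trying to read data from a mongodb collection into an RDD using MongoSpark.load().

I specify the readConfig as: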

val readConfig = ReadConfig(Map("spark.mongodb.input.uri" ->
  "mongodb://<username>:<password>@<mongo-host>:<mongo-port = 27017>/<db-name>.<collection-name>",
  "spark.mongodb.input.partitioner" -> "MongoSplitVectorPartitioner"))

I load the RDD and filter it on a predefined column as follows:

val rdd = MongoSpark.load(sc, readConfig).withPipeline(Seq(Document.parse("{ $match: {\"isSatisfyingCondition\" :{$eq: true} } }"))).toDF()

val processedRdd = rdd.select("_id", "col1").collect()

Because of the collect() action, I get the following error on the processedRdd line:

User class threw exception: com.mongodb.MongoCommandException: Command failed with error 13: 'not authorized on <db-name> to execute command { splitVector: "<db-name>.<collection-name>",

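My hunch is that I am not explicitly passing the username and password to the partitioner, and that I need to do so for it to be able to access/read the database.

How can I pass the username and password so that the partitioner is allowed to operate on the database? Or am I looking in entirely the wrong place, and the cause of this error is something different? The user is not an admin. I can connect to the database from the command line with: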

mongo <db-name> --host <mongo-host>:<mongo-port> --username <username> --password <password>

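Update (problem still unresolved): When I remove the partitioner from the readConfig, there is no error. Can someone explain how specifying the partitioner in the readConfig is related to this error?

With the partitioner removed, the readConfig now reads: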

val readConfig = ReadConfig(Map("spark.mongodb.input.uri" ->
  "mongodb://<username>:<password>@<mongo-host>:<mongo-port = 27017>/<db-name>.<collection-name>"))
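
For comparison, here is a minimal sketch of a readConfig that still names a partitioner explicitly but picks one that is not built on the splitVector command. It assumes a connector version that ships MongoSamplePartitioner, and it assumes that MongoSplitVectorPartitioner issues splitVector itself, which the error message suggests and which would explain why a non-admin user with only read access is rejected. The name sampleReadConfig is only illustrative, and the placeholders in the URI are the same ones used above. I have not verified that this works against this deployment.

```scala
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.ReadConfig

// Assumption: MongoSamplePartitioner exists in the connector version in use
// and does not need the splitVector privilege that the error complains about.
val sampleReadConfig = ReadConfig(Map(
  "spark.mongodb.input.uri" ->
    "mongodb://<username>:<password>@<mongo-host>:<mongo-port = 27017>/<db-name>.<collection-name>",
  "spark.mongodb.input.partitioner" -> "MongoSamplePartitioner"))

// Loaded the same way as before; sc is the existing SparkContext.
val sampleRdd = MongoSpark.load(sc, sampleReadConfig)
```

If this variant runs without error 13, the credentials in the URI are presumably reaching the partitioner after all, and the missing piece would be a privilege for splitVector rather than the login itself.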

mongodb partitioning fails with database-level authorization: error "not authorized on <dbname> to execute command { splitVector:"

0 Answers: