Databricks / Spark to MongoDB: IllegalArgumentException: "requirement failed: Invalid uri"

Time: 2019-07-19 15:51:36

Tags: mongodb azure apache-spark azure-databricks

I am getting this error in Azure Databricks when I try to write a DataFrame to a MongoDB instance (not Atlas, but a stateless Kubernetes cluster in Azure that is reachable by IP).

I can reach my MongoDB through the mongo shell, and everything looks fine there.

I set my Spark cluster configuration to:

spark.mongodb.input.uri mongodb://<IP>:<Port>?replicaSet=MainRepSet
spark.mongodb.output.uri mongodb://<IP>:<Port>?replicaSet=MainRepSet
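
One thing that may be worth checking here, offered as an assumption and not a confirmed diagnosis: the MongoDB connection string format places a `/` between the host list and the `?options` part, and the URIs above have none. The sketch below uses only the Python standard library to flag that shape; `has_slash_before_options` is a hypothetical helper name and the address is a placeholder.

```python
from urllib.parse import urlsplit


def has_slash_before_options(uri):
    """Return True if the URI has no options, or has a '/' before '?options'."""
    parts = urlsplit(uri)
    if not parts.query:
        return True  # no options, so no slash is needed
    return parts.path.startswith("/")


# Placeholder address standing in for <IP>:<Port>.
as_configured = "mongodb://10.0.0.1:27017?replicaSet=MainRepSet"
with_slash = "mongodb://10.0.0.1:27017/?replicaSet=MainRepSet"

print(has_slash_before_options(as_configured))  # False
print(has_slash_before_options(with_slash))     # True
```

If the check returns False for the configured value, trying `mongodb://<IP>:<Port>/?replicaSet=MainRepSet` in both settings would be a cheap experiment.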

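I am using pyspark on Databricks 5.4 (includes Apache Spark 2.4.3, Scala 2.11), with MongoDB version 3.4.21 running on Kubernetes 1.12.8. In Databricks I installed mongodb.spark:mongo-spark-connector_2.11:2.3.1. My guess is that this has nothing to do with MongoDB itself and that a parameter is missing in the connector.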

My minimal application (run in an Azure Databricks notebook):

from pyspark.sql import SparkSession
my_spark = SparkSession.builder.appName("myApp").getOrCreate()

people = my_spark.createDataFrame([("Bilbo Baggins",  50), ("Gandalf", 1000), ("Thorin", 195), ("Balin", 178), ("Kili", 77),
   ("Dwalin", 169), ("Oin", 167), ("Gloin", 158), ("Fili", 82), ("Bombur", None)], ["name", "age"])

people.write.format("com.mongodb.spark.sql.DefaultSource").option("database", "test").option("collection", "test").mode("append").save()
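
To rule out the cluster-level settings, the connection string can also be supplied per write through the connector's `uri` output option, written with a `/` before the options as the MongoDB connection string format expects. This is a minimal sketch, not a confirmed fix: the host and port are placeholders, `mongo_write_options` is a hypothetical helper name, and the commented-out write call needs a live Spark session and a reachable MongoDB.

```python
def mongo_write_options(host, port, database, collection, replica_set=None):
    """Build the option dict for a DataFrameWriter (hypothetical helper)."""
    uri = "mongodb://{}:{}/".format(host, port)
    if replica_set:
        uri += "?replicaSet={}".format(replica_set)
    return {"uri": uri, "database": database, "collection": collection}


# Placeholder host and port; substitute the real <IP> and <Port>.
opts = mongo_write_options("10.0.0.1", 27017, "test", "test", "MainRepSet")
print(opts["uri"])  # mongodb://10.0.0.1:27017/?replicaSet=MainRepSet

# In the notebook, with the `people` DataFrame defined above:
# people.write.format("com.mongodb.spark.sql.DefaultSource") \
#     .options(**opts).mode("append").save()
```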

Full error message:

IllegalArgumentException                  Traceback (most recent call last)
<command-2936697920073069> in <module>()
----> 1 people.write.format("com.mongodb.spark.sql.DefaultSource").option("database", "test").option("collection", "test").mode("append").save()

/databricks/spark/python/pyspark/sql/readwriter.py in save(self, path, format, mode, partitionBy, **options)
    730             self.format(format)
    731         if path is None:
--> 732             self._jwrite.save()
    733         else:
    734             self._jwrite.save(path)

/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259         for temp_arg in temp_args:

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
     77                 raise QueryExecutionException(s.split(': ', 1)[1], stackTrace)
     78             if s.startswith('java.lang.IllegalArgumentException: '):
---> 79                 raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
     80             raise
     81     return deco

0 answers:

No answers yet.