Is it possible to connect Spark to Cosmos DB using the MongoDB API?

Asked: 2018-02-19 14:53:31

Tags: mongodb azure apache-spark azure-cosmosdb hdinsight

I am using Spark on HDInsight. Currently, I can read and write data directly against Cosmos DB (SQL API) from Spark, using:

spark.jars.packages     com.microsoft.azure:azure-cosmosdb-spark_2.1.0_2.11:1.0.0
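
For context, a minimal read/write sketch with that connector looks roughly like this (run from spark-shell, where spark is the SparkSession; the endpoint, key, database, and collection names are placeholders):

    import com.microsoft.azure.cosmosdb.spark.schema._
    import com.microsoft.azure.cosmosdb.spark.config.Config

    // Placeholder settings for the Cosmos DB SQL API account
    val config = Config(Map(
      "Endpoint"   -> "https://<account>.documents.azure.com:443/",
      "Masterkey"  -> "<account-key>",
      "Database"   -> "<database>",
      "Collection" -> "<collection>"
    ))

    // Read the collection into a DataFrame
    val df = spark.read.cosmosDB(config)
    df.show()

    // Write it back (append) through the same connector
    df.write.mode("append").cosmosDB(config)

This part works fine.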

What I cannot find is a way to connect Spark directly to Cosmos DB through the MongoDB API. I have tried the configuration above, as well as the MongoDB Connector for Spark, without success:

spark.jars.packages    org.mongodb.spark:mongo-spark-connector_2.11:2.1.0
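
Concretely, the read attempt that fails looks roughly like this (the URI is the usual Cosmos DB MongoDB-API connection string on port 10255 with SSL; all values are placeholders):

    import com.mongodb.spark.MongoSpark
    import com.mongodb.spark.config.ReadConfig

    // Placeholder Cosmos DB MongoDB-API connection settings
    val readConfig = ReadConfig(Map(
      "uri"        -> "mongodb://<account>:<account-key>@<account>.documents.azure.com:10255/?ssl=true",
      "database"   -> "<database>",
      "collection" -> "<collection>"
    ))

    // Loading is lazy; the exception below is thrown once an action
    // makes the executors open a cursor on the collection
    val df = MongoSpark.load(spark, readConfig)
    df.show()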

So, is it possible to connect Spark to a Cosmos DB account that uses the MongoDB API?

This is the error I get:


ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.bson.BsonInvalidOperationException: Document does not contain key cursor
    at org.bson.BsonDocument.throwIfKeyAbsent(BsonDocument.java:844)
    at org.bson.BsonDocument.getDocument(BsonDocument.java:135)
    at com.mongodb.operation.AggregateOperation.createQueryResult(AggregateOperation.java:359)
    at com.mongodb.operation.AggregateOperation.access$700(AggregateOperation.java:67)
    at com.mongodb.operation.AggregateOperation$3.apply(AggregateOperation.java:367)
    at com.mongodb.operation.AggregateOperation$3.apply(AggregateOperation.java:364)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:216)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:207)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:113)
    at com.mongodb.operation.AggregateOperation$1.call(AggregateOperation.java:257)
    at com.mongodb.operation.AggregateOperation$1.call(AggregateOperation.java:253)
    at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:431)
    at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:404)
    at com.mongodb.operation.AggregateOperation.execute(AggregateOperation.java:253)
    at com.mongodb.operation.AggregateOperation.execute(AggregateOperation.java:67)
    at com.mongodb.Mongo.execute(Mongo.java:836)
    at com.mongodb.Mongo$2.execute(Mongo.java:823)
    at com.mongodb.OperationIterable.iterator(OperationIterable.java:47)
    at com.mongodb.AggregateIterableImpl.iterator(AggregateIterableImpl.java:123)
    at com.mongodb.spark.rdd.MongoRDD.getCursor(MongoRDD.scala:167)
    at com.mongodb.spark.rdd.MongoRDD.compute(MongoRDD.scala:142)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

0 Answers:

No answers yet.