I have a simple Structured Streaming application whose output sink should be Cosmos DB. When I call the writeStream method, the following error pops up. The library version added to the cluster is:
com.microsoft.azure:azure-cosmosdb-spark_2.4.0_2.11:1.4.1, type: Maven
My code is as follows:
val outstream = staticInputDF
.writeStream
.format(classOf[CosmosDBSinkProvider].getName)
.options(config)
.start
.awaitTermination
This results in the error:
command-751666472135258:74: error: overloaded method value options with alternatives:
  (options: java.util.Map[String,String])org.apache.spark.sql.streaming.DataStreamWriter[org.apache.spark.sql.Row]
  (options: scala.collection.Map[String,String])org.apache.spark.sql.streaming.DataStreamWriter[org.apache.spark.sql.Row]
 cannot be applied to (com.microsoft.azure.cosmosdb.spark.config.Config)
How can I write a streaming DataFrame to a Cosmos DB collection?
Answer 0 (score: 0)
The following code shows how to write a streaming DataFrame to Cosmos DB.
// Write configuration as a plain Map[String, String]: DataStreamWriter.options
// accepts only a Map, not the connector's Config type (which is what triggered
// the overload error above)
val writeConfig = Map(
  "Endpoint" -> "https://doctorwho.documents.azure.com:443/",
  "Masterkey" -> "YOUR-KEY-HERE",
  "Database" -> "DepartureDelays",
  "Collection" -> "flights_fromsea",
  "Upsert" -> "true",
  "WritingBatchSize" -> "500",
  "CheckpointLocation" -> "/checkpointlocation_write1"
)

// Write to Cosmos DB from the flights DataFrame
df
  .writeStream
  .format(classOf[CosmosDBSinkProvider].getName)
  .options(writeConfig)
  .start()
Reference: Azure Databricks Spark Connector
Hope this helps.
Answer 1 (score: 0)
The error indicates that config is of type com.microsoft.azure.cosmosdb.spark.config.Config, but .options(config) can only be used with a java.util.Map[String,String] or a scala.collection.Map[String,String].
Check out the Stream data from Kafka to Cosmos DB notebook, where the following Map is used:
val configMap = Map(
"Endpoint" -> "YOUR_COSMOSDB_ENDPOINT",
"Masterkey" -> "YOUR_MASTER_KEY",
"Database" -> "kafkadata",
// use a ';' to delimit multiple regions
"PreferredRegions" -> "West US;",
"Collection" -> "kafkacollection"
)
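In other words, the fix is to pass the plain Map to .options instead of wrapping it in Config. The overload mismatch can be reproduced without Spark using simplified stand-in types (Config and Writer below are hypothetical illustrations, not the real connector or Spark classes):

```scala
// Hypothetical stand-ins for the connector's Config and Spark's DataStreamWriter,
// for illustration only.
case class Config(properties: Map[String, String])

class Writer {
  // Like DataStreamWriter.options: accepts only a Map[String, String]
  def options(opts: Map[String, String]): Writer = {
    println(s"accepted ${opts.size} options")
    this
  }
}

object OverloadDemo extends App {
  val configMap = Map(
    "Endpoint"   -> "YOUR_COSMOSDB_ENDPOINT",
    "Masterkey"  -> "YOUR_MASTER_KEY",
    "Database"   -> "kafkadata",
    "Collection" -> "kafkacollection"
  )

  // new Writer().options(Config(configMap)) // does not compile: Config is not a Map
  new Writer().options(configMap)            // compiles: Map[String, String] matches
}
```

The same principle applies to the real API: keep the raw Map for .options on the streaming writer, and construct a Config from it only where the connector's own methods expect one.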