Problem writing a file from Spark to Cosmos DB

Date: 2018-12-05 06:05:18

Tags: azure apache-spark azure-cosmosdb

I am currently learning Cosmos DB in an Azure environment. I am trying to set up a connection to Cosmos DB so that I can write a JSON file from Spark to Cosmos DB.


import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark._
import com.microsoft.azure.cosmosdb.spark.config.Config

val b = spark.read.option("multiline", "true").json("wasb://hdi-2018-12-04t03-00-20-107z@storage.blob.core.windows.net/hdp/file.json")
val c = b.registerTempTable("sathya")
val d = spark.sqlContext.sql("select * from sathya")

val writeConfigMap = Map(
  "Endpoint" -> "https://testy.documents.azure.com:443/",
  "Masterkey" -> "pKIrXH4coeqJYdloN9tKlOZkGa3arbj7SpwR7V9ryNxjOUNU08Ne0rEp6LXsamEz0YF7ew==",
  "Database" -> "newdbcosmos",
  "Collection" -> "newcollcosmos",
  "preferredRegions" -> "US East",
  "SamplingRatio" -> "1.0",
  "schema_samplesize" -> "200000"
)

Error when writing to Cosmos DB:

scala> d.write.cosmosDB(writeConfigMap)
<console>:41: error: type mismatch;
 found   : scala.collection.immutable.Map[String,String]
 required: com.microsoft.azure.cosmosdb.spark.config.Config
       d.write.cosmosDB(writeConfigMap)

I have read the documentation available online and uploaded the uber JAR for the Cosmos DB connector. Has anyone run into this problem and can share a solution?

Thanks, Sathya

1 Answer:

Answer 0 (score: 0)

The connector's write method expects a `Config` object, not a plain `Map`. Wrap your settings with the `Config(...)` factory and pass that to the writer:

import com.microsoft.azure.cosmosdb.spark.config.Config

val cosmosConfig = Config(Map(
  "Endpoint" -> "https://testy.documents.azure.com:443/",
  "Masterkey" -> "pKIrXH4coeqJYdloN9tKlOZkGa3arbj7SpwR7V9ryNxjOUNU08Ne0rEp6LXsamEz0YF7ew==",
  "Database" -> "newdbcosmos",
  "Collection" -> "newcollcosmos",
  "preferredRegions" -> "US East",
  "SamplingRatio" -> "1.0",
  "schema_samplesize" -> "200000"))

// Pass the Config (not the raw Map) to the writer:
d.write.cosmosDB(cosmosConfig)
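To see why the compiler rejects the raw `Map`, here is a minimal, self-contained sketch of the pattern involved. `SinkConfig` and `describeSink` are hypothetical stand-ins (not part of the connector): the connector's `Config` is a distinct class whose companion object builds it from a `Map`, so a method typed against `Config` cannot accept a bare `Map[String, String]`.

```scala
// Hypothetical stand-in for the connector's Config type, shown only to
// illustrate the type mismatch: the sink API is declared against a
// wrapper class, which is not interchangeable with Map[String, String].
final class SinkConfig private (entries: Map[String, String]) {
  // Keys were normalized at construction, so lookups are case-insensitive.
  def get(key: String): Option[String] = entries.get(key.toLowerCase)
}

object SinkConfig {
  // Companion apply lets callers write SinkConfig(Map(...)),
  // mirroring the connector's Config(Map(...)) factory.
  def apply(raw: Map[String, String]): SinkConfig =
    new SinkConfig(raw.map { case (k, v) => (k.toLowerCase, v) })
}

// A sink method typed against SinkConfig rejects a bare Map at compile time.
def describeSink(config: SinkConfig): String =
  s"${config.get("database").getOrElse("?")}/${config.get("collection").getOrElse("?")}"

val raw = Map("Database" -> "newdbcosmos", "Collection" -> "newcollcosmos")
// describeSink(raw)                     // would not compile: found Map, required SinkConfig
println(describeSink(SinkConfig(raw)))   // prints "newdbcosmos/newcollcosmos"
```

The same wrapping step, `Config(writeConfigMap)`, is exactly what resolves the `found: Map / required: Config` error in the question.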