Spark DataFrame saved to MongoDB in the wrong format

Date: 2016-09-08 11:27:03

Tags: mongodb scala apache-spark spark-dataframe

I am using Spark-MongoDB and am trying to save a DataFrame to MongoDB:

val event = """{"Dev":[{"a":3},{"b":3}],"hr":[{"a":6}]}"""
val events = sc.parallelize(event :: Nil)
val df = sqlc.read.json(events)
val saveConfig = MongodbConfigBuilder(Map(
  Host -> List("localhost:27017"), Database -> "test", Collection -> "test",
  SamplingRatio -> 1.0, WriteConcern -> "normal",
  SplitSize -> 8, SplitKey -> "_id"))
df.saveToMongodb(saveConfig.build)

I expected the data to be saved in the shape of the input string, but what is actually saved is:


{"_id": ObjectId("57cedf4bd244c56e8e783a45"), "Dev": [{"a": NumberLong(3), "b": null}, {"a": null, "b": NumberLong(3)}], "hr": [{"a": NumberLong(6)}]}

I want to avoid those null values and the duplicated keys in every array element. Any ideas?
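For context, the nulls come from Spark itself rather than from the MongoDB writer: `sqlc.read.json` infers a single schema per column, so all elements of the `Dev` array get the union of the fields `a` and `b`, with `null` filling the missing ones. The behavior can be sketched in plain Scala (no Spark required; `SchemaMergeSketch`, `mergedKeys`, and `pad` are illustrative names, not Spark APIs):

```scala
// Sketch of Spark's JSON schema inference for array elements:
// the inferred element type is the union of all keys seen across the
// array, and each element is padded with null (here: None) for the
// keys it lacks -- which is exactly the document observed above.
object SchemaMergeSketch {
  type Row = Map[String, Option[Long]]

  // The merged "schema": the union of all keys across the elements.
  def mergedKeys(rows: Seq[Map[String, Long]]): Set[String] =
    rows.flatMap(_.keys).toSet

  // Pad every element to the merged schema, filling absent keys with None.
  def pad(rows: Seq[Map[String, Long]]): Seq[Row] = {
    val keys = mergedKeys(rows)
    rows.map(r => keys.map(k => k -> r.get(k)).toMap)
  }

  def main(args: Array[String]): Unit = {
    // The "Dev" array from the question: [{"a":3},{"b":3}]
    val dev = Seq(Map("a" -> 3L), Map("b" -> 3L))
    // Each element now carries both keys, one of them empty.
    println(pad(dev))
  }
}
```

Under this reading, escaping the input string would not change anything; avoiding the nulls would require the array elements to share the same set of keys (or a different write path that bypasses DataFrame schema inference).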

1 Answer:

Answer 0 (score: 0)

Have you tried defining the event with backslash-escaped quotes:

val event = "{\"Dev\":[{\"a\":3},{\"b\":3}],\"hr\":[{\"a\":6}]}"