我有一个变量,其构造如下使用Spark SQL提取数据:
{
"resourceType" : "Test1",
"count" : 10,
"entry": [{
"id": "112",
"gender": "female",
"birthDate": 1213999
}, {
"id": "urn:uuid:002e27cf-3cae-4393-89c5-1b78050d9428",
"resourceType": "Encounter"
}]
}
我希望输出格式如下:
{
"resourceType" : "Test1",
"count" : 10,
"entry" :[
"resource" :{
"id": "112",
"gender": "female",
"birthDate": 1213999
},
"resource" :{
"id": "urn:uuid:002e27cf-3cae-4393-89c5-1b78050d9428",
"resourceType": "Encounter"
}]
}
我基本上是Scala的新手:),需要帮助。
编辑:添加scala代码以创建JSON:
val bundle = endresult.groupBy("id").agg(count("*") as "total",collect_list("resource") as "entry").
withColumn("resourceType", lit("Bundle")).
drop("id").
select(to_json(struct("resourceType","entry"))).
map(row => row.getString(0).
replace("\"entry\":[\"{", "\"entry\":[{").
replace("}\"]}","}]}"). // Should be at the end of the string ONLY (we might switch to regex instead
replace("}\",\"{","},{")
replace("\\\"", "\"")
)