How to convert JSON to an array

Time: 2018-04-04 10:07:00

Tags: scala apache-spark

My input looks like this.

val inputJson = """[{"color": "red","value": "#f00"},{"color": "blue","value": "#00f"}]"""

I need to convert the JSON value into arrays. My output should look like this.

val colorval = Array("red", "blue")
val value = Array("#f00", "#00f")

Please help.

2 answers:

Answer 0 (score: 1)

If you have a large dataset, the following solution may help.

// input data (presumably large in practice)
val inputJson = """[{"color": "red","value": "#f00"},{"color": "blue","value": "#00f"}]"""

// read the JSON data into a DataFrame
val df = sqlContext.read.json(sc.parallelize(inputJson :: Nil))

// collect each column into an array with the built-in collect_list function
import org.apache.spark.sql.functions.collect_list
val result = df.select(collect_list("color").as("colorVal"), collect_list("value").as("value"))
result.show(false)
result.printSchema()

You should get:

+-----------+------------+
|colorVal   |value       |
+-----------+------------+
|[red, blue]|[#f00, #00f]|
+-----------+------------+

root
 |-- colorVal: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- value: array (nullable = true)
 |    |-- element: string (containsNull = true)
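
The question asks for plain Scala arrays rather than a DataFrame. A minimal sketch of pulling them out of the aggregated row follows, assuming Spark 2.2 or later with a SparkSession. The local master, the app name, and the names spark and row are illustrative and not from the answer above. Note that collect_list does not guarantee element order once the data has been shuffled across partitions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_list

// Assumption: a local SparkSession; in spark-shell, getOrCreate() reuses the existing session.
val spark = SparkSession.builder().master("local[*]").appName("json-to-arrays").getOrCreate()
import spark.implicits._

val inputJson = """[{"color": "red","value": "#f00"},{"color": "blue","value": "#00f"}]"""

// Each object of the top-level JSON array becomes one row.
val df = spark.read.json(Seq(inputJson).toDS())

// Aggregate both columns into a single row holding two array columns.
val row = df.select(collect_list("color").as("colorVal"), collect_list("value").as("value")).head()

// Bring the array columns back to the driver as ordinary Scala arrays.
val colorval: Array[String] = row.getSeq[String](0).toArray
val value: Array[String] = row.getSeq[String](1).toArray

println(colorval.mkString(", "))
println(value.mkString(", "))
```

Pulling the result to the driver this way only makes sense when the collected arrays fit in driver memory.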

Answer 1 (score: 0)

Create a DataFrame from the JSON and explode it. Then use collect_list() or collect_set(), depending on whether you want to keep duplicates.
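
One possible reading of this approach is sketched below, assuming Spark 2.2 or later. Because spark.read.json already yields one row per array element, the sketch instead keeps the raw string in a column, parses it with from_json against a hand-written schema, and then explodes the resulting array. The schema, the column names json and item, and the SparkSession setup are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, collect_list, collect_set, explode, from_json}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Assumption: a local SparkSession; in spark-shell, getOrCreate() reuses the existing session.
val spark = SparkSession.builder().master("local[*]").appName("json-explode").getOrCreate()
import spark.implicits._

val inputJson = """[{"color": "red","value": "#f00"},{"color": "blue","value": "#00f"}]"""

// Assumed schema: an array of objects with string fields color and value.
val schema = ArrayType(StructType(Seq(
  StructField("color", StringType),
  StructField("value", StringType))))

// Parse the JSON string into an array column, then explode it to one row per element.
val parsed = Seq(inputJson).toDF("json").select(explode(from_json(col("json"), schema)).as("item"))
val exploded = parsed.select(col("item.color").as("color"), col("item.value").as("value"))

// collect_list keeps duplicates; collect_set drops them.
val withDuplicates = exploded.select(collect_list("color").as("colorVal"), collect_list("value").as("value"))
val withoutDuplicates = exploded.select(collect_set("color").as("colorVal"), collect_set("value").as("value"))

withDuplicates.show(false)
withoutDuplicates.show(false)
```

With the sample input both variants hold the same elements, since no color or value repeats; they differ only once the data contains duplicates.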