Getting JSON data through Spark SQL

Time: 2017-02-08 10:10:46

Tags: json apache-spark apache-spark-sql spark-dataframe

When I try to fetch nested JSON data with a Spark SQL query:

SparkContext sc = new SparkContext(new SparkConf().setAppName("sql").setMaster("local"));
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read().json("path_to_s3_bucket").cache();
df.registerTempTable("table_name");
DataFrame d = sqlContext.sql("SELECT address.state AS state FROM table_name");

I get the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from address

My JSON data looks like this:

"address":{"city":"xyz","state":"abc","country":"pqr"}

Please help me resolve this problem.

1 answer:

Answer 0 (score: 0)

Your JSON is invalid: the fragment is not enclosed in an outer object. It should be:

{"address":{"city":"xyz","state":"abc","country":"pqr"}}
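If you cannot regenerate the input files, one possible workaround is to wrap each bare fragment in braces before handing it to Spark. The following is only a minimal sketch using plain Java, with no Spark dependency. The class name `JsonLineFix` and the method `wrapAsObject` are hypothetical names made up for this illustration, and the check is deliberately naive: it assumes each line holds exactly one `"key":value` fragment like the one in the question.

```java
// Hypothetical helper for illustration only; not part of Spark or any library.
final class JsonLineFix {
    private JsonLineFix() {}

    // Wraps a bare "key":value fragment in braces so the line becomes a
    // complete JSON object. Lines that already start with '{' and end with '}'
    // are returned trimmed but otherwise unchanged.
    // Assumption: one fragment per line; this does not validate the JSON itself.
    static String wrapAsObject(String line) {
        String trimmed = line.trim();
        if (trimmed.startsWith("{") && trimmed.endsWith("}")) {
            return trimmed;
        }
        return "{" + trimmed + "}";
    }

    public static void main(String[] args) {
        // The fragment from the question, without the outer braces.
        String fragment = "\"address\":{\"city\":\"xyz\",\"state\":\"abc\",\"country\":\"pqr\"}";
        // prints {"address":{"city":"xyz","state":"abc","country":"pqr"}}
        System.out.println(wrapAsObject(fragment));
    }
}
```

After the lines are rewritten this way, each one is a self-contained JSON object, which is the shape the answer above says the data should have.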