I am trying to write a DataFrame out as JSON data. Here is my sample data:
import org.apache.spark.sql._
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions.lit
val df = Seq((2012, 8, "Batman", 9.8), (2012, 8, "Hero", 8.7), (2012, 7, "Robot", 5.5), (2011, 7, "Git", 2.0)).toDF("year", "month", "title", "rating")
I am converting the data to JSON objects:
import org.apache.spark.sql.functions._
val finalJsonDF = df.select(to_json(struct("year", "month", "title", "rating"))).as("test")
I can view the data, but its structure (the column name) is:
structstojson(named_struct(NamePlaceholder(), year, NamePlaceholder(), month, NamePlaceholder(), title, NamePlaceholder(), rating))
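A possible explanation for that generated name, offered as a sketch rather than a confirmed diagnosis: in the snippet above, `.as("test")` sits outside the `select(...)` call, so it aliases the resulting Dataset instead of the JSON column. The version below moves the alias onto the column. The local `SparkSession` and the `aliased` name are assumptions for illustration (in spark-shell, `spark` already exists).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{struct, to_json}

// Assumption: a local session for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("alias-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq((2012, 8, "Batman", 9.8), (2011, 7, "Git", 2.0))
  .toDF("year", "month", "title", "rating")

// The alias is applied to the Column returned by to_json,
// inside select(...), not to the Dataset returned by select.
val aliased = df.select(
  to_json(struct("year", "month", "title", "rating")).as("test")
)

// The single column should now be named "test".
aliased.printSchema()
```

Note that `test` is still a string column here, since `to_json` serializes the struct into JSON text.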
Now I am trying to create a table and load the DataFrame into it:
finalJsonDF.show()
finalJsonDF.write.json("s3://tmp/test")
spark.sql(""" drop table if exists test.sample_test""")
spark.sql("""create external table test.sample_test
(
test struct<year:String,
month:String,
title:String,
rating:String>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as TextFile
location "s3://tmp/test"
""")
spark.sql(""" describe test.sample_test""").show()
spark.sql(""" select * from test.sample_test""").show()
I can only see empty rows.
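One thing that may be worth checking, as a hedged sketch and not a verified fix: the table declares `test` as a struct, while `to_json` produces a string, so the files written by `write.json` would hold escaped JSON text under whatever name the column ended up with. Writing the struct column itself, aliased as `test`, makes each output line a nested JSON object under the key `test`. The local session, the `nested` name, and the output path below are assumptions for illustration. Whether the Hive SerDe then maps the numeric values onto the `String` fields is not something this sketch verifies.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct

// Assumption: a local session for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("nested-json-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq((2012, 8, "Batman", 9.8), (2012, 8, "Hero", 8.7),
             (2012, 7, "Robot", 5.5), (2011, 7, "Git", 2.0))
  .toDF("year", "month", "title", "rating")

// Keep `test` as a struct column (no to_json), so write.json
// serializes it as a nested object rather than a quoted string.
val nested = df.select(struct("year", "month", "title", "rating").as("test"))

// Hypothetical local path; the question writes to an S3 location instead.
nested.write.mode("overwrite").json("/tmp/nested_json_sketch")

nested.printSchema()
```

Inspecting one of the written part files is a quick way to confirm that the key and nesting match what the table definition expects.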