我想在spark 2.0(scala)中解析json文件。接下来我想在Hive表中保存数据.. 我如何使用scala解析json文件? json文件示例)metadata.json:
{
"syslog": {
"month": "Sep",
"day": "26",
"time": "23:03:44",
"host": "cdpcapital.onmicrosoft.com"
},
"prefix": {
"cef_version": "CEF:0",
"device_vendor": "Microsoft",
"device_product": "SharePoint Online",
},
"extensions": {
"eventId": "7808891",
"msg": "ManagedSyncClientAllowed",
"art": "1506467022378",
"cat": "SharePoint",
"act": "ManagedSyncClientAllowed",
"rt": "1506466717000",
"requestClientApplication": "Microsoft SkyDriveSync",
"cs1": "0bdbe027-8f50-4ec3-843f-e27c41a63957",
"cs1Label": "Organization ID",
"cs2Label": "Modified Properties",
"ahost": "cdpdiclog101.cgimss.com",
"agentZoneURI": "/All Zones",
"amac": "F0-1F-AF-DA-8F-1B",
"av": "7.6.0.8009.0",
}
},
感谢
答案 0 :(得分:0)
您可以使用以下内容:
val jsonDf = sparkSession
.read
//.option("wholeFile", true) if its not a Single Line JSON
.json("resources/json/metadata.json")
jsonDf.printSchema()
jsonDf.registerTempTable("metadata")
有关此https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
的更多详细信息