JSON file:
{
"DocId":"ABC",
"User":{
"Id":1234,
"Username":"sam1234",
"Name":"Sam",
"ShippingAddress":{
"Address1":"123 Main St.",
"Address2":null,
"City":"Durham",
"State":"NC"
},
"Orders":[{
"ItemId":6789,
"OrderDate":"11/11/2012"
},
{
"ItemId":4352,
"OrderDate":"12/12/2012"
}
]
}
}
Schema:
create external table sample_json(DocId string,User struct<Id:int,Username:string,Name:string,ShippingAddress:struct<Address1:string,Address2:string,City:string,State:string>,Orders:array<struct<ItemId:int,OrderDate:string>>>)ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe' location '/user/babu/sample_json';
- Loading the data into Hive
load data inpath '/user/samplejson/samplejson.json' into table sample_json;
Error:
When I run a select query like this:
select * from sample_json;
Exception:
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.StringReader@8c3770; line: 1, column: 0]) at [Source: java.io.StringReader@8c3770; line: 1, column: 3]
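Answer 0: (score: 0)
First, make sure the JSON file is valid by checking it at http://jsonlint.com. Then remove any newlines and unnecessary whitespace from the JSON file before loading it into the Hive table. If you have already loaded a JSON file containing newlines into the table, drop the table and create a new one.
Here is input you can try: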
{"DocId":"ABC",
"User":{"Id":1234,
"Username":"sam1234",
"Name":"Sam",
"ShippingAddress":{"Address1":"123 Main St.","Address2":null,"City":"Durham","State":"NC"},
"Orders":[{"ItemId":6789,"OrderDate":"11/11/2012"},
{"ItemId":4352,"OrderDate":"12/12/2012"}
]
}
}
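The validation and newline removal described in this answer can also be scripted instead of done by hand. Below is a minimal sketch in Python using only the standard `json` module; the function name `to_single_line` and the shortened sample document are hypothetical, chosen for illustration. It assumes the SerDe treats each text line as a separate record, which is consistent with the parse error being reported at line 1.

```python
import json


def to_single_line(pretty_text: str) -> str:
    """Validate a JSON document and return it as one compact line."""
    # json.loads raises json.JSONDecodeError on malformed input,
    # for example a stray trailing brace.
    doc = json.loads(pretty_text)
    # separators=(",", ":") drops the spaces json.dumps adds by default,
    # and the result contains no literal newlines.
    return json.dumps(doc, separators=(",", ":"))


if __name__ == "__main__":
    # Shortened, hypothetical version of the document from the question.
    sample = """
    {
      "DocId": "ABC",
      "User": {
        "Id": 1234,
        "Orders": [
          {"ItemId": 6789, "OrderDate": "11/11/2012"}
        ]
      }
    }
    """
    print(to_single_line(sample))
```

Writing the returned string to a file, one document per line, gives a file in which every line is a complete JSON object.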
Answer 1: (score: 0)
{"DocId":"ABC","Userdetails":{"Id":1234,"Username":"sam1234","Name":"Sam","ShippingAddress":{"Address1":"123 Main St.","Address2":null,"City":"Durham","State":"NC"},"Orders":[{"ItemId":6789,"OrderDate":"11/11/2012"},{"ItemId":4352,"OrderDate":"12/12/2012"}]}}
Here are the commands:
create external table sample_json(DocId string,Userdetails struct
select * from sample_json;
OK
sample_json.docid sample_json.userdetails
ABC {"id":1234,"username":"sam1234","name":"Sam","shippingaddress":{"address1":"123 Main St.","address2":null,"city":"Durham","state":"NC"},"orders":[{"itemid":6789,"orderdate":"11/11/2012"},{"itemid":4352,"orderdate":"12/12/2012"}]}
Time taken: 0.106 seconds, Fetched: 1 row(s)