I can create a Hive table using the JSON SerDe org.openx.data.jsonserde.JsonSerDe,
but when I try to read data from the table, the query fails.
hive> create table emp (EmpId int , EmpFirstName string , EmpLastName string) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
OK
Time taken: 2.148 seconds
hive> LOAD DATA INPATH '/user/cloudera/EmpData/emp.json' INTO table emp;
Loading data to table employee.emp
chgrp: changing ownership of 'hdfs://quickstart.cloudera:8020/user/hive/warehouse/employee.db/emp/emp.json': User does not belong to supergroup
Table employee.emp stats: [numFiles=1, totalSize=4163]
OK
Time taken: 1.141 seconds
hive> select * from emp;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]
Time taken: 0.504 seconds
Answer 0 (score: 1)
Error: Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]
Check whether the JSON in /user/cloudera/EmpData/emp.json is valid.
You can have the SerDe skip malformed rows with
ALTER TABLE emp SET SERDEPROPERTIES ( "ignore.malformed.json" = "true");
See this link -> https://github.com/rcongiu/Hive-JSON-Serde
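One way to do that check is to test each line of the file on its own, since the error message refers to a single row. The following is a minimal sketch, not part of the SerDe: the function name `find_malformed_lines` and the sample lines are made up for illustration, and skipping blank lines is an assumption of this sketch rather than documented SerDe behavior.

```python
import json


def find_malformed_lines(lines):
    """Return (line_number, reason) for every line that is not one JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # assumption of this sketch: blank lines are ignored
        try:
            value = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
            continue
        if not isinstance(value, dict):
            # Parses as JSON, but it is an array or scalar, not an object
            problems.append((number, "not a JSON object"))
    return problems


# Hypothetical sample: one good row, one fragment of a pretty-printed file,
# and one bare array.
sample = [
    '{"EmpId":1,"EmpFirstName":"Hannah","EmpLastName":"Walton"}',
    '{',
    '[1, "Hannah", "Walton"]',
]

for number, reason in find_malformed_lines(sample):
    print(f"line {number}: {reason}")
```

To run it against a local copy of the file, pass an open file handle instead of `sample`, for example `find_malformed_lines(open("emp.json"))`.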
Edit: this JSON is not valid input for this table:
{ "cols": [ "EmpId", "EmpFirstName", "EmpLastName" ], "data": [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ], [ 3, "Camden", "Kidd" ], [ 4, "Illiana", "Collier" ] ] }
The JSON you provided is a single object with
key: cols and value: [ "EmpId", "EmpFirstName", "EmpLastName" ]
and
key: data and value: [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ], [ 3, "Camden", "Kidd" ], [ 4, "Illiana", "Collier" ] ]
The JSON should instead have one object per line, with keys matching the table's column names, like
{"EmpId":1,"EmpFirstName":"Hannah","EmpLastName":"Walton"}
{"EmpId":2,"EmpFirstName":"Barrett","EmpLastName":"Mendoza"}
{"EmpId":3,"EmpFirstName":"Camden","EmpLastName":"Kidd"}
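If the file is in the cols/data layout quoted above, it can be reshaped into one object per line before loading it into Hive. This is a minimal sketch under that assumption; the function name `to_json_lines` is hypothetical, and the input is the sample data from this answer embedded as a string rather than read from HDFS.

```python
import json


def to_json_lines(document):
    """Turn {"cols": [...], "data": [[...], ...]} into one JSON object string per row."""
    cols = document["cols"]
    lines = []
    for row in document["data"]:
        if len(row) != len(cols):
            raise ValueError(f"row {row!r} does not match columns {cols!r}")
        # Pair each column name with its value, then serialize compactly
        record = dict(zip(cols, row))
        lines.append(json.dumps(record, separators=(",", ":")))
    return lines


source = json.loads("""
{ "cols": [ "EmpId", "EmpFirstName", "EmpLastName" ],
  "data": [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ],
            [ 3, "Camden", "Kidd" ], [ 4, "Illiana", "Collier" ] ] }
""")

for line in to_json_lines(source):
    print(line)
```

Writing the printed lines to a new file and loading that file with LOAD DATA INPATH gives the SerDe one complete JSON object per row.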