Trying to load a JSON file into Hive

Time: 2017-01-18 02:33:24

Tags: hadoop hive hiveql hive-serde

The data looks like this:

  

{"custId":1185972,"movieId":null,"genreId":null,"time":"2012-07-01:00:00:07","recommended":null,"activity":8}

The query I am running is:

add jar /home/student/hive-0.11.0-bin/lib/json-serde-1.3.7-jar-with-dependencies.jar;

CREATE EXTERNAL TABLE movie_json 
( custId INT, movieId INT, genreId INT, 
time STRING, recommended STRING, activity INT, rating INT, price FLOAT ) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
LOCATION '/user/oracle/movie/';

The error I get is:

  

java.lang.NoSuchFieldError: byteTypeInfo
    at org.openx.data.jsonserde.objectinspector.primitive.TypeEntryShim.(TypeEntryShim.java:27)
    at org.openx.data.jsonserde.objectinspector.primitive.JavaStringJsonObjectInspector.(JavaStringJsonObjectInspector.java:14)
    at org.openx.data.jsonserde.objectinspector.JsonObjectInspectorFactory.(JsonObjectInspectorFactory.java:196)
    at org.openx.data.jsonserde.JsonSerDe.initialize(JsonSerDe.java:125)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:215)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:268)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:261)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:587)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:576)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3776)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:256)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask

I have tried several JsonSerDe jars but get the same error every time. Please help.

1 Answer:

Answer 0: (score: 0)

I'm not sure which JsonSerDe you are using. You can use this one:

Hive-JSON-Serde

hive> add jar /User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
Added [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar]
hive> CREATE EXTERNAL TABLE movie_json ( custId INT, movieId INT, genreId INT, time STRING, recommended STRING, activity INT, rating INT, price FLOAT ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/oracle/movie/';
OK
Time taken: 0.097 seconds

You can build the jar with:

C:\Users\User1\Downloads\Hive-JSON-Serde-develop\Hive-JSON-Serde-develop>mvn -Phdp23 clean package

-Phdp23 selects the HDP 2.3 build profile; replace it with the profile that matches your Hadoop version.

You can also use Hive's built-in JSON functions, get_json_object and json_tuple. If you are looking for an example that uses them, see this Example
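As a rough sketch of the built-in route, the queries below read each file line as a plain string and pull fields out with get_json_object and json_tuple, so no external SerDe jar is needed. The table name movie_raw and the output aliases (cust_id, movie_id, event_time) are hypothetical; the location and JSON keys are taken from the question. Both functions return strings, so cast the results if you need numeric types.

```sql
-- Hypothetical staging table: one STRING column holding each raw JSON line.
CREATE EXTERNAL TABLE IF NOT EXISTS movie_raw (line STRING)
LOCATION '/user/oracle/movie/';

-- get_json_object extracts one value per call, addressed by a JSONPath-like expression.
SELECT get_json_object(line, '$.custId')   AS cust_id,
       get_json_object(line, '$.activity') AS activity
FROM movie_raw;

-- json_tuple extracts several top-level keys from the same line in one call.
SELECT t.cust_id, t.movie_id, t.event_time, t.activity
FROM movie_raw r
LATERAL VIEW json_tuple(r.line, 'custId', 'movieId', 'time', 'activity') t
  AS cust_id, movie_id, event_time, activity;
```

json_tuple is usually the better fit when several keys are read from the same record, because the line is parsed once rather than once per get_json_object call.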

I would also recommend validating your JSON: Validate JSON
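If you would rather check the files from inside Hive, one possible approach is sketched below. It relies on get_json_object returning NULL when the input string is not valid JSON, and it assumes every well-formed record carries a non-null custId, so a record that is valid but lacks custId would also be listed. The table name movie_lines is hypothetical.

```sql
-- Hypothetical staging table: one STRING column holding each raw JSON line.
CREATE EXTERNAL TABLE IF NOT EXISTS movie_lines (line STRING)
LOCATION '/user/oracle/movie/';

-- List a sample of lines that either fail to parse or have no custId value.
SELECT line
FROM movie_lines
WHERE get_json_object(line, '$.custId') IS NULL
LIMIT 20;
```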