We are using Hive 0.12 on CDH5. We use it to convert JSON records to a columnar format, using org.openx.data.jsonserde.JsonSerDe from https://github.com/rcongiu/Hive-JSON-Serde.
The external table we read from is defined as:
add jar json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=366;
set hive.stats.autogather=false;
use my_db;
drop table if exists my_table;
create table if not exists my_table (
...
)
partitioned by (
year string,
month string,
day string,
hour string
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
location '/user/camus/incoming/my_data/hourly';
alter table my_table add partition
(year='2014', month='08', day='26', hour='17')
location '/2014/08/26/17';
This code executes without any errors. However, when we query the table in Hive:
add jar json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar;
select * from my_table;
we get the following exception:
Added json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar to class path
Added resource: json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar
FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception
java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:68)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:624)
at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:80)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:497)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:352)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:995)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1038)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:357)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:465)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:125)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:66)
... 24 more
What could be the problem?
Answer 0 (score: 2)
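I had the same error with a different SerDe, but perhaps you can solve it the same way. I added the jar to /usr/lib/hive/lib on my machine (CentOS).
After restarting and launching Hive again, I used the add jar command to reference this jar in the lib directory. After doing that, it worked for me.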
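Applied to the table in the question, the session might look roughly like the sketch below. This is only an illustration under assumptions: it assumes the question's jar was copied to /usr/lib/hive/lib under its original file name and that Hive was restarted before the session began. The `list jars` check and the `limit 10` are additions for illustration and were not part of what I originally ran.

```sql
-- Assumption: json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar has already
-- been copied into /usr/lib/hive/lib and Hive was restarted afterwards.
add jar /usr/lib/hive/lib/json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar;

-- Optional check: show the jars registered as resources in this session.
list jars;

-- Query the table from the question; the limit only keeps the test cheap.
use my_db;
select * from my_table limit 10;
```

Referencing the jar by its absolute path avoids depending on the directory the Hive CLI was started from.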