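I am reading a single line of JSON data from Twitter in a Hive external table. The table is created, but I get an error when reading the data. I want to read the hashtags. I followed these steps: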
hive (test)> add jar /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;
Added /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar
Data in the file:
hive (test)> dfs -cat abhijit_hdfs/flume2/tweets/Twitter_test.js;
"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}]}
DDL statement:
hive (test)> create external table tt4
> (entities struct<hashtags:array<struct<text:string>>>)
> row format serde 'com.cloudera.hive.serde.JSONSerDe'
> LOCATION '/user/training/abhijit_hdfs/flume2/tweets/' ;
OK
Time taken: 0.193 seconds.
hive (test)> select * from tt4;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.map.JsonMappingException: Can not deserialize instance of java.util.LinkedHashMap out of VALUE_STRING token
at [Source: java.io.StringReader@1cc892e; line: 1, column: 1]
Time taken: 0.384 seconds
Please advise.
Answer 0: (score: 0)
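This looks like a JSON serialization error rather than a Hadoop or Hive problem; the SerDe you are pointing to uses org.codehaus.jackson internally.
I tried parsing the JSON on its own and it gives this error: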
`Error: Parse error on line 1:"entities":{"symbols":[],"urls
----------^
Expecting 'EOF', '}', ',', ']', got ':'`
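I have not tried the whole setup, but the JSON seems to be missing the { at the start and the matching } at the end. Here is well-formed, parseable JSON: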
{"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}]}}
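To check a file before pointing a Hive table at it, each line can be tested to see whether it parses as a JSON object on its own. The following is a minimal sketch in Python, which is not part of the original setup; the function name check_json_object_lines is hypothetical, and the two sample strings are the line from the question and the wrapped version above.

```python
import json


def check_json_object_lines(lines):
    """Return (line_number, problem) pairs for lines that are not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored by this check
        try:
            value = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        # A line such as "just a string" is valid JSON but not an object.
        if not isinstance(value, dict):
            problems.append(
                (number, f"top-level value is {type(value).__name__}, not an object")
            )
    return problems


# The line as stored in the file (from the question), without outer braces.
original = '"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}]}'
# The same line wrapped in { ... } as suggested in this answer.
wrapped = "{" + original + "}"

print(check_json_object_lines([original]))  # one problem reported for line 1
print(check_json_object_lines([wrapped]))   # no problems
print(json.loads(wrapped)["entities"]["hashtags"][0]["text"])  # AchieveMore
```

The unwrapped line fails because a parser reads the leading "entities" as a complete string value and then finds unexpected text after it, which is consistent with the VALUE_STRING token mentioned in the Hive exception.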
Answer 1: (score: 0)
It does work after adding the surrounding curly braces ({...}), when using the hcatalog JsonSerDe:
create external table tt4
(
entities struct<hashtags:array<struct<text:string>>>
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
;
select * from tt4
;
+---------------------------------------+
| entities |
+---------------------------------------+
| {"hashtags":[{"text":"AchieveMore"}]} |
+---------------------------------------+
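JsonSerDe for JSON files is available in Hive 0.12 and later.
In some distributions, a reference to hive-hcatalog-core.jar is required. ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;
...
The JsonSerDe was moved to Hive from HCatalog, and before that it was in the hive-contrib project. It was added to the Hive distribution by HIVE-4895.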
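Answer 2: (score: 0)
Dear friends, this problem is now solved. I downloaded and saved the jar below and restarted my Cloudera VM (non-commercial use). Thank you for your help; it pointed me in the right direction.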
hive> add jar /usr/lib/hive/lib/json-serde-1.3.6-jar-with-dependencies.jar;
Added /usr/lib/hive/lib/json-serde-1.3.6-jar-with-dependencies.jar to class path
Added resource: /usr/lib/hive/lib/json-serde-1.3.6-jar-with-dependencies.jar
hive> create external table t24
(entities struct<hashtags:array<struct<text:string>>>) row format serde 'org.openx.data.jsonserde.JsonSerDe' LOCATION '/user/training/abhijit_hdfs/flume4/tweets/' ;
OK
Time taken: 1.623 seconds
hive> select * from t24;
OK
{"hashtags":[{"text":"AchieveMore"}]}
null
Time taken: 1.13 seconds
hive>