I created a table with a Hive query in the Cloudera VM. Below is the DDL that creates the table named incremental_tweets.
CREATE EXTERNAL TABLE incremental_tweets (
id BIGINT,
created_at STRING,
source STRING,
favorited BOOLEAN,
retweet_count INT,
retweeted_status STRUCT<
text:STRING,
user:STRUCT<screen_name:STRING,name:STRING>>,
entities STRUCT<
urls:ARRAY<STRUCT<expanded_url:STRING>>,
user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
hashtags:ARRAY<STRUCT<text:STRING>>>,
text STRING,
user STRUCT<
screen_name:STRING,
name:STRING,
friends_count:INT,
followers_count:INT,
statuses_count:INT,
verified:BOOLEAN,
utc_offset:INT,
time_zone:STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/incremental/';
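After I ran this in the Hue Hive editor, the table was created successfully. The problem is that I cannot run a SELECT statement against it; doing so raises the following error.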
SELECT Statement
Select id, entities.user_mentions.name FROM incremental_tweets;
ERROR
Error while processing statement: FAILED: Execution Error, return code 2
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
In addition, since the Hue editor offers autocompletion, below is the statement it produced and the error that statement gives.
Statement
Select id, entities.`,user_mentions`.name FROM incremental_tweets;
ERROR
Error while compiling statement: FAILED: RuntimeException cannot find field
,user_mentions(lowercase form: ,user_mentions) in [urls, user_mentions,
hashtags]
What is the correct SELECT statement? Am I missing some syntax?
Answer 0 (score: 0)
user_mentions is an array of structs. You can only address the fields of an inner struct by specifying an array index:
entities.user_mentions[0].name --get name from first array element
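As a minimal sketch, the indexed expression above could be placed in a full query against the table from the question like this. The column alias first_mention_name is only illustrative and does not appear in the original DDL.

```sql
-- Name of the first mentioned user for each tweet.
-- first_mention_name is a hypothetical alias chosen for this example.
SELECT id,
       entities.user_mentions[0].name AS first_mention_name
FROM incremental_tweets;
```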
If you want to select all array elements, use explode() with a lateral view:
select id, user_mention.name
from incremental_tweets
lateral view outer explode(entities.user_mentions) s as user_mention
;
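The same pattern can return more than one field per mention. The sketch below assumes the table and struct fields from the question's DDL; the output aliases mention_screen_name and mention_name are hypothetical. With the outer keyword, a tweet whose user_mentions array is empty or NULL still produces one row, with NULL in the mention columns; without it, such tweets are dropped from the result.

```sql
-- One output row per (tweet, mentioned user) pair.
-- LATERAL VIEW OUTER keeps tweets that have no mentions.
SELECT id,
       user_mention.screen_name AS mention_screen_name,
       user_mention.name        AS mention_name
FROM incremental_tweets
LATERAL VIEW OUTER explode(entities.user_mentions) s AS user_mention;
```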