I am trying to create a Hive schema to analyze JSON data stored in HDFS.
I am following this blog to create the Hive tables, and below is my Schema.hql:
CREATE EXTERNAL TABLE base_tweets4 (
`id` BIGINT,
created_at STRING,
`source` STRING,
favorited BOOLEAN,
retweet_count INT,
retweeted_status STRUCT<
text:STRING,
`user`:STRUCT<screen_name:STRING,name:STRING>>,
`entities` STRUCT<
urls:ARRAY<STRUCT<expanded_url:STRING>>,
user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
hashtags:ARRAY<STRUCT<text:STRING>>>,
text STRING,
`user` STRUCT<
screen_name:STRING,
name:STRING,
friends_count:INT,
followers_count:INT,
statuses_count:INT,
verified:BOOLEAN,
utc_offset:INT,
time_zone:STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/base/';
CREATE EXTERNAL TABLE incremental_tweets4 (
`id` BIGINT,
created_at STRING,
`source` STRING,
favorited BOOLEAN,
retweet_count INT,
retweeted_status STRUCT<
text:STRING,
`user`:STRUCT<screen_name:STRING,name:STRING>>,
`entities` STRUCT<
urls:ARRAY<STRUCT<expanded_url:STRING>>,
user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
hashtags:ARRAY<STRUCT<text:STRING>>>,
text STRING,
`user` STRUCT<
screen_name:STRING,
name:STRING,
friends_count:INT,
followers_count:INT,
statuses_count:INT,
verified:BOOLEAN,
utc_offset:INT,
time_zone:STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/incremental/';
CREATE VIEW reconcile_view AS
SELECT t1.* FROM
(SELECT * FROM base_tweets4
UNION ALL
SELECT * FROM incremental_tweets4) t1
JOIN
(SELECT id FROM
(SELECT * FROM base_tweets4
UNION ALL
SELECT * FROM incremental_tweets4) t2
GROUP BY id) s
ON t1.id = s.id
CREATE TABLE candidate_score (
candidate_name STRING,
sentiment_score DOUBLE
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/candidate_score/';
When I run the script above, I get the following error:
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.1.0-cdh5.13.0.jar!/hive-log4j.properties
OK
Time taken: 49.294 seconds
OK
Time taken: 3.19 seconds
FAILED: ParseException line 21:0 missing EOF at 'CREATE' near 'id'
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.
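Searching through various blogs, I found that this error can occur when reserved keywords
are used as column names, and that it can be fixed by wrapping those names in backticks.
That does not seem to have worked here, though. I am probably missing something that causes this error.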
Answer 0 (score: 1)
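I tried the DDL for both tables and it worked for me without any modification. Could you try again? If possible, please attach the JSON file so that I can try it end to end.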
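One more thing that may be worth checking. This is only an assumption based on the posted log, not something I have reproduced. The two OK lines would match the two CREATE EXTERNAL TABLE statements, and the parser then complains about finding 'CREATE' where it expected the end of a statement, right after 'id'. In the posted script, the CREATE VIEW statement ends with `ON t1.id = s.id` and has no terminating semicolon before CREATE TABLE candidate_score, so Hive may be reading the two as a single statement. A minimal sketch of the last two statements with the terminator added, otherwise unchanged from the question:

```sql
-- Sketch only: the view from the question with a statement terminator added
-- after the join condition. Table and column names are taken from the post.
CREATE VIEW reconcile_view AS
SELECT t1.* FROM
  (SELECT * FROM base_tweets4
   UNION ALL
   SELECT * FROM incremental_tweets4) t1
JOIN
  (SELECT id FROM
    (SELECT * FROM base_tweets4
     UNION ALL
     SELECT * FROM incremental_tweets4) t2
   GROUP BY id) s
ON t1.id = s.id;

-- With the view terminated above, this is parsed as its own statement.
CREATE TABLE candidate_score (
  candidate_name STRING,
  sentiment_score DOUBLE
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/candidate_score/';
```

If the error persists after that change, the backticked column names are probably still not the cause, since the two table definitions that use them already ran successfully.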