Unable to debug HQL script

Time: 2018-05-04 03:08:34

Tags: hadoop hive hiveql

I am trying to create a Hive schema to analyze JSON data stored in HDFS. I am following this blog to create the Hive tables, and below is my Schema.hql:

CREATE EXTERNAL TABLE base_tweets4 (
`id` BIGINT,
created_at STRING,
`source` STRING,
favorited BOOLEAN,
retweet_count INT,
 retweeted_status STRUCT<
  text:STRING,
  `user`:STRUCT<screen_name:STRING,name:STRING>>,
`entities` STRUCT<
  urls:ARRAY<STRUCT<expanded_url:STRING>>,
  user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
  hashtags:ARRAY<STRUCT<text:STRING>>>,
 text STRING,
`user` STRUCT<
  screen_name:STRING,
  name:STRING,
  friends_count:INT,
  followers_count:INT,
  statuses_count:INT,
  verified:BOOLEAN,
  utc_offset:INT,
  time_zone:STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/base/';


CREATE EXTERNAL TABLE incremental_tweets4 (
 `id` BIGINT,
  created_at STRING,
 `source` STRING,
  favorited BOOLEAN,
  retweet_count INT,
  retweeted_status STRUCT<
  text:STRING,
  `user`:STRUCT<screen_name:STRING,name:STRING>>,
  `entities` STRUCT<
  urls:ARRAY<STRUCT<expanded_url:STRING>>,
  user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
  hashtags:ARRAY<STRUCT<text:STRING>>>,
  text STRING,
  `user` STRUCT<
  screen_name:STRING,
  name:STRING,
  friends_count:INT,
  followers_count:INT,
  statuses_count:INT,
  verified:BOOLEAN,
  utc_offset:INT,
  time_zone:STRING>,
  in_reply_to_screen_name STRING
 )
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/incremental/';



CREATE VIEW reconcile_view AS
SELECT t1.* FROM
(SELECT * FROM base_tweets4
 UNION ALL
 SELECT * FROM incremental_tweets4) t1
 JOIN
 (SELECT id FROM
  (SELECT * FROM base_tweets4
  UNION ALL
   SELECT * FROM incremental_tweets4) t2
   GROUP BY id) s
     ON t1.id = s.id




  CREATE TABLE candidate_score (
  candidate_name STRING,
   sentiment_score DOUBLE
   )
  ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
  LOCATION '/twitteranalytics/candidate_score/';

When I execute the script above, I get the following error:

Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.1.0-cdh5.13.0.jar!/hive-log4j.properties
OK
Time taken: 49.294 seconds
OK
Time taken: 3.19 seconds
FAILED: ParseException line 21:0 missing EOF at 'CREATE' near 'id'
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.

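While searching through various blogs, I found that this error can occur when keywords are used as column names, and that it can be fixed by wrapping those names in backticks. That did not seem to work, though. I am probably missing something that is causing this error.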

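1 Answer:

Answer 0: (score: 1)

I tried the DDL for both tables and it worked for me without any modification. Could you try again? If possible, please attach the JSON file so that I can try it end to end.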
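
One more thing worth checking, judging only from the log in the question: the two `OK` lines match the two `CREATE EXTERNAL TABLE` statements, so the failure appears to come from the third statement. In the posted script, `CREATE VIEW reconcile_view` has no terminating semicolon after `ON t1.id = s.id`, so Hive would keep reading into `CREATE TABLE candidate_score` and treat both as one statement, which fits the message `missing EOF at 'CREATE'`. A minimal sketch of the tail of the script with the terminator added, assuming nothing else needs to change (I have not run it against your data):

```sql
-- Sketch only: same view and table as in the question.
-- The one change is the semicolon that ends the CREATE VIEW statement.
CREATE VIEW reconcile_view AS
SELECT t1.* FROM
  (SELECT * FROM base_tweets4
   UNION ALL
   SELECT * FROM incremental_tweets4) t1
JOIN
  (SELECT id FROM
    (SELECT * FROM base_tweets4
     UNION ALL
     SELECT * FROM incremental_tweets4) t2
   GROUP BY id) s
ON t1.id = s.id;

-- With the view terminated, this is parsed as a separate statement.
CREATE TABLE candidate_score (
  candidate_name STRING,
  sentiment_score DOUBLE
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/twitteranalytics/candidate_score/';
```

If the error persists after this change, running the statements one at a time in the Hive shell should show which one the parser rejects.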