Unable to create table in Hive

Time: 2018-10-07 15:12:05

Tags: hive ubuntu-16.04 hiveql

I am trying to create a table in Hive 3.0 using the following schema, which I found online:

    CREATE TABLE tweets (
    id BIGINT,
    created_at STRING,
    source STRING,
    favorited BOOLEAN,
    retweeted_status STRUCT< text : STRING, user : STRUCT<screen_name : STRING,name : STRING>, retweet_count : INT>,
    entities STRUCT< urls : ARRAY<STRUT<expanded_url : STRING>>,
    user_mentions : ARRAY<STRUCT<screen_name : STRING,name : STRING>>,
    hashtags : ARRAY<STRUCT<text : STRING>>>,
    text STRING,
    user STRUCT< screen_name : STRING, name : STRING, friends_count : INT, followers_count : INT, statuses_count : INT, verified : BOOLEAN, utc_offset : INT, time_zone : STRING>,
    in_reply_to_screen_name STRING
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JSONSerDe';


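When I press Enter, I get a NoViableAltException. I am using Hive for the first time and have no experience with it. Can someone tell me what is wrong with this schema?

1 answer:

Answer 0: (score: 1)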

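`user` is a reserved keyword. Whenever we use a keyword as an identifier in Hive, we need to enclose it in ` (backticks).

Example:

`user`

Try the create table statement with `user` enclosed in backticks.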
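
Applied to the schema from the question, the statement would look roughly like the sketch below. It is reconstructed from the question's DDL and the `desc tweets;` output, not copied from a tested session. `user` is quoted both as a column and as the nested field inside `retweeted_status` (the latter as a precaution), `STRUT` is spelled `STRUCT` to match the `desc` output, and the SerDe class is written `JsonSerDe`, which is my understanding of how the HCatalog class is spelled.

```sql
-- Sketch of the corrected DDL: `user` is quoted wherever it is used as a name.
-- Assumes the hive-hcatalog-core jar (which provides JsonSerDe) is on the classpath.
CREATE TABLE tweets (
  id BIGINT,
  created_at STRING,
  source STRING,
  favorited BOOLEAN,
  retweeted_status STRUCT<text : STRING, `user` : STRUCT<screen_name : STRING, name : STRING>, retweet_count : INT>,
  entities STRUCT<urls : ARRAY<STRUCT<expanded_url : STRING>>,
                  user_mentions : ARRAY<STRUCT<screen_name : STRING, name : STRING>>,
                  hashtags : ARRAY<STRUCT<text : STRING>>>,
  text STRING,
  `user` STRUCT<screen_name : STRING, name : STRING, friends_count : INT, followers_count : INT, statuses_count : INT, verified : BOOLEAN, utc_offset : INT, time_zone : STRING>,
  in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
```

Running `desc tweets;` afterwards should list the columns with their types.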

desc tweets;
+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
|         col_name         |                                                                     data_type                                                                     |      comment       |
+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
| id                       | bigint                                                                                                                                            | from deserializer  |
| created_at               | string                                                                                                                                            | from deserializer  |
| source                   | string                                                                                                                                            | from deserializer  |
| favorited                | boolean                                                                                                                                           | from deserializer  |
| retweeted_status         | struct<text:string,user:struct<screen_name:string,name:string>,retweet_count:int>                                                                 | from deserializer  |
| entities                 | struct<urls:array<struct<expanded_url:string>>,user_mentions:array<struct<screen_name:string,name:string>>,hashtags:array<struct<text:string>>>   | from deserializer  |
| text                     | string                                                                                                                                            | from deserializer  |
| user                     | struct<screen_name:string,name:string,friends_count:int,followers_count:int,statuses_count:int,verified:boolean,utc_offset:int,time_zone:string>  | from deserializer  |
| in_reply_to_screen_name  | string                                                                                                                                            | from deserializer  |
+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+

I was able to create the table using this DDL, as the `desc tweets;` output above shows.


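Update:

Hive works as schema-on-read. When you run a select statement, Hive looks for files in the directory the table points to (/user/hive/warehouse/tweets/) and reads that data according to your DDL statement. In this case no data exists in that directory, so the select statement does not return any records.

To fix this: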

Option 1. Move the data from /user/flume/tweets/ into the /user/hive/warehouse/tweets/ directory; then you will be able to select data from the table.

    hadoop fs -mv /user/flume/tweets/* /user/hive/warehouse/tweets/

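(or)

Option 2. We need to create the Hive table on top of the /user/flume/tweets/ directory; then you will be able to see the data in the tweets table (use the corrected create table statement for this).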
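
For option 2, one way to put a table on top of the Flume directory is a `LOCATION` clause. The sketch below makes a few assumptions that are not in the answer itself: the table is declared `EXTERNAL` (so dropping it later leaves the files in place), it is created with `LIKE` to copy the columns and SerDe from an existing `tweets` table instead of repeating the full DDL, and `tweets_ext` is a hypothetical name.

```sql
-- Sketch for option 2. `tweets_ext` is a hypothetical name; `tweets` is assumed
-- to be the table created with the corrected DDL (backticked `user`).
CREATE EXTERNAL TABLE tweets_ext LIKE tweets
LOCATION '/user/flume/tweets/';

-- Quick check that Hive now reads the files Flume wrote.
SELECT id, `user`.screen_name, text
FROM tweets_ext
LIMIT 5;
```

If you would rather keep the name `tweets`, the full create table statement with `LOCATION '/user/flume/tweets/'` appended should serve the same purpose.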