I am trying to create a table in Hive 3.0 using the following schema, which I found online:
CREATE TABLE tweets (
id BIGINT,
created_at STRING,
source STRING,
favorited BOOLEAN,
retweeted_status STRUCT< text : STRING, user : STRUCT<screen_name : STRING,name : STRING>, retweet_count : INT>,
entities STRUCT< urls : ARRAY<STRUCT<expanded_url : STRING>>,
user_mentions : ARRAY<STRUCT<screen_name : STRING,name : STRING>>,
hashtags : ARRAY<STRUCT<text : STRING>>>,
text STRING,
user STRUCT< screen_name : STRING, name : STRING, friends_count : INT, followers_count : INT, statuses_count : INT, verified : BOOLEAN, utc_offset : INT, time_zone : STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
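When I press Enter I get a NoViableAltException. This is my first time using Hive and I have no experience with it. Can someone tell me what is wrong with this schema?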
Answer 0 (score: 1)
`user` is a reserved keyword. If you use a reserved keyword as an identifier in Hive, you need to enclose it in backticks (`).
Example:
`user`
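Try the create table statement again with `user` enclosed in backticks. Describing the table afterwards gives: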
desc tweets;
+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
| col_name | data_type | comment |
+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
| id | bigint | from deserializer |
| created_at | string | from deserializer |
| source | string | from deserializer |
| favorited | boolean | from deserializer |
| retweeted_status | struct<text:string,user:struct<screen_name:string,name:string>,retweet_count:int> | from deserializer |
| entities | struct<urls:array<struct<expanded_url:string>>,user_mentions:array<struct<screen_name:string,name:string>>,hashtags:array<struct<text:string>>> | from deserializer |
| text | string | from deserializer |
| user | struct<screen_name:string,name:string,friends_count:int,followers_count:int,statuses_count:int,verified:boolean,utc_offset:int,time_zone:string> | from deserializer |
| in_reply_to_screen_name | string | from deserializer |
+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
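For reference, a sketch of what the corrected statement would look like. It assumes the only changes needed are quoting `user` in both places it appears (the top-level column and the field inside `retweeted_status`) and spelling `STRUCT` correctly inside `urls`. The SerDe class is written `JsonSerDe`, which is the class name shipped in hive-hcatalog-core. Treat it as a reconstruction consistent with the `desc tweets;` output, not a verified copy of the original DDL.

```sql
-- Sketch: same schema as in the question, with the reserved word `user` quoted.
CREATE TABLE tweets (
  id BIGINT,
  created_at STRING,
  source STRING,
  favorited BOOLEAN,
  retweeted_status STRUCT<
    text : STRING,
    `user` : STRUCT<screen_name : STRING, name : STRING>,  -- quoted struct field
    retweet_count : INT>,
  entities STRUCT<
    urls : ARRAY<STRUCT<expanded_url : STRING>>,
    user_mentions : ARRAY<STRUCT<screen_name : STRING, name : STRING>>,
    hashtags : ARRAY<STRUCT<text : STRING>>>,
  text STRING,
  `user` STRUCT<                                            -- quoted column name
    screen_name : STRING,
    name : STRING,
    friends_count : INT,
    followers_count : INT,
    statuses_count : INT,
    verified : BOOLEAN,
    utc_offset : INT,
    time_zone : STRING>,
  in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
```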
I was able to create the table with this DDL, as the `desc tweets;` output above shows.
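Update:
Hive works as schema on read. When you run a select statement, Hive looks for files in the directory the table points to (/user/hive/warehouse/tweets/) and reads that data according to your DDL statement. In this case no data exists in that directory, so the select statement returns no records.
To fix this: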
Option 1. Move the data from /user/flume/tweets/ into the /user/hive/warehouse/tweets/ directory; you will then be able to select the data from the table.
hadoop fs -mv /user/flume/tweets/ /user/hive/warehouse/tweets/
(or)
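Option 2. Create the Hive table on top of the /user/flume/tweets/ directory; you will then be able to see the data in the tweets table (use the same create table statement, with `user` in backticks, for this).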
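Option 2 could look like the following, as a sketch. It assumes the empty `tweets` table has been dropped first (or was never created). It also uses `EXTERNAL` so that dropping the table later leaves the Flume files in place; the answer only says to create the table on top of the directory, so `EXTERNAL` is an added assumption.

```sql
-- Sketch: point the table at the directory Flume writes to,
-- instead of the default warehouse location.
CREATE EXTERNAL TABLE tweets (
  id BIGINT,
  created_at STRING,
  source STRING,
  favorited BOOLEAN,
  retweeted_status STRUCT<
    text : STRING,
    `user` : STRUCT<screen_name : STRING, name : STRING>,
    retweet_count : INT>,
  entities STRUCT<
    urls : ARRAY<STRUCT<expanded_url : STRING>>,
    user_mentions : ARRAY<STRUCT<screen_name : STRING, name : STRING>>,
    hashtags : ARRAY<STRUCT<text : STRING>>>,
  text STRING,
  `user` STRUCT<
    screen_name : STRING,
    name : STRING,
    friends_count : INT,
    followers_count : INT,
    statuses_count : INT,
    verified : BOOLEAN,
    utc_offset : INT,
    time_zone : STRING>,
  in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/flume/tweets/';  -- directory named in the answer

-- Quick check that Hive now sees the files in that directory.
SELECT id, text FROM tweets LIMIT 5;
```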