Unable to get the desired result after loading data in Hive

Time: 2017-06-30 12:14:47

Tags: hadoop hive hiveql

When I load data like the following, the values of the genre column are not displayed:

1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale 1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance

I tried creating the table with the following DDL:

create table movie(movie_id int, movie_name string , genre string) row format delimited fields terminated by '::';

and

create table movie(movie_id int, movie_name string , genre array<string>) row format delimited fields terminated by '::' collection items terminated by '|';

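But the genre column is still blank.

Please help me create the table correctly so that the data loads and I get the desired result.

The output is: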

hive> select * from movie;

OK
1               Toy Story (1995)
2               Jumanji (1995)
3               Grumpier Old Men (1995)
4               Waiting to Exhale 1995)
5               Father of the Bride Part II (1995)
6               Heat (1995)
7               Sabrina (1995)

1 Answer:

Answer 0: (score: 0)

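You have to use the MultiDelimitSerDe. With row format delimited, Hive honors only a single-character field delimiter, so '::' is effectively treated as ':'. Each line is then split on every colon: the empty string between the two colons lands in movie_name and the title lands in genre, which is consistent with the output you posted.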
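
To see the effect of a one-character delimiter, here is a minimal sketch using Hive's built-in split function on one of the sample lines. It only mimics the splitting and is not a trace of what the SerDe does internally; the literal is copied from the question's data.

```sql
-- Split one sample line on a single ':' to mimic a one-character field delimiter.
-- The empty string between the two colons becomes its own element,
-- so the title is the third element rather than the second.
select split('6::Heat (1995)::Action|Crime|Thriller', ':');
```

Mapping the first three elements onto (movie_id, movie_name, genre) gives 6, an empty string, and the title, so the real genre text never reaches the genre column.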

P.S.
You might consider loading genre as array<string> instead of string

create external table movie
(
    movie_id    int
   ,movie_name  string 
   ,genre       array<string>
) 
    row format serde 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
    with serdeproperties 
    (
        "field.delim"       = "::"
       ,"collection.delim"  = "|"
    )
;
select * from movie
;
+----------+------------------------------------+--------------------------------------+
| movie_id |             movie_name             |                genre                 |
+----------+------------------------------------+--------------------------------------+
|        1 | Toy Story (1995)                   | ["Animation","Children's","Comedy"]  |
|        2 | Jumanji (1995)                     | ["Adventure","Children's","Fantasy"] |
|        3 | Grumpier Old Men (1995)            | ["Comedy","Romance"]                 |
|        4 | Waiting to Exhale 1995)            | ["Comedy","Drama"]                   |
|        5 | Father of the Bride Part II (1995) | ["Comedy"]                           |
|        6 | Heat (1995)                        | ["Action","Crime","Thriller"]        |
|        7 | Sabrina (1995)                     | ["Comedy","Romance"]                 |
+----------+------------------------------------+--------------------------------------+
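
Once genre is an array<string>, it can be queried with Hive's built-in collection functions. The queries below are a sketch against the movie table defined above; the aliases g, genre_name and movie_count are names chosen here for illustration.

```sql
-- Movies tagged as Comedy (array_contains is a built-in Hive function)
select movie_id, movie_name
from movie
where array_contains(genre, 'Comedy');

-- Produce one row per (movie, genre) pair, then count movies per genre
select g.genre_name, count(*) as movie_count
from movie
lateral view explode(genre) g as genre_name
group by g.genre_name;
```

Queries like these are awkward with a plain string column, which is the main reason to prefer the array type here.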