I can't get the genre column values to show up when I load data like the following:
1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale 1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance
I tried creating the table with this DDL:
create table movie(movie_id int, movie_name string , genre string) row format delimited fields terminated by '::';
and
create table movie(movie_id int, movie_name string , genre array<string>) row format delimited fields terminated by '::' collection items terminated by '|';
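But the genre column is still blank.
Please help me create the table correctly so that the data loads and I get the desired result.
The output is: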
hive> select * from movie;
OK
1 Toy Story (1995)
2 Jumanji (1995)
3 Grumpier Old Men (1995)
4 Waiting to Exhale 1995)
5 Father of the Bride Part II (1995)
6 Heat (1995)
7 Sabrina (1995)
Answer 0 (score: 0)
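Hive's default delimited row format only honors a single-character field delimiter, so '::' is not treated as one separator. MultiDelimitSerDe accepts multi-character delimiters, as in the DDL below.
P.S.
You might consider loading genre as array<string> rather than string.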
create external table movie
(
movie_id int
,movie_name string
,genre array<string>
)
row format serde 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
with serdeproperties
(
"field.delim" = "::"
,"collection.delim" = "|"
)
;
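If the table has no data behind it yet, one way to populate it is a load statement along these lines. This is only a sketch: the file path /tmp/movies.dat is a hypothetical placeholder for wherever the '::'-delimited rows from the question are stored.

```sql
-- Hypothetical local file holding the '::'-delimited rows from the question.
-- Replace the path with the real location of your data file.
load data local inpath '/tmp/movies.dat' into table movie;
```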
select * from movie
;
+----------+------------------------------------+--------------------------------------+
| movie_id | movie_name | genre |
+----------+------------------------------------+--------------------------------------+
| 1 | Toy Story (1995) | ["Animation","Children's","Comedy"] |
| 2 | Jumanji (1995) | ["Adventure","Children's","Fantasy"] |
| 3 | Grumpier Old Men (1995) | ["Comedy","Romance"] |
| 4 | Waiting to Exhale 1995) | ["Comedy","Drama"] |
| 5 | Father of the Bride Part II (1995) | ["Comedy"] |
| 6 | Heat (1995) | ["Action","Crime","Thriller"] |
| 7 | Sabrina (1995) | ["Comedy","Romance"] |
+----------+------------------------------------+--------------------------------------+
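
Once genre is an array<string>, it can be queried with Hive's built-in array functions. The queries below are a minimal sketch against the movie table defined above; the aliases t, g and movies are names chosen only for illustration.

```sql
-- One output row per (movie, genre) pair.
select movie_id, movie_name, g
from movie
lateral view explode(genre) t as g;

-- Movies that list 'Comedy' among their genres.
select movie_id, movie_name
from movie
where array_contains(genre, 'Comedy');

-- Number of movies per genre.
select g, count(*) as movies
from movie
lateral view explode(genre) t as g
group by g;
```

Keeping genre as an array avoids string matching with like '%Comedy%', which would also match any genre name that merely contains that substring.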