Question

Input data

+----------------------+--------------------------------+
|      movie_name      |             Genres             |
+----------------------+--------------------------------+
| digimon              | Adventure|Animation|Children's |
| Slumber_Party_Massac | Horror                         |
+----------------------+--------------------------------+

I need output like this:

+----------------------+--------------------------------+-----------------+
|      movie_name      |             Genres             | count_of_genres |
+----------------------+--------------------------------+-----------------+
| digimon              | Adventure|Animation|Children's |               3 |
| Slumber_Party_Massac | Horror                         |               1 |
+----------------------+--------------------------------+-----------------+

Answer 1

select  *
       ,size(split(coalesce(Genres,''),'[^|\\s]+'))-1  as count_of_genres

from    mytable

This solution covers several edge cases, including:

NULL values
Empty strings
Empty tokens (e.g. Adventure||Animation or Adventure| |Animation)
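
To see why this works: the regular expression matches the genre tokens themselves rather than the delimiters, so split() returns the pieces between tokens, which is one more than the number of tokens. This relies on split() keeping leading and trailing empty strings, which the query above depends on. A minimal sketch for checking the listed cases, using hypothetical inline rows instead of mytable (the expected counts in the comments follow from that reasoning and were not taken from a real run):

```sql
-- Hypothetical test rows; only the Genres column from the question is modeled.
select  Genres
       ,size(split(coalesce(Genres,''),'[^|\\s]+'))-1  as count_of_genres
from    (
            select "Adventure|Animation|Children's" as Genres  -- expected 3
            union all
            select 'Horror' as Genres                          -- expected 1
            union all
            select 'Adventure||Animation' as Genres            -- expected 2
            union all
            select 'Adventure| |Animation' as Genres           -- expected 2
            union all
            select '' as Genres                                -- expected 0
            union all
            select cast(null as string) as Genres              -- expected 0
        ) t;
```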

Answer 2

This is a very, very bad way to store data. You should have a separate MovieGenres table, with one row per movie and per genre.

One method uses length() and replace():

select t.*,
       (1 + length(genres) - length(replace(genres, '|', ''))) as num_genres
from t;

This assumes that each movie has at least one genre. If not, you need to test for that as well.
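
One possible way to add that test, as a sketch: return 0 when genres is NULL or blank, and otherwise count the delimiters plus one. The table t and column genres come from the query above; treating a whitespace-only value as "no genres" is an assumption. Note that, unlike the split-based query in Answer 1, this still counts empty tokens such as Adventure||Animation as genres.

```sql
select t.*,
       case
           -- Assumption: NULL, empty, or whitespace-only means the movie has no genres.
           when genres is null or trim(genres) = '' then 0
           -- One more genre than the number of '|' delimiters.
           else 1 + length(genres) - length(replace(genres, '|', ''))
       end as num_genres
from t;
```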

How to count the number of words in each column delimited by the "|" separator using Hive?
