Redshift: count the items in a comma-separated column

Time: 2019-10-08 01:03:10

Tags: sql amazon-redshift

I have a column that stores a set of numbers as a comma-separated string.

I know I can split the column and then count, but is there any other way to count the numbers?

| user | col                          |
| ---- | ---------------------------- |
| 1    | 3,7,11,25,44,56,77,32,34,55  |
| 2    | 3,7,25,44,37,89,56,99,103,13 |
| 1    | 3,10,11,25,44,56,33,32,34,55 |

4 answers:

Answer 0: (score: 0)

This answers the original version of the question.

You can count the number of comma-separated values with:

select (case when col = '' then 0
             else length(col) - length(replace(col, ',', '')) + 1
        end) as values_count       
from t;
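
As a quick check of the arithmetic, the same expression can be run against a literal. The sample string is only an illustration and is not taken from the table in the question: '3,7,11' has 6 characters, removing its 2 commas leaves 4, and 6 - 4 + 1 gives 3 values.

```sql
-- Illustrative literal: length 6, minus length 4 without commas, plus 1.
SELECT length('3,7,11') - length(replace('3,7,11', ',', '')) + 1 AS values_count;
-- values_count = 3
```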

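That said, you should fix your data model so that you are not storing multiple values in one column. Storing numbers as strings makes it even worse. What you need is a junction/association table.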
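
A minimal sketch of what such a junction table might look like. The table and column names (user_values, user_id, val) are hypothetical and chosen only for illustration, and only a few of the sample values are loaded.

```sql
-- Hypothetical junction table: one row per (user, value) pair
-- instead of one comma-separated string per row.
CREATE TABLE user_values (
    user_id INT NOT NULL,
    val     INT NOT NULL
);

INSERT INTO user_values (user_id, val) VALUES
    (1, 3), (1, 7), (1, 11),
    (2, 3), (2, 7), (2, 25);

-- Counting values per user is then a plain aggregate.
SELECT user_id, COUNT(*) AS values_count
FROM user_values
GROUP BY user_id;
```

With this layout the values are stored as integers, so no string manipulation is needed to count or filter them.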

Answer 1: (score: 0)

You can use a union query with SPLIT_PART:

WITH cte AS (
    -- "user" is quoted because USER is a reserved word in Redshift
    SELECT "user", SPLIT_PART(col, ',', 1) AS val FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 2) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 3) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 4) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 5) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 6) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 7) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 8) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 9) FROM yourTable UNION ALL
    SELECT "user", SPLIT_PART(col, ',', 10) FROM yourTable
)

SELECT
    "user",
    val,
    COUNT(*) AS cnt
FROM cte
GROUP BY
    "user",
    val;

Note, however, that what the CTE does is really just normalize your data, so that each user-value relationship occupies its own record. Ideally, you should change your table design and move away from storing CSV.

If you only want the number of values per user, use this as the final query instead:

SELECT
    "user",
    COUNT(*) AS cnt
FROM cte
GROUP BY
    "user";
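
One caveat with a fixed list of SPLIT_PART calls: when a row holds fewer values than there are calls, SPLIT_PART returns an empty string for the missing positions, and those empty strings would be counted too. A hedged sketch of filtering them out, using a made-up inline sample (sample_data, with a user_id column) and only three positions to keep it short:

```sql
WITH sample_data AS (
    -- Made-up rows for illustration; user 2 has only two values.
    SELECT 1 AS user_id, '3,7,11' AS col UNION ALL
    SELECT 2, '3,7'
),
cte AS (
    SELECT user_id, SPLIT_PART(col, ',', 1) AS val FROM sample_data UNION ALL
    SELECT user_id, SPLIT_PART(col, ',', 2) FROM sample_data UNION ALL
    SELECT user_id, SPLIT_PART(col, ',', 3) FROM sample_data
)
SELECT user_id, COUNT(*) AS cnt
FROM cte
WHERE val <> ''   -- drop positions beyond the end of a shorter list
GROUP BY user_id;
```

Without the WHERE clause, every user would be reported with as many values as there are SPLIT_PART branches, regardless of the actual list length.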

Answer 2: (score: 0)

Query:

with t as (
-- "user" is quoted because USER is a reserved word in Redshift
select 1 as "user", '3,7,11,25,44,56,77,32,34,55' as col
union all
select 2 as "user", '3,7,25,44,37,89,56,99,103,13' as col
union all
select 1 as "user", '3,10,11,25,44,56,33,32,34,55' as col
)
select a."user", a.val, count(*) as cnt
from (
    select a."user"
      , SPLIT_PART(a.col, ',', b.no) as val
    from t a
    cross join (
      -- generate_series runs only on the Redshift leader node, so this may fail against a regular table
      select * from generate_series(1,10) as no
    ) b
) a
group by a."user", a.val
order by a."user", a.val

Answer 3: (score: 0)

Use REGEXP_COUNT to count the commas in the string, then add 1.

CREATE TEMP TABLE examples (
      user_id INT
    , value_list VARCHAR 
);
INSERT INTO examples
          SELECT 1 , '3,7,11,25,44,56,77,32,34,55' 
UNION ALL SELECT 2 , '3,7,25,44,37,89,56,99,103,13'
UNION ALL SELECT 1 , '3,10,11,25,44,56,33,32,34,55'
;
SELECT user_id
     , SUM(REGEXP_COUNT(value_list,',')+1) value_count 
FROM examples
GROUP BY 1
;

Output

 user_id | value_count
---------+-------------
       1 |          20
       2 |          10
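
Note that the +1 assumes every list holds at least one value, so an empty string would be counted as 1. A possible guard, sketched against the examples table created above and not part of the original answer:

```sql
SELECT user_id
     , SUM(CASE WHEN value_list IS NULL OR value_list = '' THEN 0
                ELSE REGEXP_COUNT(value_list, ',') + 1
           END) AS value_count
FROM examples
GROUP BY 1
;
```

For the three sample rows this changes nothing, since each list has ten values; it only matters once NULL or empty lists appear.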