How to count the unique integers in a string using Hive?

Time: 2016-10-10 21:16:20

Tags: sql regex hadoop hive

I am trying to count the number of unique bytes in a string.

DATA (phone numbers, i.e. numeric bytes only):

1234567890
1111111112

Result:

10
2

I tried the following and it did not work, I think because sum() will not accept the UDF 'if'.

 select phone
 , sum(
        cast(if(length(regexp_replace(phone,'0',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'1',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'2',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'3',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'4',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'5',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'6',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'7',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'8',''))<10,'1','0') as int) +
        cast(if(length(regexp_replace(phone,'9',''))<10,'1','0') as int)         
       )  as unique_bytes
 from table;

I would also rather not use a regular expression as the solution.

1 answer:

Answer 0: (score: 2)

Use +, but like this:

select phone,
       ((case when phone like '%0%' then 1 else 0 end) +
        (case when phone like '%1%' then 1 else 0 end) +
        (case when phone like '%2%' then 1 else 0 end) +
        (case when phone like '%3%' then 1 else 0 end) +
        (case when phone like '%4%' then 1 else 0 end) +
        (case when phone like '%5%' then 1 else 0 end) +
        (case when phone like '%6%' then 1 else 0 end) +
        (case when phone like '%7%' then 1 else 0 end) +
        (case when phone like '%8%' then 1 else 0 end) +
        (case when phone like '%9%' then 1 else 0 end)
       ) as ints
 from table;

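Your code has several problems:

  • sum() is an aggregate function and is not needed here; you are adding values within a single row, not across rows.
  • if() is returning strings, but you want to add the values together.
  • I'm not sure why you are using regexp_replace() rather than replace().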
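
To make the points above concrete, here is a minimal sketch of how the original if()-based query could be reshaped: no sum(), integer results from if(), and a plain substring test via instr() in place of regexp_replace(). The table name phones and the alias unique_digits are hypothetical names used for illustration only, and the column is assumed to be a string containing digits.

```sql
-- Sketch only: `phones` is a hypothetical table with a string column `phone`.
-- instr(str, substr) returns the 1-based position of substr, or 0 if absent,
-- so each if() contributes 1 when the digit occurs at least once.
select phone,
       if(instr(phone, '0') > 0, 1, 0) +
       if(instr(phone, '1') > 0, 1, 0) +
       if(instr(phone, '2') > 0, 1, 0) +
       if(instr(phone, '3') > 0, 1, 0) +
       if(instr(phone, '4') > 0, 1, 0) +
       if(instr(phone, '5') > 0, 1, 0) +
       if(instr(phone, '6') > 0, 1, 0) +
       if(instr(phone, '7') > 0, 1, 0) +
       if(instr(phone, '8') > 0, 1, 0) +
       if(instr(phone, '9') > 0, 1, 0) as unique_digits
from phones;
```

For the sample rows in the question, this should give 10 for 1234567890 and 2 for 1111111112. Because the expression is evaluated per row, no group by is required.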