Substring extraction in Hive

Time: 2016-08-19 11:32:05

Tags: regex hive

We have a string:

ABC.XXXXXXX.USD.XX

The task is to extract the currency (USD in this case). The options I tried return nonsense:

select distinct 
    r.name
    ,regexp_extract(r.name,'\.(.{3})\.',1)
    ,split(r.name,'\.')
    ,split(r.name,'\.')[2]
from sales r

Output:

 ABC.XXXXXXX.USD.XX    BC.   ["","","","","","","","","","","","",""]  <empty>

What is going on?

2 answers:

Answer 0 (score: 0)

The split function in Hive is based on regular expressions. See the language manual on the wiki:

split(string str, string pat): Split str around pat (pat is a regular expression)

You specified the delimiter as '\.', which is an undefined escape sequence in a string literal and resolves to just '.'. In a regular expression that matches any character, so the string is split at every character, as well as at its start and end, and only empty strings remain.

You need to escape the dot with a literal backslash, which is written as '\\.' inside a string literal.
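The effect can be reproduced outside Hive. The following is a minimal sketch in Python, on the assumption that Python's re module behaves like the Java regular expressions Hive uses for patterns this simple. The variable names are illustrative and do not come from the original post.

```python
import re

name = "ABC.XXXXXXX.USD.XX"

# What the regex engine sees once the string literal '\.' has lost its
# backslash: a bare dot, which matches any character.
unescaped = re.split(r".", name)

# What it sees when the literal is written as '\\.': an escaped dot,
# which matches only a literal period.
escaped = re.split(r"\.", name)

print(unescaped)   # a list of empty strings only
print(escaped[2])  # USD

# The same applies to the extraction pattern from the question.
bad = re.search(r".(.{3}).", name).group(1)      # matches "ABC.X"
good = re.search(r"\.(.{3})\.", name).group(1)   # matches ".USD."
print(bad, good)
```

The unescaped extraction pattern captures "BC.", which is the value shown in the question's output, so both symptoms trace back to the same lost backslash.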

Answer 1 (score: 0)

Put the dot in square brackets, i.e. use [.] as the pattern instead of '\.'.
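A character class avoids the escaping problem entirely, because a dot inside square brackets is always literal. Here is a minimal sketch in Python, again assuming its re module matches Hive's Java regex behaviour for this pattern. The variable names are illustrative.

```python
import re

name = "ABC.XXXXXXX.USD.XX"

# Inside a character class the dot is literal, so no backslash is needed
# and the string-literal escaping rules no longer matter.
fields = re.split(r"[.]", name)
currency = fields[2]
print(currency)  # USD

# The same trick applied to the extraction pattern from the question.
match = re.search(r"[.](.{3})[.]", name)
extracted = match.group(1) if match else None
print(extracted)  # USD
```

In Hive the corresponding expression would presumably be split(r.name,'[.]')[2]; that exact form is an assumption and is not taken from the answer.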