如何使用REGEXP_SUBSTR函数从字符串中提取子字符串?

时间:2019-06-12 07:45:19

标签: sql regex teradata

我需要从错误日志中提取单词和单词序列。

下面的日志示例:

2019.06.08 14:32:36 ERR 10298587    2019-06-07  PROJECT_NAME    script.sql  4483    2646 HY000 [NCR] [Teradata DBMS] : No more spool space in aload50.
2019.06.08 14:32:36 ERR 10298587    2019-06-07  PROJECT_NAME    script.sql  4483    2646 HY000 [NCR] [Teradata DBMS] : No more spool space in aload50. (ef)
2019.06.08 14:32:36 ERR 10298587    2019-06-07  PROJECT_NAME    script.sql  4483    2646 HY000 [NCR] [Teradata DBMS] : No more spool space in dload50.
2019.06.08 14:32:36 ERR 10298587    2019-06-07  PROJECT_NAME    script.sql  4483    2646 HY000 [NCR] [Teradata DBMS] : No more spool space in dload50. (ef)
message=[NCR] [Teradata DBMS] : No more spool space in aload50. (ef)
message=[NCR] [Teradata DBMS] : No more spool space in dload50. (ef)
message=[NCR] [Teradata DBMS] : No more spool space in aload50. (ee)
message=[NCR] [Teradata DBMS] : No more spool space in dload50. (ee)

我需要提取子字符串:

error_log:

  

[Teradata DBMS]:aload50中没有更多的后台处理空间。

没有(例如)

和用户名: 例如:

  

aload50

用户名可以是:

aload01至aload999

dload01至dload999

select 
REGEXP_SUBSTR('2019.06.08 14:32:36  ERR 10298587    2019-06-07  PROJECT_NAME    script.sql  4483    2646 HY000 [NCR] error_message[Teradata DBMS] : No more spool space in aload50.',' regexp_for_error_log') AS error_log,
REGEXP_SUBSTR('2019.06.08 14:32:36  ERR 10298587    2019-06-07  PROJECT_NAME    script.sql  4483    2646 HY000 [NCR] [Teradata DBMS] : No more spool space in aload50.',' regexp_for_user_name') AS user_name,
FROM DUAL;

1 个答案:

答案 0 :(得分:2)

我们可以在此处将REGEXP_REPLACE与捕获组一起使用:

SELECT
    REGEXP_REPLACE(log, '.*(\[Teradata DBMS\] : .* [^.]+)\..*', '\1') AS error_log,
    REGEXP_REPLACE(log, '.*\[Teradata DBMS\] : .* ([^.]+)\..*', '\1') AS user_name
FROM yourTable;

Demo