I have a few rows in my table, like these:
row1: abc changed on 12 November, 2008 11:30 AM and its abc..region1
row2: defg updated 14 January, 2012 08:20 PM ......region2
row3: ghijkl corrected by 18 august, 2013 9:30 AM ..something..region3
My requirement is to get the date out of each row in a form like:
12 dec 2016 7:30 AM
So the query I built (taking row1 as an example) is:
select regexp_replace(
'abc changed on 12 November, 2008 11:30 AM and its abc..region1',
'([0-9]{2})([[:blank:]])(January|February|March|April|May|June|July|August|September|October|November|December)(,[[:blank:]])([0-9]{4})([[:blank:]])([0-9]{2}:[0-9]{2})([[:blank:]])(AM|PM)','\1-\3-\5 \7 \9',1,0,'i')
from dual;
Output:
abc changed on 12-November-2008 11:30 AM and its abc..region1
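I am happy with the query above, because it gives me the date as a formatted string. Even though this is not the final date format, I can pass this string to a function that converts it, does some region-specific processing, and finally returns a DATE. To that end, I first wrapped the replacement string in a function call (substr here):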
select regexp_replace(
'abc changed on 12 November, 2008 11:30 AM and its abc..region1',
'([0-9]{2})([[:blank:]])(January|February|March|April|May|June|July|August|September|October|November|December)(,[[:blank:]])([0-9]{4})([[:blank:]])([0-9]{2}:[0-9]{2})([[:blank:]])(AM|PM)',
substr('\1-\3-\5 \7 \9',1),
1,0,'i')
from dual;
Output:
abc changed on 12-November-2008 11:30 AM and its abc..region1
(works fine up to here)
Now I add to_date to convert the date string into an actual DATE, so that I can do some processing on it:
select regexp_replace(
'abc changed on 12 November, 2008 11:30 AM and its abc..region1',
'([0-9]{2})([[:blank:]])(January|February|March|April|May|June|July|August|September|October|November|December)(,[[:blank:]])([0-9]{4})([[:blank:]])([0-9]{2}:[0-9]{2})([[:blank:]])(AM|PM)',
to_date(substr('\1-\3-\5 \7 \9',1),'dd-mon-yyyy HH:MI AM'),
1,0,'i')
from dual;
This query raises an error:
ORA-01858: a non-numeric character found where a numeric was expected
I checked whether I was passing wrong arguments to to_date(), and tried the following, which works fine:
Select to_date('12-November-2008 11:30 AM','dd-mon-yyyy HH:MI AM')
from dual;
Output:
12-Nov-2008
(I am not worried about the time portion missing from the display; it is part of the date value anyway.)
To avoid confusion, I have numbered the subexpressions of the regular expression above:
([0-9]{2}) --> 1
([[:blank:]]) --> 2
(January|February|March|April|May|June|July|August|September|October|November|December) --> 3
(,[[:blank:]]) --> 4
([0-9]{4}) --> 5
([[:blank:]]) --> 6
([0-9]{2}:[0-9]{2}) --> 7
([[:blank:]]) --> 8
(AM|PM) --> 9
select regexp_replace(
'abc changed on 12 November, 2008 11:30 AM and its abc..region1',
'([0-9]{2})([[:blank:]])(January|February|March|April|May|June|July|August|September|October|November|December)(,[[:blank:]])([0-9]{4})([[:blank:]])([0-9]{2}:[0-9]{2})([[:blank:]])(AM|PM)','\1-\3-\5 \7 \9',1,0,'i')
from dual;
Answer 0 (score: 2)
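Assuming your strings always contain the date in that specific format (and there are no invalid dates, etc.), the following should work for you: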
WITH sample_data AS (SELECT ' the date is 12 November, 2008 11:30 AM' str FROM dual UNION ALL
SELECT 'Here''s a date of 1 March, 2015 1:43 pm' str FROM dual UNION ALL
SELECT '1 February,2016 9:43 AM' str FROM dual UNION ALL
SELECT 'And again it''s 21 May, 2016 9:43 AM and a little bit extra' str FROM dual)
SELECT str,
to_date(regexp_replace(str, '^.*?([[:digit:]]{1,2} [[:alpha:]]{3,9}, ?[[:digit:]]{4} [[:digit:]]{1,2}\:[[:digit:]]{2} (A|P)M).*$', '\1', 1, 1, 'i'), 'dd Month yyyy, hh:mi am') dt
FROM sample_data;
STR                                                        DT
---------------------------------------------------------- -------------------
 the date is 12 November, 2008 11:30 AM                    12/11/2008 11:30:00
Here's a date of 1 March, 2015 1:43 pm                     01/03/2015 13:43:00
1 February,2016 9:43 AM                                    01/02/2016 09:43:00
And again it's 21 May, 2016 9:43 AM and a little bit extra 21/05/2016 09:43:00
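A note on why the attempt in the question raises ORA-01858, with a small sketch of the nesting that avoids it. In the failing query, to_date() is an argument of regexp_replace, so it is evaluated on the literal text '\1-\3-\5 \7 \9' before any backreference is substituted. The sketch below reuses the pattern from the query above on the row1 string from the question; the format mask 'dd Month, yyyy hh:mi am' is my own choice to mirror the text layout, and it assumes the session's date language is English.

```sql
-- Failing shape (from the question): to_date() runs first, on the raw template.
--   regexp_replace(str, pattern,
--                  to_date('\1-\3-\5 \7 \9', 'dd-mon-yyyy HH:MI AM'), ...)
--   -> ORA-01858, because '\' is not a digit

-- Working shape: regexp_replace runs first and returns only the date text;
-- to_date() then converts that text to a DATE.
SELECT to_date(
         regexp_replace(
           'abc changed on 12 November, 2008 11:30 AM and its abc..region1',
           '^.*?([[:digit:]]{1,2} [[:alpha:]]{3,9}, ?[[:digit:]]{4} [[:digit:]]{1,2}\:[[:digit:]]{2} (A|P)M).*$',
           '\1', 1, 1, 'i'),
         'dd Month, yyyy hh:mi am') AS dt   -- assumed mask, not from the original post
FROM dual;
```

How the resulting DATE is displayed depends on the session's NLS_DATE_FORMAT setting.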
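The regular expression can be broken down as follows:
^.*? - matches any characters (except newline), zero or more times, as few as possible, starting at the beginning of the line.
([[:digit:]]{1,2} [[:alpha:]]{3,9}, ?[[:digit:]]{4} [[:digit:]]{1,2}\:[[:digit:]]{2} (A|P)M) - this is the pattern we are looking for, and we replace the entire string with it (it is referenced as \1, which we can pass in the replacement string argument).
.*$ - matches any characters up to the end of the string.
The second part of the pattern can be further broken down as:
[[:digit:]]{1,2} - one or two digits
' ' - a single space character
[[:alpha:]]{3,9} - three to nine letters (upper or lower case)
, ? - a comma followed by 0 or 1 spaces
[[:digit:]]{4} - four digits
' ' - a single space character
[[:digit:]]{1,2} - one or two digits
\: - a single colon character
[[:digit:]]{2} - two digits
' ' - a single space character
(A|P)M - the letter A or P followed by M
If you want to keep the rest of the string and only reformat the date inside it, this should work for you: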
WITH sample_data AS (SELECT 'abc changed on 12 November, 2008 11:30 AM and its abc..region1' str FROM dual UNION ALL
SELECT 'defg updated 14 January, 2012 08:20 PM ......region2' str FROM dual UNION ALL
SELECT 'ghijkl corrected by 18 august, 2013 9:30 AM ..something..region3' str FROM dual)
SELECT str,
regexp_replace(str,
'(^.*?)(([[:digit:]]{1,2}) (January|February|March|April|May|June|July|August|September|October|November|December), ?([[:digit:]]{4} [[:digit:]]{1,2}\:[[:digit:]]{2} (A|P)M))(.*$)',
'\1\3-\4-\5\7', 1, 1, 'i') dt
FROM sample_data;
STR                                                              DT
---------------------------------------------------------------- ----------------------------------------------------------------
abc changed on 12 November, 2008 11:30 AM and its abc..region1   abc changed on 12-November-2008 11:30 AM and its abc..region1
defg updated 14 January, 2012 08:20 PM ......region2             defg updated 14-January-2012 08:20 PM ......region2
ghijkl corrected by 18 august, 2013 9:30 AM ..something..region3 ghijkl corrected by 18-august-2013 9:30 AM ..something..region3
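If what you need in the end is a DATE rather than a reformatted string, one possible variation is to pull out only the date text with regexp_substr and convert that. This is a sketch rather than part of the queries above: it swaps regexp_replace for regexp_substr, and the format mask and the explicit NLS_DATE_LANGUAGE argument are my assumptions so that the English month names parse regardless of session settings.

```sql
WITH sample_data AS (
  SELECT 'abc changed on 12 November, 2008 11:30 AM and its abc..region1' str FROM dual UNION ALL
  SELECT 'defg updated 14 January, 2012 08:20 PM ......region2' str FROM dual UNION ALL
  SELECT 'ghijkl corrected by 18 august, 2013 9:30 AM ..something..region3' str FROM dual)
SELECT str,
       to_date(
         -- regexp_substr returns only the matched date text (NULL if nothing matches)
         regexp_substr(str,
                       '[[:digit:]]{1,2} (January|February|March|April|May|June|July|August|September|October|November|December), ?[[:digit:]]{4} [[:digit:]]{1,2}:[[:digit:]]{2} (A|P)M',
                       1, 1, 'i'),
         'dd Month, yyyy hh:mi am',        -- assumed mask mirroring the text layout
         'NLS_DATE_LANGUAGE=ENGLISH') AS dt -- assumed: month names are English
FROM sample_data;
```

Rows without a matching date yield NULL instead of an error, since to_date(NULL) is NULL. The resulting DATE can then be passed to whatever region-specific function does the final processing.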