I am having trouble converting dates that arrive in multiple formats into one defined format. We receive them from another DB source, so I have no control over the formatting until the data reaches our database.
Here are all the formats:
YYYYMMDD
YYYY-MM-DD HH:MM:SS
MM/DD/YYYY
MM-DD-YYYY
Abbreviated Day Month DD HH:MM:SS TimeZone YYYY ('Thu Feb 02 20:49:59 MSK 2012')
Fully written Day, Month DD, YYYY HH:MM:SS AM/PM
My requirement is to set them all to the standard MM/DD/YYYY format or null. Any ideas?
Thank you.
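Answer 0: (score: 2)
I suggest using a case statement with regexp_like conditions to detect the possible formats, and returning the date with the appropriate date mask in the then clause, for example: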
with tz as (
  -- map each time zone abbreviation to its most frequently listed region name
  select distinct tzabbrev
       , first_value(min(tzname)) over (partition by tzabbrev order by count(*) desc) tzname
  from v$timezone_names
  group by tzabbrev
         , tz_offset(tzname)
), dta as (
  -- replace the first abbreviation found (e.g. MSK) with its region name
  select yt.install_date
       , regexp_replace(yt.install_date, tzabbrev, tzname, 1, 1, 'i') install_date2
  from your_table yt
  left join tz
    on regexp_like(install_date, tz.tzabbrev, 'i')
)
select install_date, install_date2
     , to_timestamp_tz( install_date2
       -- date part of the mask
       , case
           when regexp_like(install_date2,'^[A-Z]{3,} [A-Z]{3,} [0-9]{1,2} [0-9]{1,2}(:[0-9]{2}){1,2} [[:print:]]{5,} [0-9]{2,4}','i') then 'DY MON DD HH24:MI:SS TZR YYYY'
           when regexp_like(install_date2,'^[A-Z]{4,},? [A-Z]{3,},? [0-9]{1,2},? [0-9]{2,4}','i') then 'DAY MONTH DD YYYY'
           when regexp_like(install_date2,'^[A-Z]{3},? [A-Z]{3,},? [0-9]{1,2},? [0-9]{2,4}','i') then 'DY MONTH DD YYYY'
           when regexp_like(install_date2,'^[0-9]{1,2}[-/][0-9]{1,2}[-/]([0-9]{2}){1,2}') then 'MM-DD-RRRR'
           when regexp_like(install_date2,'^[0-9]{1,2}[-/ ][A-Z]{3,}[-/ ]([0-9]{2}){1,2}','i') then 'DD-MON-RRRR'
           when regexp_like(install_date2,'^[A-Z]{3,}[-/ ][0-9]{1,2},?[-/ ]([0-9]{2}){1,2}','i') then 'MON-DD-RRRR'
           when regexp_like(install_date2,'^(19|20)[0-9]{6}') then 'RRRRMMDD'
           -- test the 8-digit patterns before the unanchored 6-digit ones,
           -- otherwise an 8-digit string always matches a 6-digit pattern first
           when regexp_like(install_date2,'^[01][0-9]{7}') then 'MMDDRRRR'
           when regexp_like(install_date2,'^[23][0-9]{7}') then 'DDMMRRRR'
           when regexp_like(install_date2,'^[23][0-9]{5}') then 'DDMMRR'
           when regexp_like(install_date2,'^[0-9]{6}') then 'MMDDRR'
           else null
         end
       -- optional time part of the mask
       ||case
           when regexp_like(install_date2, '[0-9]{1,2}(:[0-9]{2}){1,2}$') then ' HH24:MI:SS'
           when regexp_like(install_date2, '[0-9]{1,2}(:[0-9]{2}){1,2} ?(am|pm)$','i') then ' HH:MI:SS AM'
           else null
         end
       ) install_time_stamp
from dta;
I ran into problems with the time zone abbreviations, so I added a step that first replaces them with time zone region names.
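To make the idea easier to follow, here is a stripped-down sketch of the same detect-then-convert pattern, limited to the three purely numeric formats from the question and producing the requested MM/DD/YYYY string. It reuses your_table and install_date from the query above; the stricter, fully anchored regular expressions are my own assumption about what the data looks like, and a value that passes the regex but is not a valid date (for example a month of 13) would still raise an error.

```sql
-- Sketch only: covers YYYYMMDD, YYYY-MM-DD HH:MM:SS, MM/DD/YYYY and MM-DD-YYYY.
-- Anything that matches no branch falls through to NULL.
select install_date
     , to_char(
         case
           when regexp_like(install_date, '^[0-9]{8}$')
             then to_date(install_date, 'YYYYMMDD')
           when regexp_like(install_date, '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$')
             then to_date(install_date, 'YYYY-MM-DD HH24:MI:SS')
           when regexp_like(install_date, '^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$')
             then to_date(install_date, 'MM-DD-YYYY')
         end
       , 'MM/DD/YYYY') as install_date_fmt
from your_table;
```

The text formats with day and month names can be added as further branches in the same way, using the masks shown in the full query.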
Answer 1: (score: 2)
You can define a conversion function that basically tries each format in turn:
create or replace function translate_date(i_date_string VARCHAR2) return date as
begin
  -- you may optimize this by skipping blocks based on the string format;
  -- order the blocks by the expected frequency of each format
  begin
    return to_date(i_date_string,'yyyymmdd');
  EXCEPTION
    WHEN OTHERS THEN NULL;
  end;
  begin
    return to_date(i_date_string,'yyyy/mm/dd');
  EXCEPTION
    WHEN OTHERS THEN NULL;
  end;
  begin
    return to_date(i_date_string,'yyyy-mm-dd');
  EXCEPTION
    WHEN OTHERS THEN NULL;
  end;
  begin
    return to_date(i_date_string,'yyyy-mm-dd hh24:mi:ss');
  EXCEPTION
    WHEN OTHERS THEN NULL;
  end;
  begin
    -- convert to a local timestamp and then to a date
    return cast(cast(to_timestamp_tz(i_date_string,'dy month dd hh24:mi:ss tzr yyyy') as TIMESTAMP WITH LOCAL TIME ZONE) as date);
  EXCEPTION
    WHEN OTHERS THEN NULL;
  end;
  begin
    return to_date(i_date_string,'dy, month dd, yyyy hh:mi:ss am');
  EXCEPTION
    WHEN OTHERS THEN NULL;
  end;
  -- no format matched
  return NULL;
end;
/
For example, for the sample data
TSTMP
------------------------
20150101
2015-01-01 23:59:59
2015/01/01
2015-01-01
Thu Feb 02 20:49:59 Europe/Moscow 2012
Thu, Feb 02, 2012 10:49:59 AM
Thu, Feb 02, 2012 10:49:59 PM
you get
TSTMP                                      RESULT_DATE
------------------------------------------ -------------------
20150101                                   01.01.2015 00:00:00
2015-01-01 23:59:59                        01.01.2015 23:59:59
2015/01/01                                 01.01.2015 00:00:00
2015-01-01                                 01.01.2015 00:00:00
Thu Feb 02 20:49:59 Europe/Moscow 2012     02.02.2012 17:49:59
Thu, Feb 02, 2012 10:49:59 AM              02.02.2012 10:49:59
Thu, Feb 02, 2012 10:49:59 PM              02.02.2012 22:49:59
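Note that I skipped the case of the time zone abbreviation (MSK). See @Sentinel's answer for a possible solution, but check "Conversion of String with Abbreviated Timezone to Timestamp", because the abbreviation may be ambiguous.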
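To get the MM/DD/YYYY string the question asks for, the function result can be wrapped in to_char. A minimal sketch, assuming the sample strings sit in a hypothetical table sample_dates with a VARCHAR2 column tstmp (both names are mine, not from the question):

```sql
-- sample_dates and tstmp are placeholder names for illustration.
-- translate_date returns NULL when no format matches, and to_char of NULL
-- is NULL, which covers the "or null" part of the requirement.
select tstmp
     , to_char(translate_date(tstmp), 'MM/DD/YYYY') as result_date
from sample_dates;
```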
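Answer 2: (score: 0)
Because to_date tolerates minor deviations (different delimiters, missing delimiters, missing time components, 2- or 4-digit years), the function can be simplified somewhat. For example, to_date(:str,'rrrr-mm-dd hh24:mi:ss') will cover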
2020/09/18 01.02
2020.09.18 01
20200918010203
2020-0901
202009-01
20/09/18
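The strings listed above can be run through that single mask in one query to see how each one is interpreted on your own instance. This is only a quick check, not a guarantee: the same leniency can also make to_date accept a string in a different element order and silently return the wrong date, so it is worth inspecting the output.

```sql
-- The sample strings are taken verbatim from the list above.
with samples (str) as (
  select '2020/09/18 01.02' from dual union all
  select '2020.09.18 01'    from dual union all
  select '20200918010203'   from dual union all
  select '2020-0901'        from dual union all
  select '202009-01'        from dual union all
  select '20/09/18'         from dual
)
select str
     , to_date(str, 'rrrr-mm-dd hh24:mi:ss') as parsed
from samples;
```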
To_timestamp_tz also tolerates missing milliseconds and a missing time zone (including the missing TZ elements TZM and TZD).
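So we basically only need to take care of the major variations, such as hh24 / hh, mm / mon, the order of the elements, the "T" separator in ISO 8601 (2020-01-01T01:02:03), the TZ notation (UTC offset TZH:TZM / region name TZR), and the day of the week (DY).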
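As an illustration of those variations, here is a small sketch with one mask each for the ISO 8601 "T" separator, a UTC offset, and a region name. The input strings with an offset and with a region are my own examples; a literal character such as "T" has to be put in double quotes inside the format mask.

```sql
select to_timestamp('2020-01-01T01:02:03'
                   , 'yyyy-mm-dd"T"hh24:mi:ss') as iso_local
     , to_timestamp_tz('2020-01-01T01:02:03 +03:00'
                      , 'yyyy-mm-dd"T"hh24:mi:ss tzh:tzm') as iso_offset
     , to_timestamp_tz('2020-01-01 01:02:03 Europe/Moscow'
                      , 'yyyy-mm-dd hh24:mi:ss tzr') as with_region
from dual;
```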