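I want to convert a date in the format '03/21/2006 10:00:00 PM' to the TIMESTAMP data type in Hive. The string is in Chicago's time zone, which means daylight saving time applies. I do not want the time converted to UTC in the process.
I tried the following, but unix_timestamp() does not understand the time zone America/Chicago: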
SELECT cast(
from_unixtime(
unix_timestamp('03/21/2006 10:00:00 PM America/Chicago', 'MM/dd/yyyy hh:mm:ss a zzzz')
) AS TIMESTAMP
);
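I could use CDT or CST as the time zone, depending on whether the date falls within daylight saving time, but because the changeover dates differ from year to year, that gets messy.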
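To illustrate the abbreviation approach, here is a minimal sketch. It assumes unix_timestamp() passes the pattern to Java's SimpleDateFormat, so that the pattern letter z accepts a short zone name. March 21, 2006 falls before that year's switch to daylight time on April 2, so CST applies here; a summer date would need CDT instead.

```sql
-- Sketch only: the zone abbreviation is hard-coded and has to match the date.
-- 03/21/2006 is before the 2006 switch to daylight time, hence CST.
SELECT unix_timestamp('03/21/2006 10:00:00 PM CST', 'MM/dd/yyyy hh:mm:ss a z') AS epoch_seconds;
```

Choosing the right abbreviation for every row is exactly the part that gets messy.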
Is there a better way?
I am using Hive 1.1.0.
Thanks for your time.
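Edit
I have been able to achieve my goal with the code below. The solution is neither concise nor elegant, but it gets the job done for now. A UDF could do the same work in a more modular way, but I have no experience writing Hive UDFs yet.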
,from_utc_timestamp(to_utc_timestamp(concat(
regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 3) -- year
,'-'
,regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 1) -- month
,'-'
,regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 2) -- day
,' '
,CASE -- hour, converted from 12-hour to 24-hour form
WHEN regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 6) = 'AM'
AND regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 4) = '12'
THEN '00'
WHEN regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 6) = 'AM'
AND regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 4) <> '12'
THEN regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 4)
WHEN regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 6) = 'PM'
AND regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 4) = '12'
THEN '12'
WHEN regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 6) = 'PM'
AND regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 4) <> '12'
THEN cast(regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 4) AS TINYINT) + 12
ELSE NULL -- should never get here
END
,regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([AP]M)', 5) -- minutes and seconds
)
, 'America/Chicago')
, 'America/Chicago') AS cast_date
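For comparison, here is a more compact sketch that I have not verified on Hive 1.1.0. It assumes that unix_timestamp() and from_unixtime() both use the session's default time zone, so that parsing and re-formatting cancel out and the wall-clock time is preserved without naming a zone. The table name my_table is hypothetical; date is the same string column as above.

```sql
-- Parse the 12-hour string, then re-render it as 'yyyy-MM-dd HH:mm:ss'.
-- Assumption: both functions use the same default time zone, so no net shift occurs.
SELECT cast(
         from_unixtime(
           unix_timestamp(date, 'MM/dd/yyyy hh:mm:ss a')
         ) AS TIMESTAMP
       ) AS cast_date
FROM my_table;  -- hypothetical table name
```

A wall-clock time that does not exist in the session's zone (inside a spring-forward gap) may still be shifted, so this relies on the session zone either matching the data or having no daylight saving time.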