Apache PIG - Get only the date from a TimeStamp

Time: 2016-07-25 12:54:43

Tags: date datetime apache-pig converter

I have the following code:

Source_Data = load '/user/cloudera/' using PigStorage('\t') 
as
(   ID:chararray, 
    Time_Interval:chararray, 
    Code:chararray); 

transf = foreach Source_Data generate  (int) ID, 
                                   ToString( ToDate((long) Time_Interval), 'yyyy-MM-dd hh:ss:mm') as TimeStamp,
                        (int) Code; 

SPLIT transf INTO       Src25 IF (ToString(TimeStamp, 'yyyy-MM-dd')=='2016-07-25'),
                        Src26 IF (ToString(TimeStamp, 'yyyy-MM-dd')=='2016-07-26');


STORE Src25 INTO '/user/cloudera/2016-07-25' using PigStorage('\t');
STORE Src26 INTO '/user/cloudera/2016-07-26' using PigStorage('\t');

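I want to split the file by date using the rules I put in the SPLIT statement, but this gives me an error...

How can I convert TimeStamp to a Date (in the transf statement) so that I can do the comparison?

Thanks a lot!

1 Answer:

Answer 0: (score: 1)

Once ToDate has given you a datetime object, call GetYear(), GetMonth() and GetDay() on that datetime object and use CONCAT to build a date-only string.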

-- Keep TimeStamp as a datetime: GetYear(), GetMonth() and GetDay() need a datetime,
-- not the chararray that ToString returns
transf = foreach Source_Data generate  
                   (int) ID as ID, 
                   ToDate((long) Time_Interval) as TimeStamp,
                   (int) Code as Code;

-- GetYear/GetMonth/GetDay return ints, so cast each one to chararray before CONCAT
transf_new = foreach transf generate
                     ID,
                     TimeStamp,
                     CONCAT(CONCAT(CONCAT(CONCAT((chararray) GetYear(TimeStamp), '-'), (chararray) GetMonth(TimeStamp)), '-'), (chararray) GetDay(TimeStamp)) AS Day, -- Note: not zero-padded, e.g. '2016-7-25'
                     Code;

-- Now use the new Day column to split the data (compare against the unpadded form)
SPLIT transf_new INTO       Src25 IF (Day == '2016-7-25'),
                            Src26 IF (Day == '2016-7-26');
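
An alternative to assembling the date from its parts is to let ToString format the datetime directly. The sketch below is only an illustration under a few assumptions: Time_Interval holds epoch milliseconds (which the (long) cast passed to ToDate suggests), and the load alias is Source_Data. With the 'yyyy-MM-dd' pattern the month and day come out zero-padded, so a plain string comparison against '2016-07-25' should work. Which calendar day a timestamp falls on depends on the time zone Pig applies to datetime values.

```pig
-- Assumption: Time_Interval is epoch milliseconds stored as text.
Source_Data = LOAD '/user/cloudera/' USING PigStorage('\t')
              AS (ID:chararray, Time_Interval:chararray, Code:chararray);

-- Keep TimeStamp as a datetime and derive Day as a 'yyyy-MM-dd' string.
transf = FOREACH Source_Data GENERATE
             (int) ID AS ID,
             ToDate((long) Time_Interval) AS TimeStamp,
             ToString(ToDate((long) Time_Interval), 'yyyy-MM-dd') AS Day,
             (int) Code AS Code;

-- Day is a chararray, so it can be compared with a string literal.
SPLIT transf INTO Src25 IF (Day == '2016-07-25'),
                  Src26 IF (Day == '2016-07-26');

STORE Src25 INTO '/user/cloudera/2016-07-25' USING PigStorage('\t');
STORE Src26 INTO '/user/cloudera/2016-07-26' USING PigStorage('\t');
```

The original SPLIT most likely failed because TimeStamp had already been turned into a chararray by ToString, and ToString with a format pattern expects a datetime as its first argument.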