unix_timestamp behaves incorrectly in Hive

Time: 2017-10-26 13:51:04

Tags: hive unix-timestamp

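I have the following 2 insert queries in a single HQL script. They insert a few fields from the table "Final_table_temp" into Final_table1 and Final_table2. Both target tables have exactly the same structure.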

insert into Final_table1
PARTITION(event_date,service_id)
select from_unixtime(unix_timestamp(event_timestamp ,'yyyy-MM-dd HH'), 'yyyy-MM-dd HH:00:00.0'),
from_unixtime(unix_timestamp(event_timestamp ,'yyyy-MM-dd HH'), 'yyyy-MM-dd') as event_date,
service_id 
from Final_table_temp;

insert into Final_table2
PARTITION(event_date,service_id)
select from_unixtime(unix_timestamp(event_timestamp ,'yyyy-MM-dd HH'), 'yyyy-MM-dd HH:00:00.0'),
from_unixtime(unix_timestamp(event_timestamp ,'yyyy-MM-dd HH'), 'yyyy-MM-dd') as event_date,
service_id 
from Final_table_temp;

event_timestamp value in Final_table_temp:

2017-10-26 22

event_timestamp value in Final_table1:

2017-10-26 22:00:00.000

event_timestamp value in Final_table2:

2017-10-26 21:00:00.000

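Please help me understand why the value changes in Final_table2. It should be the same as in Final_table1, since the two queries are identical and both read from the same source.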

hive> desc extended Final_table1;
OK
event_timestamp         timestamp
event_date              date
service_id              int

# Partition Information
# col_name              data_type               comment

event_date              date
service_id              int

Detailed Table Information      Table(tableName:Final_table1, dbName:rwdb, owner:hdfs, createTime:1496391931, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:event_timestamp, type:timestamp, comment:null), FieldSchema(name:event_date, type:date, comment:null), FieldSchema(name:service_id, type:int, comment:null)], location:hdfs://R333:8020/user/hive/warehouse/rwdb.db/Final_table1, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{field.delim=,, serialization.format=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:event_date, type:date, comment:null), FieldSchema(name:service_id, type:int, comment:null)], parameters:{transient_lastDdlTime=1496391931}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)

=======================================================
hive> desc extended Final_table2;
OK
event_timestamp         timestamp
event_date              date
service_id              int

# Partition Information
# col_name              data_type               comment

event_date              date
service_id              int

Detailed Table Information      Table(tableName:Final_table2, dbName:rwdb, owner:hdfs, createTime:1509000492, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:event_timestamp, type:timestamp, comment:null), FieldSchema(name:event_date, type:date, comment:null), FieldSchema(name:service_id, type:int, comment:null)], location:hdfs://R333:8020/user/hive/warehouse/rwdb.db/Final_table2, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{field.delim=,, serialization.format=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:event_date, type:date, comment:null), FieldSchema(name:service_id, type:int, comment:null)], parameters:{transient_lastDdlTime=1509000492}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)

1 Answer:

Answer 0: (score: 0)

Replace

  'yyyy-MM-dd HH:00:00.0' with 'yyyy-MM-dd HH:mm:ss.S'
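
As a minimal sketch, the first insert from the question would look like this with the suggested pattern applied. The table and column names come from the question; nothing else is assumed. The answer does not say why this should change the result. Note that both unix_timestamp and from_unixtime interpret values in the session's default time zone, so this sketch shows the substitution only and is not a confirmed fix for the one-hour difference.

```sql
-- Sketch only: same statement as in the question, with the output
-- pattern replaced as the answer suggests. Names are taken from the question.
insert into Final_table1
PARTITION(event_date, service_id)
select from_unixtime(unix_timestamp(event_timestamp, 'yyyy-MM-dd HH'), 'yyyy-MM-dd HH:mm:ss.S'),
       from_unixtime(unix_timestamp(event_timestamp, 'yyyy-MM-dd HH'), 'yyyy-MM-dd') as event_date,
       service_id
from Final_table_temp;
```

The same substitution would apply to the insert into Final_table2.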