FROM_UTC_TIMESTAMP gives a NullPointerException when converting a date to EST

Asked: 2015-12-08 11:11:05

Tags: hive hdfs hiveql

I am trying to convert a date string to the EST/EDT time zone. I have a table defined as follows:

CREATE EXTERNAL TABLE IF NOT EXISTS myschema.timeZoneCheck1 (timeCol String)
row format delimited fields terminated by ','
lines terminated by '\n' 
LOCATION '/users/TestTimeZone/'
tblproperties ("skip.header.line.count"="1");

and the data in the HDFS file looks like this:

timeZoneDateCol
2015/12/08
2015/07/06

This is the required output:

If the date is 2015/12/08, I should get a value like 2015120811 (since it falls under EST).

If the date is 2015/07/06, I should get a value like 2015070610 (since it falls under EDT).

This is what I tried with the Hive function:

Select timeCol,FROM_UTC_TIMESTAMP(timeCol, 'EST') from myschema.timeZoneCheck1;

Here is the stack trace of the error:

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"timecol":"2015/12/08"}
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"timecol":"2015/12/08"}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
        ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating Converting field timecol from UTC to timezone: 'EST'
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
        ... 9 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFFromUtcTimestamp.evaluate(GenericUDFFromUtcTimestamp.java:90)
        at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
        at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
        at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
        ... 13 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

1 Answer:

Answer 0 (score: -1):

Have you read the documentation for that function carefully? Have you ever used the binary DATE and TIMESTAMP Hive data types?

OK, those were rhetorical questions. In short, Hive recognizes date and time literals (i.e. strings containing a date/time representation) only in ODBC format:

  • '2015-12-31' can be compared against a DATE (implicit conversion)
  • date '2015-12-31' is more explicit, and more or less SQL standard
  • '2015-12-31 23:59:59' can be compared against a TIMESTAMP (implicit conversion)

I could have said ISO instead of ODBC, but strictly speaking ISO 8601 puts a "T" between the date and the time; the space used by ODBC is easier to read and more popular.
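
For illustration (my own sketch, not part of the original answer; some_dates, dt_col and ts_col are made-up names), here is how those three forms look in a query:

-- ODBC-style string literal, implicitly cast to DATE for the comparison
select * from some_dates where dt_col = '2015-12-31';

-- explicit DATE literal, closer to the SQL standard
select * from some_dates where dt_col = date '2015-12-31';

-- string literal with a time part, implicitly cast to TIMESTAMP
select * from some_dates where ts_col = '2015-12-31 23:59:59';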

So you just have to reformat your outlandish date format into an ISO/ODBC date, paste in a dummy "midnight" time, and voilà:

select OUTLANDISH_DATE,
  from_utc_timestamp(concat(regexp_replace(OUTLANDISH_DATE, '/', '-'), ' 00:00:00'), 'EST') as ZONED_TIMESTAMP
from WHATEVER

-- outlandish_date  zoned_timestamp
-- 2015/12/08       2015-12-07 19:00:00
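
If the yyyyMMddHH value from the question is really what is needed, a possible follow-up (my own sketch, not part of the original answer, assuming Hive 1.2+ where date_format is available, with ZONED_KEY as a made-up alias) is to format the converted timestamp with a SimpleDateFormat pattern. Note that the question's examples switch between EST and EDT; the fixed 'EST' zone id is always UTC-5, whereas 'America/New_York' applies daylight saving automatically:

select OUTLANDISH_DATE,
  date_format(
    from_utc_timestamp(concat(regexp_replace(OUTLANDISH_DATE, '/', '-'), ' 00:00:00'),
                       'America/New_York'),
    'yyyyMMddHH') as ZONED_KEY
from WHATEVER

-- outlandish_date  zoned_key
-- 2015/12/08       2015120719   (midnight UTC is 19:00 the previous day in EST)
-- 2015/07/06       2015070520   (midnight UTC is 20:00 the previous day in EDT)

These values differ from the 2015120811 / 2015070610 shown in the question because the input holds only a date, so it is the dummy midnight time that gets shifted; with a real time-of-day in the source data the same expression would produce the hour the question expects.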