Unable to use unix timestamp column type with string data type

Asked: 2017-03-07 09:31:10

Tags: json hive timestamp unix-timestamp

I have a Hive table that loads JSON data. My JSON has two values, both with data type string. If I keep them as bigint, then a select on this table fails with the error below:

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
 at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]

If I change both of them to string, then it works fine.

Now, because these columns are strings, I am not able to use the from_unixtime function on them.

If I try to change the data type of these columns from string to bigint, I get the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp
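
The ALTER statement that triggers this error is not shown in the question; a hypothetical reconstruction, using the table and column names from the DDL below, would look like this:

```sql
-- Hypothetical reconstruction of the attempted type change (not shown in
-- the question). The metastore rejects it because changing an existing
-- column from string to bigint is not an allowed type conversion.
alter table ABC change uploadTimeStamp uploadTimeStamp bigint;
```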

Below is my create table statement:

create table ABC
(
    uploadTimeStamp bigint
   ,PDID            string

   ,data            array
                    <
                        struct
                        <
                            Data:struct
                            <
                                unit:string
                               ,value:string
                               ,heading:string
                               ,loc:string
                               ,loc1:string
                               ,loc2:string
                               ,loc3:string
                               ,speed:string
                               ,xvalue:string
                               ,yvalue:string
                               ,zvalue:string
                            >
                           ,Event:string
                           ,PDID:string
                           ,`Timestamp`:string
                           ,Timezone:string
                           ,Version:string
                           ,pii:struct<dummy:string>
                        >
                    >
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile;

My JSON:

{"uploadTimeStamp":"1488793268598","PDID":"123","data":[{"Data":{"unit":"rpm","value":"100"},"EventID":"E1","PDID":"123","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"Data":{"heading":"N","loc":"false","loc1":"16.032425","loc2":"80.770587","loc3":"false","speed":"10"},"EventID":"Location","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.1","pii":{}},{"Data":{"xvalue":"1.1","yvalue":"1.2","zvalue":"2.2"},"EventID":"AccelerometerInfo","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"EventID":"FuelLevel","Data":{"value":"50","unit":"percentage"},"Version":"1.0","Timestamp":1488793268598,"PDID":"skga06031430gedvcl1pdid2367","Timezone":330},{"Data":{"unit":"kmph","value":"70"},"EventID":"VehicleSpeed","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}}]}

Can I convert this string unix timestamp to standard time, or can I make these columns work as bigint?

1 Answer:

Answer 0 (score: 0)

  1. If you are talking about Timestamp and Timezone, then you can define them as int / bigint types.
    If you check their definitions, you will see that there are no qualifiers (") around the values, so they are numeric types within the JSON document:

    "Timestamp":1488793268598,"Timezone":330

  2. create external table myjson
    (
        uploadTimeStamp string
       ,PDID            string
    
       ,data            array
                        <
                            struct
                            <
                                Data:struct
                                <
                                    unit:string
                                   ,value:string
                                   ,heading:string
                                   ,loc3:string
                                   ,loc:string
                                   ,loc1:string
                                   ,loc4:string
                                   ,speed:string
                                   ,x:string
                                   ,y:string
                                   ,z:string
                                >
                               ,EventID:string
                               ,PDID:string
                               ,`Timestamp`:bigint
                               ,Timezone:smallint
                               ,Version:string
                               ,pii:struct<dummy:string>
                            >
                        >
    )
    row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
    stored as textfile
    location '/tmp/myjson'
    ;
    
    +------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | myjson.uploadtimestamp | myjson.pdid |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         myjson.data                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
    +------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |          1486631318873 |         123 | [{"data":{"unit":"rpm","value":"0","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E1","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":"N","loc3":"false","loc":"14.022425","loc1":"78.760587","loc4":"false","speed":"10","x":null,"y":null,"z":null},"eventid":"E2","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.1","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":"1.1","y":"1.2","z":"2.2"},"eventid":"E3","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":"percentage","value":"50","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E4","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":null},{"data":{"unit":"kmph","value":"70","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E5","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}}] |
    +------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    
    3. Even though you defined `Timestamp` as string, you can still cast it to bigint before using it in functions that require a bigint argument:

      cast (`Timestamp` as bigint)

    4. hive> with t as (select '0' as `timestamp`) select from_unixtime(`timestamp`) from t;
      
        

      FAILED: SemanticException [Error 10014]: Line 1:45 Wrong arguments 'timestamp': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible choices: _FUNC_(bigint)  _FUNC_(bigint, string)  _FUNC_(int)  _FUNC_(int, string)

      hive> with t as (select '0' as `timestamp`) select from_unixtime(cast(`timestamp` as bigint)) from t;
      OK
      1970-01-01 00:00:00
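
One caveat worth noting (my observation, not part of the original answer): the sample values such as 1488793268598 have 13 digits, which suggests epoch milliseconds, while from_unixtime expects epoch seconds. If that is the case, the value should be divided by 1000 before converting:

```sql
-- Assuming the values are epoch milliseconds (an assumption based on the
-- 13-digit values in the question's JSON), convert to seconds before
-- calling from_unixtime. The resulting string depends on the server's
-- time zone setting.
select from_unixtime(cast(cast('1488793268598' as bigint) / 1000 as bigint));
```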