Need a Hive query to parse a JSON file

Time: 2015-07-02 09:39:32

Tags: json hive

A sample JSON file is shown below. From it I need version, overrideable, scenario, repairType, rank and notificationType.

Please suggest a Hive query that does not require adding any new jar.

 {
    "channelOutcome": {
        "MG": {
            "repairStrategies": [
                {
                    "scenario": "1",
                    "repairType": "ISR",
                    "rank": 1,
                    "notificationType": "Z5"
                },
                {
                    "scenario": "1",
                    "repairType": "SER",
                    "rank": 2,
                    "notificationType": "NO"
                },
                {
                    "scenario": "1",
                    "repairType": "ACC",
                    "rank": 3,
                    "notificationType": "Z5"
                },
                {
                    "scenario": "1",
                    "repairType": "SWP",
                    "rank": 4,
                    "notificationType": "Z5"
                },
                {
                    "scenario": "4",
                    "repairType": "RMS",
                    "rank": 5,
                    "notificationType": "Z8"
                }
            ],
            "overrideable": false
        }
    },
    "keyValues": [],
    "version": 2.3
 }

1 answer:

Answer 0 (score: 0)

Query: no external jar files are used :)

SELECT
  three.version,
  three.overrideable,
  get_json_object(three.strategy,'$.scenario') AS scenario,
  get_json_object(three.strategy,'$.repairType') AS repairType,
  get_json_object(three.strategy,'$.rank') AS rank,
  get_json_object(three.strategy,'$.notificationType') AS notificationType
FROM
(
 -- Step 4: explode the array so that each strategy object becomes its own row
 SELECT s.version, s.overrideable, exploded.strategy
 FROM
 (
  -- Step 3: split the pipe-delimited string into an array of JSON objects
  SELECT two.version AS version,
         two.overrideable AS overrideable,
         split(two.repairStrategies,"\\|") AS rs_array
  FROM
  (
    -- Step 2: strip the enclosing [ ] and turn each },{ boundary into }|{
    -- (assumes no value contains [, ], | or the sequence },{)
    SELECT one.version,
           one.overrideable AS overrideable,
           regexp_replace(regexp_replace(one.repairStrategies,'\\[|\\]',''),'\\}\\,\\{','\\}\\|\\{') AS repairStrategies
    FROM (
          -- Step 1: pull the top-level fields and the raw strategies array out of the JSON line
          SELECT get_json_object(helper_json.line,'$.version') AS version,
                 get_json_object(helper_json.line,'$.channelOutcome.MG.overrideable') AS overrideable,
                 get_json_object(helper_json.line,'$.channelOutcome.MG.repairStrategies') AS repairStrategies
          FROM helper_json
    ) one
  ) two
 ) s LATERAL VIEW explode(s.rs_array) exploded AS strategy
) three;

where helper_json has the following schema.

hive (vijay)> describe helper_json;
OK
line                    string                  None
Time taken: 0.056 seconds, Fetched: 1 row(s)
hive (vijay)> select * from helper_json;
OK
{"channelOutcome":{"MG":{"repairStrategies":[{"scenario":"1","repairType":"ISR","rank":1,"notificationType":"Z5"},{"scenario":"1","repairType":"SER","rank":2,"notificationType":"NO"},{"scenario":"1","repairType":"ACC","rank":3,"notificationType":"Z5"},{"scenario":"1","repairType":"SWP","rank":4,"notificationType":"Z5"},{"scenario":"4","repairType":"RMS","rank":5,"notificationType":"Z8"}],"overrideable":false}},"keyValues":[],"version":2.3}
Time taken: 0.144 seconds, Fetched: 1 row(s)
hive (vijay)>
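
Note that the table holds the whole JSON document on a single line, whereas the sample in the question is pretty-printed across many lines. A text-format Hive table treats every newline as a row boundary, so the file has to be flattened to one document per line before loading. A minimal sketch of how such a table might be set up follows; the file path is a hypothetical placeholder, not something taken from the original setup.

```sql
-- Hypothetical setup: the table name matches the answer, the path is an assumption.
-- A single STRING column receives each input line as-is.
CREATE TABLE IF NOT EXISTS helper_json (line STRING)
STORED AS TEXTFILE;

-- The file must contain one complete JSON document per line.
LOAD DATA LOCAL INPATH '/tmp/sample_single_line.json' INTO TABLE helper_json;
```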

Output: included here to give a better idea of what the query returns.

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201503240233_5513, Tracking URL = http://dragon1:50030/jobdetails.jsp?jobid=job_201503250213_4613
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201503240233_5513
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-07-04 05:06:51,144 Stage-1 map = 0%,  reduce = 0%
2015-07-04 05:06:56,178 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.5 sec
2015-07-04 05:06:57,184 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.5 sec
2015-07-04 05:06:58,191 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 1.5 sec
MapReduce Total cumulative CPU time: 1 seconds 500 msec
Ended Job = job_201503250213_4613
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 1.5 sec   HDFS Read: 667 HDFS Write: 105 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 500 msec
OK
version overrideable    scenario        repairtype      rank    notificationtype
2.3     false   1       ISR     1       Z5
2.3     false   1       SER     2       NO
2.3     false   1       ACC     3       Z5
2.3     false   1       SWP     4       Z5
2.3     false   4       RMS     5       Z8
Time taken: 15.831 seconds, Fetched: 5 row(s)
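
The same result can probably be obtained with less nesting by chaining two lateral views and letting the built-in json_tuple UDTF extract all four fields in one pass. This is an untested sketch, not the query that produced the output above. It assumes a Hive version that ships json_tuple, and the aliases h, e, j and rnk are illustrative names only. It is also limited in the same way as the original: no value may contain [, ], | or the sequence },{.

```sql
SELECT
  get_json_object(h.line, '$.version')                       AS version,
  get_json_object(h.line, '$.channelOutcome.MG.overrideable') AS overrideable,
  j.scenario,
  j.repairType,
  j.rnk AS rank,
  j.notificationType
FROM helper_json h
-- Strip the brackets, mark object boundaries with |, split, and emit one row per object.
LATERAL VIEW explode(
  split(
    regexp_replace(
      regexp_replace(
        get_json_object(h.line, '$.channelOutcome.MG.repairStrategies'),
        '\\[|\\]', ''),
      '\\}\\,\\{', '\\}\\|\\{'),
    '\\|')
) e AS strategy
-- json_tuple parses each strategy object once and returns the requested keys as columns.
LATERAL VIEW json_tuple(e.strategy, 'scenario', 'repairType', 'rank', 'notificationType') j
  AS scenario, repairType, rnk, notificationType;
```

Compared with four separate get_json_object calls, json_tuple parses each strategy string only once, which may matter on larger inputs.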