我在Cloudera-Training-VM-4.1.1.c上使用hive 0.9.0版本。
我在SELECT
查询中使用hive变量时遇到问题。请看看,让我知道我哪里出错。
1)这是我的名为rushi_target
hive> desc rushi_target;
OK
load_ts timestamp
id int
name string
loc string
data_dt string
Time taken: 0.154 seconds
2)这是数据
hive> select * from rushi_target;
OK
2015-12-01 00:02:34 1 rushi pune 2015-12-01
2015-12-02 04:02:34 2 komal pune 2015-12-02
2015-12-03 00:03:34 3 bhanu bangalore 2015-12-03
2015-12-04 00:03:34 4 sachin pune 2015-12-04
Time taken: 0.258 seconds
3)设置hive变量
hive> set maxtimevar= select max(load_ts) from rushi_target;
4)显示hive变量的值
hive> ${hiveconf:maxtimevar};
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201512211113_0011, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201512211113_0011
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=0.0.0.0:8021 -kill job_201512211113_0011
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2015-12-22 08:51:35,401 Stage-1 map = 0%, reduce = 0%
2015-12-22 08:51:37,409 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.83 sec
2015-12-22 08:51:38,416 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.83 sec
2015-12-22 08:51:39,423 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.43 sec
2015-12-22 08:51:40,429 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.43 sec
2015-12-22 08:51:41,439 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.43 sec
MapReduce Total cumulative CPU time: 1 seconds 430 msec
Ended Job = job_201512211113_0011
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 1.43 sec HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 430 msec
OK
2015-12-04 00:03:34
Time taken: 8.632 seconds
5)为什么这不起作用?
hive> select * from rushi_target where load_ts= ${hiveconf:maxtimevar};
FAILED: ParseException line 1:42 cannot recognize input near 'select' 'max' '(' in expression specification
6)这不会抛出任何错误但也没有输出:
hive> select * from rushi_target where load_ts = '${hiveconf:maxtimevar}';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201512211113_0025, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201512211113_0025
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=0.0.0.0:8021 -kill job_201512211113_0025
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-12-22 10:16:56,666 Stage-1 map = 0%, reduce = 0%
2015-12-22 10:16:58,675 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.39 sec
2015-12-22 10:16:59,686 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.39 sec
2015-12-22 10:17:00,696 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 0.39 sec
MapReduce Total cumulative CPU time: 390 msec
Ended Job = job_201512211113_0025
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 0.39 sec HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 390 msec
OK
Time taken: 6.568 seconds
hive>
我想在maxtimevar
查询中使用hive变量SELECT
。请建议如何使用。
答案 0 :(得分:0)
Hive变量用法是传递值而不是查询。
您的查询select * from rushi_target where load_ts = '${hiveconf:maxtimevar}';
正在评估select * from rushi_target where load_ts = 'select max(load_ts) from rushi_target'
。它正在查找带有load_ts作为查询的数据,显然没有带有这些值的数据。
这一项从5开始,select * from rushi_target where load_ts= ${hiveconf:maxtimevar};
将评估为select * from rushi_target where load_ts= select max(load_ts) from rushi_target;
。这不是有效的查询,因此它失败并出现语法错误。
您需要像这样编写查询。
select rt.* from rushi_target rt
join
(select max(load_ts) max_load_ts from rushi_target) rt1
on 1=1
where rt.load_ts = rt1.max_load_ts;