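I have a Hive table partitioned by year, month, day, and hour. I need to run a query against it to fetch the last 7 days of data. This is Hive 0.14.0.2.2.4.2-2. My query currently looks like this: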
SELECT COUNT(column_name) from table_name
where year >= year(date_sub(from_unixtime(unix_timestamp()), 7))
AND month >= month(date_sub(from_unixtime(unix_timestamp()), 7))
AND day >= day(date_sub(from_unixtime(unix_timestamp()), 7));
This takes a very long time. When I substitute the actual numbers for the expressions above, say:
SELECT COUNT(column_name) from table_name
where year >= 2017
AND month >= 2
AND day >= 13
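it finishes in a few minutes. Is there a way to change the script above so that the query actually contains only the numbers rather than the functions?
I tried using set, like: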
set yearLimit = year(date_sub(from_unixtime(unix_timestamp()), 7));
SELECT COUNT(column_name) from table_name
where year >= ${hiveconf:yearLimit}
AND month >= month(date_sub(from_unixtime(unix_timestamp()), 7))
AND day >= day(date_sub(from_unixtime(unix_timestamp()), 7));
But that did not solve the problem.
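
For what it's worth, set in Hive only stores the text of the expression and substitutes it verbatim into the query, so the function calls still end up in the WHERE clause. One workaround is to compute the numbers outside Hive and hand it a query that contains only literals. Below is a minimal sketch in Python, assuming the partition columns are numeric as in the query above; the function names partition_predicate and build_query are made up for illustration. It lists each day explicitly instead of using three independent >= comparisons, because those comparisons skip days when the 7-day window crosses a month or year boundary (for example, day >= 23 would exclude March 1).

```python
from datetime import date, timedelta


def partition_predicate(today, days=7):
    """Return a literal predicate covering `today` and the `days` days before it."""
    clauses = []
    # Walk from the oldest day to today so the output is in chronological order.
    for offset in range(days, -1, -1):
        d = today - timedelta(days=offset)
        clauses.append(
            "(year = %d AND month = %d AND day = %d)" % (d.year, d.month, d.day)
        )
    return " OR ".join(clauses)


def build_query(today, days=7):
    """Build the full COUNT query with only literal partition values."""
    return (
        "SELECT COUNT(column_name) FROM table_name WHERE "
        + partition_predicate(today, days)
        + ";"
    )


if __name__ == "__main__":
    # Print the query so it can be passed to the Hive CLI.
    print(build_query(date.today()))
```

If this were saved as build_query.py (a hypothetical file name), the generated text could be passed to the CLI with something like hive -e "$(python build_query.py)". Whether this brings the runtime back to a few minutes on this Hive version is something I would still check with EXPLAIN.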