Question

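I am reading some Hive scripts written by another team at my company and cannot understand one specific part. The part in question is where dt='${product_dt}', which is the third line from the bottom of the code block below.

I have never seen this syntax before, and I could not find anything through Google (probably because I don't know the right search terms). Any insight into what this where filter does would be appreciated.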

set hive.security.authorization.enabled=false;
add jar /opt/mobiletl/prod_workflow_dir/lib/hiveudf_hash.jar;
create temporary function hash_string as 'HashString';

drop table 00_truthset_product_email_uid_pid;
create table 00_truthset_product_email_uid_pid as
select distinct email,        
       concat_ws('|', hash_string(lower(email), "SHA-1"),
                      hash_string(lower(email), "MD5"),
                      hash_string(upper(email), "SHA-1"),
                      hash_string(upper(email), "MD5")) as hashed_email,
       uid, address_id, confidencescore
from product.prod_vintages
where dt='${product_dt}'
      and email is not null and email != ''
      and address_id is not null and address_id != '';

I tried set product_dt = 2014-12; but it does not seem to work:

hive> SELECT dt FROM enabilink.prod_vintages GROUP BY dt LIMIT 10;
. . .
dt
2014-12
2015-01
2015-02
2015-03
2015-05
2015-07
2015-10
2016-01
2016-02
2016-03

hive> set product_dt = 2014-12;

hive> SELECT email FROM product.prod_vintages WHERE dt='${product_dt}';
. . .
Total MapReduce CPU Time Spent: 2 seconds 570 msec
OK
email
Time taken: 25.801 seconds

Answer 1

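These are variables set in Hive. If you set a variable before the query (in the same session), Hive replaces the ${...} reference with the specified value.

For example:

set product_dt=03-11-2012;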
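To make the substitution explicit, here is a minimal sketch that assumes Hive's documented variable namespaces: a plain set name=value; stores a configuration property that is read back with ${hiveconf:name}, while set hivevar:name=value; defines a Hive variable that can be referenced as ${hivevar:name} or as the bare ${name}. Behaviour may differ between Hive versions. How the other team actually supplies product_dt is not shown in the question, so the command-line invocation in the last comment (including the file name script.hql) is a guess.

```sql
-- Option 1: define a Hive variable in the session (no spaces around '=').
set hivevar:product_dt=2014-12;
SELECT email FROM product.prod_vintages WHERE dt='${hivevar:product_dt}';

-- The bare form used in the script should resolve to the same Hive variable.
SELECT email FROM product.prod_vintages WHERE dt='${product_dt}';

-- Option 2: a plain set stores a configuration property;
-- reference it through the hiveconf namespace.
set product_dt=2014-12;
SELECT email FROM product.prod_vintages WHERE dt='${hiveconf:product_dt}';

-- Print the value currently stored under that name.
set product_dt;

-- Option 3 (assumed workflow, hypothetical file name): pass the value
-- when launching the script from the shell:
--   hive --hivevar product_dt=2014-12 -f script.hql
```

If the query still returns no rows, printing the variable with set product_dt; is a quick way to confirm what value, if any, Hive will substitute.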

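Edit: Make sure to remove any whitespace from the dt field (use the trim UDF). Also, set the variable without spaces around the equals sign.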
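The whitespace check suggested in the edit could be sketched as follows. It reuses the table and column from the question and assumes dt is a string column. The bracketed output is only a way to make leading or trailing spaces visible.

```sql
-- Show each distinct dt value between brackets, with its length,
-- so that stray whitespace (e.g. '[2014-12 ]', length 8) stands out.
SELECT DISTINCT concat('[', dt, ']') AS dt_bracketed,
                length(dt)           AS dt_length
FROM product.prod_vintages;

-- Compare against the trimmed value instead of the raw one.
SELECT email
FROM product.prod_vintages
WHERE trim(dt) = '${product_dt}';
```

If dt is a partition column, wrapping it in trim() may prevent partition pruning, so this form is better suited to diagnosis than to the production script.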

Hive syntax: the purpose of curly braces and the dollar sign