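I use the source command in Hive to run an external file containing many Hive UDFs (pure SQL, such as date conversions). The external file is shared by many scripts, so it is easier to maintain outside the individual scripts.
So, if I have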
source /tmp/udfs.hql;
select * from tmp1
and run it from the command line, i.e.
hive -e "......."
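it works fine.
Of course, if I try to do this in Oozie or any other non-CLI client, it fails, because source is a CLI command.
Now the question: how can I replicate this functionality outside the CLI? In other words, how can I execute the source command from within a Hive query?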
Answer 0 (score: 1)
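A clumsy workaround: have a Shell Action that runs before the Hive Action concatenate the shared scripts with the action-specific script and upload the result to HDFS under a name derived from the Oozie workflow ID. The Hive Action then runs that single file.
Sample shell code: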
typeset CurrentJobInfo CurrentJobId TargetHiveScript

# when running inside an Oozie Shell Action, read the workflow ID from the job name in the action config
if [[ "$CONTAINER_ID" != "" && "$OOZIE_ACTION_CONF_XML" != "" ]]
then
  CurrentJobInfo=$(/bin/sed -n '/<name>mapreduce.job.name<\/name>/ { N ; s/^.*<value>oozie:action:/:/ ; s/<\/value>.*$/:/ ; p}' "$OOZIE_ACTION_CONF_XML")
  CurrentJobId=$(/bin/echo "$CurrentJobInfo" | /bin/sed -n '/:ID=[^:]*:/ { s/^.*ID=// ; s/:.*$// ; p }')
fi
if [[ "$CurrentJobId" == "" ]]
then
  /bin/echo "ERROR - could not find Oozie Job ID in expected XML config file" 1>&2
  exit 255
fi

TargetHiveScript="/user/johndoe/temp/${CurrentJobId}-DummyHiveAction.hql"

# all these ".hql" scripts are assumed to be available in the CWD thanks to <file> elements in Oozie Shell Action
# concatenate them in order and stream the result straight into HDFS ("-" reads from stdin, "-f" overwrites)
/bin/cat common.hql common.DummyApp.hql DummyHiveAction.hql | /usr/bin/hdfs dfs -put -f - "$TargetHiveScript"
if [[ $? -ne 0 ]]
then
  /bin/echo "ERROR - could not upload Hive script" 1>&2
  exit 255
fi
exit 0
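Since the ID extraction is the fragile part, it may help to try the two sed commands against a hand-written file before deploying. The following is only a sketch: the function name extract_oozie_job_id, the sample XML and the workflow ID in it are all made up for illustration, and it assumes that the <name> and <value> elements sit on consecutive lines and that the job name has the form oozie:action:T=...:W=...:A=...:ID=..., which is what the sed expressions above expect.

```shell
#!/bin/bash
# Sketch: the two sed steps from the script above, wrapped in a function so they
# can be exercised outside Oozie. The function name is hypothetical.
extract_oozie_job_id() {
    local conf_file="$1" job_info
    # Step 1: join the <name> line with the following <value> line and keep only
    # the ":T=...:W=...:A=...:ID=...:" part of the job name.
    job_info=$(sed -n '/<name>mapreduce.job.name<\/name>/ { N ; s/^.*<value>oozie:action:/:/ ; s/<\/value>.*$/:/ ; p ; }' "$conf_file")
    # Step 2: keep only what sits between "ID=" and the next colon.
    printf '%s\n' "$job_info" | sed -n '/:ID=[^:]*:/ { s/^.*ID=// ; s/:.*$// ; p ; }'
}

# Hypothetical sample config, written by hand for this dry run (assumed layout).
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
<configuration>
  <property>
    <name>mapreduce.job.name</name>
    <value>oozie:action:T=shell:W=DummyWorkflow:A=DummyShellAction:ID=0000012-150101000000000-oozie-oozi-W</value>
  </property>
</configuration>
EOF

extracted_id=$(extract_oozie_job_id "$sample_conf")
echo "$extracted_id"
rm -f "$sample_conf"
```

If the real config file lays out the property differently (for example <name> and <value> on the same line), the first sed expression would need adjusting, so it is worth checking against an actual $OOZIE_ACTION_CONF_XML.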
In the Hive Action, the reference to that file should then be
<script>/user/johndoe/temp/${wf:id()}-DummyHiveAction.hql</script>
PS: I did not test all of this end to end; I just copied, pasted and edited bits of code that run on our site. The debugging is all yours :-)
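For that debugging, one option is to dry-run the concatenation locally and inspect the combined file before anything goes to HDFS. This is a sketch under assumptions, not part of the original setup: the function build_hive_script and the temporary files are invented for the example. It adds a newline after each fragment, because a fragment that ends in a "--" comment without a trailing newline would otherwise swallow the first line of the next fragment.

```shell
#!/bin/bash
# Sketch: concatenate Hive script fragments into one local file for inspection.
# The function name and every file below are hypothetical stand-ins.
build_hive_script() {
    local target="$1" fragment
    shift
    : > "$target" || return 1            # create or truncate the target file
    for fragment in "$@"; do
        if [[ ! -r "$fragment" ]]; then
            echo "ERROR - missing script fragment: $fragment" 1>&2
            return 1
        fi
        cat "$fragment" >> "$target"
        # Extra newline so a fragment without a trailing newline cannot merge
        # its last line with the first line of the next fragment.
        echo >> "$target"
    done
}

work_dir=$(mktemp -d)
# No trailing newline here on purpose, to show why the separator matters.
printf '%s' '-- shared definitions' > "$work_dir/common.hql"
printf '%s\n' 'select * from tmp1;' > "$work_dir/DummyHiveAction.hql"

build_hive_script "$work_dir/combined.hql" "$work_dir/common.hql" "$work_dir/DummyHiveAction.hql"
combined_lines=$(wc -l < "$work_dir/combined.hql")
cat "$work_dir/combined.hql"
rm -rf "$work_dir"
```

Once the combined file looks right, the same file could be checked with hive -f locally, and the output of the loop could be piped to hdfs dfs -put -f - as in the original script.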