I'm using the Python package impyla to connect to Hive programmatically; I'm not using the hive CLI. I'm trying to use a UDF written in Python.
All the tutorials I've seen do this:
ADD FILE myscript.py;
...
SELECT TRANSFORM (cols...)
USING 'python myscript.py'
AS ...
I figured the USING part can be any executable that does the right thing, so I thought I could send the script inline as a string, like this:
USING 'python -c "import sys; ..."'
That would neatly avoid transferring a file to Hadoop. However, I'm having trouble getting it to work.
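For illustration, here is a minimal sketch of how such a statement might be assembled as a Python string before handing it to a cursor. The helper name build_transform_query, the table my_table, and the column names are hypothetical and not from my real code. The sketch simply rejects scripts that contain quote characters instead of trying to escape them, since the script sits inside a double-quoted shell argument that is itself inside a single-quoted HiveQL string.

```python
def build_transform_query(table, in_cols, out_cols, script):
    """Build a Hive TRANSFORM statement that runs `script` via `python -c`.

    Hypothetical helper for illustration only. The script ends up nested
    inside two layers of quoting, so quote characters are rejected here
    rather than escaped.
    """
    if "'" in script or '"' in script:
        raise ValueError("this sketch does not support quotes in the script")

    # Inner layer: the argument passed to `python -c`, in double quotes.
    using = 'python -c "{}"'.format(script)

    # Outer layer: the HiveQL string literal, in single quotes.
    return (
        "SELECT TRANSFORM ({cols})\n"
        "USING '{using}'\n"
        "AS ({out})\n"
        "FROM {table}"
    ).format(
        cols=", ".join(in_cols),
        using=using,
        out=", ".join(out_cols),
        table=table,
    )


sql = build_transform_query("my_table", ["a", "b"], ["x"], "print(3)")
print(sql)

# With an open impyla cursor (assumed to exist), the string would then be
# sent with: cursor.execute(sql)
```

Printing the string before executing it makes it easy to paste the exact same statement into the hive CLI for comparison.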
After my real code failed, I reduced it to this dummy code
USING 'python -c "print 3"
purely for debugging. The error I get is
E impala.error.HiveServer2Error: Error while compiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string
The full traceback is
test_hive_udf_example.py:77:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/sunny/sql/hive.py:80: in read
return super().read(sql, configuration=config)
../../src/sunny/sql/sql.py:136: in read
self._execute(sql, **kwargs)
../../src/sunny/sql/sql.py:130: in _execute
self._cursor.execute(sql, **kwargs)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:302: in execute
configuration=configuration)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:343: in execute_async
self._execute_async(op)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:362: in _execute_async
operation_fn()
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:340: in op
async=True)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:1027: in execute
return self._operation('ExecuteStatement', req)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:957: in _operation
resp = self._rpc(kind, request)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:925: in _rpc
err_if_rpc_not_ok(response)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
resp = TExecuteStatementResp(status=TStatus(statusCode=3, infoMessages=None, sqlState='42000', errorCode=40000, errorMessage=...mpiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string'), operationHandle=None)
def err_if_rpc_not_ok(resp):
if (resp.status.statusCode != TStatusCode.SUCCESS_STATUS and
resp.status.statusCode != TStatusCode.SUCCESS_WITH_INFO_STATUS and
resp.status.statusCode != TStatusCode.STILL_EXECUTING_STATUS):
> raise HiveServer2Error(resp.status.errorMessage)
E impala.error.HiveServer2Error: Error while compiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string
It seems to be complaining while looking for a script file, which my code never mentions at all.
I ran the same statement through the hive CLI instead of impyla and confirmed that the syntax does not require a script file: USING 'python -c "..."' works there.
So the problem now seems to lie in how I'm using it through impyla.
Any pointers are welcome! Thanks!