scikit-learn package import fails in a Redshift UDF

Time: 2019-06-27 09:09:55

Tags: python scikit-learn amazon-redshift user-defined-functions

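I am trying to import the scikit-learn library in Redshift, but I get the error: ('Not a directory', '/rdsdbdata/user_lib/1/0/627380.zip/sklearn/__check_build')

I followed these steps:

  1. Downloaded scikit-learn 0.18.1 / 0.18.2, because this version requires Python (>= 2.6 or >= 3.3), NumPy (>= 1.6.1), and SciPy (>= 0.9), and Redshift provides Python 2.7, NumPy 1.8.2, and SciPy 0.12.1.

  2. Uploaded the sklearn zip file to S3.

  3. Created the library in Redshift: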

CREATE OR REPLACE LIBRARY sklearn
LANGUAGE plpythonu
FROM 's3://<path>/sklearn.zip'
CREDENTIALS 'aws_access_key_id=<access_key>;aws_secret_access_key=<secret_key>';

  4. Created the function and called it:

create or replace function f_py_scikit ()
  returns varchar
stable
as $$
  import sklearn
  return 'sklearn: {}'.format(sklearn.__version__)
$$ language plpythonu;
select f_py_scikit();
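
One thing that may be worth checking about the archive from steps 1 to 3: Redshift loads the library straight from the uploaded zip, and Python's zipimport does not load compiled extension modules (.so, .pyd) from inside an archive. scikit-learn ships such modules. The sketch below is meant to be run locally on the zip before uploading, not inside Redshift; the function name `find_compiled_members` and the suffix list are illustrative assumptions, not part of any Redshift or scikit-learn API.

```python
import zipfile

# Suffixes that usually indicate compiled extension modules.
# This list is an assumption for illustration and may not be exhaustive.
COMPILED_SUFFIXES = (".so", ".pyd", ".dll", ".dylib")


def find_compiled_members(zip_path):
    """Return archive members that look like compiled extension modules."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(COMPILED_SUFFIXES)
        ]
```

Calling `find_compiled_members("sklearn.zip")` on the local copy of the archive and getting a non-empty list would suggest the package is not pure Python and is unlikely to import cleanly from a zip.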

[Amazon](500310) Invalid operation: OSError: (20, 'Not a directory', '/rdsdbdata/user_lib/0/0/627380.zip/sklearn/__check_build'). Please look at svl_udf_log for more information

svl_udf_log:

   query    message created traceback   funcname    node    slice   seq
0   OSError: (20, 'Not a directory', '/rdsdbdata/user_lib/0/0/627380.zip/sklearn/__check_build') 1000   20000   0
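
A possible reading of this error, offered as a sketch rather than a confirmed diagnosis: when the compiled helper in `sklearn/__check_build` fails to import, the package appears to list its own directory in order to build a diagnostic message, and that directory is a path inside the zip file, which the operating system does not treat as a directory. The example below reproduces the same class of error locally with a stand-in package. The name `fakepkg` and the helper `import_from_zip` are hypothetical, and the package body only imitates the directory listing; it is not scikit-learn code.

```python
import os
import sys
import tempfile
import zipfile

# Hypothetical package name used only for this demonstration.
PACKAGE = "fakepkg"

# The package lists its own directory on import, which is assumed here to
# resemble what sklearn/__check_build does after a failed compiled import.
INIT_SOURCE = (
    "import os\n"
    "local_dir = os.path.dirname(__file__)\n"
    "contents = os.listdir(local_dir)\n"
)


def import_from_zip():
    """Build a zip holding the package, import it, and return the OSError raised."""
    workdir = tempfile.mkdtemp()
    zip_path = os.path.join(workdir, "lib.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(PACKAGE + "/__init__.py", INIT_SOURCE)
    sys.path.insert(0, zip_path)
    try:
        __import__(PACKAGE)
    except OSError as exc:
        return exc
    finally:
        # Clean up so repeated runs start from the same state.
        sys.path.remove(zip_path)
        sys.modules.pop(PACKAGE, None)
    return None


error = import_from_zip()
print(type(error).__name__, error)
```

If this reading is right, the "Not a directory" message is a secondary symptom: the underlying problem would be that the compiled parts of scikit-learn cannot be imported from a zip library at all.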

0 answers:

No answers yet.