How to import additional Python packages when using the pyspark shell

Time: 2018-11-15 13:21:27

Tags: python pyspark airflow

I need to import Airflow library modules in the PySpark shell. When launching the PySpark shell, I included the module path via --py-files:

pyspark2 --py-files /nas/isg_prodops_work/ABO/abound/prod/anaconda/envs/nas_airflow/lib/python3.5/site-packages/airflow

However, I still get the following error:

>>> from airflow.models import Variable
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    ImportError: No module named airflow.models 

The directory structure of my module is as follows:

airflow
|-- __init__.py
|-- dag/        (directory)
|-- operators/  (directory)
|-- models.py
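
The --py-files option accepts .py, .zip, and .egg files; passing a bare package directory like the one above will not make the package importable. One common workaround is to zip the package so that its top-level directory is preserved inside the archive, then pass the zip to --py-files. Below is a minimal sketch of such a helper (the paths in the comments are hypothetical and should be adjusted to your environment):

```python
import os
import zipfile

def zip_package(package_dir, zip_path):
    """Zip a Python package so its top-level directory is preserved inside
    the archive (needed for `import airflow` to resolve)."""
    parent = os.path.dirname(os.path.abspath(package_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent directory so the
                # archive contains airflow/__init__.py, airflow/models.py, ...
                zf.write(full, os.path.relpath(full, parent))

# Hypothetical usage -- adjust paths to your environment:
# zip_package(".../site-packages/airflow", "/tmp/airflow.zip")
# Then launch:  pyspark2 --py-files /tmp/airflow.zip
# and `from airflow.models import Variable` should resolve.
```

Note that a pure-Python package is required for this to work: --py-files cannot ship compiled C extensions, which large packages like Airflow may depend on.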

1 Answer:

Answer 0 (score: 0)

Try the following command:

pyspark2 --py-files /nas/isg_prodops_work/ABO/abound/prod/anaconda/envs/nas_airflow/lib/python3.5/site-packages/airflow/models.py

Then import it like this:

>>> from models import Variable

This works because --py-files places models.py at the top of Python's module search path, so it is imported as a top-level module named models. Note that with this approach, `from airflow.models import Variable` will still fail, and any imports inside models.py that reference the rest of the airflow package will also break.