Airflow 1.9 ERROR - 'HiveCliHook' object has no attribute 'upper'

Time: 2018-04-09 14:05:37

Tags: python-3.x hive airflow

I am trying to run a HiveOperator on Airflow 1.9.

  • The Python code compiles without any errors (IntelliJ environment)
  • The connection is configured

The code is:

import airflow
from airflow.operators.hive_operator import HiveOperator
from airflow.hooks.hive_hooks import HiveCliHook
from airflow.models import DAG
from datetime import timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(2),
    'email': ['support@mail.com'],
    'email_on_failure': True,
    'retries': 2,
    'retry_delay': timedelta(seconds=30),
    'catchup': False,
}


HiveCli_hook = HiveCliHook(hive_cli_conn_id='hive_cli_default')
hql = 'INSERT INTO test.test_table SELECT DISTINCT id FROM test.tabl_test;'

dag = DAG(
    dag_id='Hive_in_action',
    default_args=default_args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60))

create_test_table = HiveOperator(
    task_id="create_test_table",
    hql=hql,
    hive_cli_conn_id=HiveCli_hook,
    dag=dag
)

I am using a tunnel, which is why the host is localhost.

Connection settings

I get the error:

ERROR - 'HiveCliHook' object has no attribute 'upper'

The relevant part of the log:

[2018-04-09 16:40:14,672] {models.py:1428} INFO - Executing Task(HiveOperator): create_test_table> on 2018-04-09 14:39:08
[2018-04-09 16:40:14,672] {base_task_runner.py:115} INFO - Running: ['bash', '-c', 'airflow run Hive_in_action create_test_table 2018-04-09T14:39:08 --job_id 19 --raw -sd DAGS_FOLDER/Hive_in_action.py']
[2018-04-09 16:40:15,283] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,282] {__init__.py:45} INFO - Using executor SequentialExecutor
[2018-04-09 16:40:15,361] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,360] {models.py:189} INFO - Filling up the DagBag from /Users/mypc/airflow/dags/Hive_in_action.py
[2018-04-09 16:40:15,387] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,387] {base_hook.py:80} INFO - Using connection to: localhost
[2018-04-09 16:40:15,400] {cli.py:374} INFO - Running on host MyPC.local
[2018-04-09 16:40:15,413] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,412] {hive_operator.py:96} INFO - Executing: INSERT INTO test.test_table SELECT DISTINCT id FROM test.tabl_test;
[2018-04-09 16:40:15,412] {models.py:1595} ERROR - 'HiveCliHook' object has no attribute 'upper'
Traceback (most recent call last):
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/models.py", line 1493, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/operators/hive_operator.py", line 97, in execute
    self.hook = self.get_hook()
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/operators/hive_operator.py", line 86, in get_hook
    mapred_job_name=self.mapred_job_name)
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/hive_hooks.py", line 71, in __init__
    conn = self.get_connection(hive_cli_conn_id)
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/base_hook.py", line 77, in get_connection
    conn = random.choice(cls.get_connections(conn_id))
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/base_hook.py", line 68, in get_connections
    conn = cls._get_connection_from_env(conn_id)
  File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/base_hook.py", line 60, in _get_connection_from_env
    environment_uri = os.environ.get(CONN_ENV_PREFIX + conn_id.upper())
AttributeError: 'HiveCliHook' object has no attribute 'upper'
[2018-04-09 16:40:15,416] {models.py:1622} INFO - All retries failed; marking task as FAILED

2 Answers:

Answer 0 (score: 0):

You should not give a variable or object the same name as the class:

HiveCliHook = HiveCliHook(...)

Instead, use a different name:

myHook = HiveCliHook(...)

create_test_table = HiveOperator(
...
hive_cli_conn_id=myHook,
...)
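
A minimal, library-free sketch of the shadowing problem this answer warns about; the class name ExampleHook is made up purely for illustration:

class ExampleHook:
    """Stand-in for a hook class such as HiveCliHook."""
    pass

# Rebinding the class name to an instance shadows the class itself:
ExampleHook = ExampleHook()

# Any later attempt to instantiate the class now fails, because the
# name refers to the instance, not the class:
try:
    ExampleHook()
except TypeError as exc:
    print(exc)  # 'ExampleHook' object is not callable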

Answer 1 (score: 0):

It looks like you are passing the HiveCliHook object as the hive_cli_conn_id. I imagine the HiveOperator expects a string there and calls upper() on it to convert it to uppercase, so the line hive_cli_conn_id=HiveCli_hook caused the error.
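
Under that reading, the fix would be to pass the connection id string rather than the hook object. A minimal sketch of the corrected task, reusing the hql, dag, and 'hive_cli_default' connection id already defined in the question's DAG file:

create_test_table = HiveOperator(
    task_id="create_test_table",
    hql=hql,
    hive_cli_conn_id='hive_cli_default',  # a string conn id, not a HiveCliHook instance
    dag=dag
)

The traceback also shows where upper() comes in: base_hook._get_connection_from_env builds an environment-variable name as CONN_ENV_PREFIX + conn_id.upper(), which only works when conn_id is a string.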