我使用Postgres后端运行气流-一切正常。另外,我在运行Airflow的同一主机上运行了Jupyter服务器。现在我想我可以从笔记本电脑上取下气流挂钩。
import pandas as pd
import numpy as np
import matplotlib as plt
from airflow.hooks.mysql_hook import MySqlHook
mysql = MySqlHook(mysql_conn_id = 'mysql-x')
sql = "select 1+1"
mysql.get_pandas_df(sql)
但我收到此异常消息:
OperationalError: (sqlite3.OperationalError) no such table: connection
[SQL: SELECT connection.password AS connection_password, connection.extra AS connection_extra, connection.id AS connection_id, connection.conn_id AS connection_conn_id, connection.conn_type AS connection_conn_type, connection.host AS connection_host, connection.schema AS connection_schema, connection.login AS connection_login, connection.port AS connection_port, connection.is_encrypted AS connection_is_encrypted, connection.is_extra_encrypted AS connection_is_extra_encrypted
FROM connection
WHERE connection.conn_id = ?]
[parameters: ('mysql-x',)]
(Background on this error at: http://sqlalche.me/e/e3q8)
但是让我感到怀疑的是,它不仅没有找到connection_id(我在Airflow UI中显然可以看到它)。它还说:sqlite3.OperationalError-非常看起来它甚至没有连接到相同的postgres数据库。我检查了os.environ["AIRFLOW_HOME"]
似乎是正确的。
编辑1:
气流流通后启动jupyter笔记本服务器后,设置了所有环境变量,我得到了另一个错误:
/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/attributes.py in __get__(self, instance, owner)
351
352 def __get__(self, instance, owner):
--> 353 retval = self.descriptor.__get__(instance, owner)
354 # detect if this is a plain Python @property, which just returns
355 # itself for class level access. If so, then return us.
/usr/local/lib/python3.7/site-packages/airflow/models/connection.py in get_password(self)
188 "Can't decrypt encrypted password for login={}, \
189 FERNET_KEY configuration is missing".format(self.login))
--> 190 return fernet.decrypt(bytes(self._password, 'utf-8')).decode()
191 else:
192 return self._password
/usr/local/lib/python3.7/site-packages/cryptography/fernet.py in decrypt(self, msg, ttl)
169 except InvalidToken:
170 pass
--> 171 raise InvalidToken
InvalidToken:
您可以使用此dockerfile进行复制:
FROM apache/airflow
USER root
# install mysql client
RUN apt-get update && apt-get install -y mariadb-client-10.3 unzip
# install mssql client and tools
RUN apt-get install -y curl gnupg libicu-dev libicu63
RUN curl https://packages.microsoft.com/keys/microsoft.asc -o key
RUN apt-key add < key
RUN curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/msprod.list
ENV ACCEPT_EULA=Y
RUN apt-get update && apt-get install -y mssql-tools msodbcsql17 unixodbc unixodbc-dev unzip libunwind8
RUN curl -Lq 'https://go.microsoft.com/fwlink/?linkid=2108814' -o sqlpackage-linux-x64-latest.zip
RUN mkdir /opt/sqlpackage/ && unzip sqlpackage-linux-x64-latest.zip -d /opt/sqlpackage/
RUN chmod a+x /opt/sqlpackage/sqlpackage && ln -sfn /opt/sqlpackage/sqlpackage /usr/bin/sqlpackage
RUN ln -sfn /opt/mssql-tools/bin/sqlcmd /usr/bin/sqlcmd
# install notebooks
RUN pip install jupyterlab pandas numpy scikit-learn matplotlib pymssql
#RUN cat /entrypoint.sh
# start additional notebok server
# this is a dirty hack but for the sake of this prototype good enough
RUN sed -i -e's/\# Run the command/airflow scheduler \& \njupyter notebook --ip=0.0.0.0 --port=9000 --NotebookApp.token="" --NotebookApp.password="" \& \n/' /entrypoint
EXPOSE 9000
# switch back to airflow user
USER airflow
RUN airflow initdb
RUN alias ll='ls -al'