How to explicitly declare charset = utf8 for an Airflow connection

Time: 2017-09-06 21:39:14

Tags: python mysql airflow apache-airflow

This sequence:

from airflow.hooks.mysql_hook import MySqlHook
conn = MySqlHook(mysql_conn_id='conn_id')
engine = conn.get_sqlalchemy_engine()
df.to_sql('test_table', engine, if_exists='append', index=False)

produces the following error:

UnicodeEncodeError: 'latin-1' codec can't encode character '\ufffd' in position 57: ordinal not in range(256)

This sequence works fine:

from sqlalchemy import create_engine
engine = create_engine("mysql://{0}:{1}@{2}/capone?charset=utf8".format(user, pwd, host))
df.to_sql('test_table', engine, if_exists='append', index=False)

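The key is declaring the charset explicitly. I tried to do the same in Airflow by setting {"charset": "utf8"} in the connection's Extra field, but this did not fix the error. I restarted my development environment after making the change, and the admin panel confirmed that the edit was saved. How can I make my Airflow connection use utf8 as its charset?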
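For context, the encoding failure itself can be reproduced without a database. The minimal sketch below only shows that U+FFFD (the character named in the traceback) has no latin-1 representation but encodes fine as UTF-8; `can_encode` is a hypothetical helper written for this illustration, and the sample string is made up.

```python
def can_encode(value, encoding):
    """Return True if value can be encoded with the given codec."""
    try:
        value.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# U+FFFD is the Unicode replacement character from the error message.
sample = 'caf\ufffd'

latin1_ok = can_encode(sample, 'latin-1')  # code point is above 255
utf8_ok = can_encode(sample, 'utf-8')
```

This is why the connection needs a charset that can represent the data being written.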

3 Answers:

Answer 0: (score: 2)

I realized this is a bug in Airflow and reported it here: https://issues.apache.org/jira/browse/AIRFLOW-4824

For now, I can work around the problem with the following helper function, which builds the connection URI itself and appends the charset when the connection's extras request utf8:

def get_uri(hook):
    """Build a SQLAlchemy URI for the hook's connection, honoring a utf8 charset extra."""
    conn = hook.get_connection(getattr(hook, hook.conn_name_attr))
    # Credentials are optional; include them only when a login is set.
    login = ''
    if conn.login:
        login = '{conn.login}:{conn.password}@'.format(conn=conn)
    host = conn.host
    if conn.port is not None:
        host += ':{port}'.format(port=conn.port)
    # Append the charset only when the connection's extras request utf8.
    charset = ''
    if conn.extra_dejson.get('charset', False):
        chrs = conn.extra_dejson["charset"]
        if chrs.lower() == 'utf8' or chrs.lower() == 'utf-8':
            charset = '?charset=utf8'
    return '{conn.conn_type}://{login}{host}/{conn.schema}{charset}'.format(
        conn=conn, login=login, host=host, charset=charset)
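
A quick way to see what the helper produces, without a running Airflow instance, is to call the same logic on stand-in objects. In the sketch below, `make_fake_hook` and all connection values (user, host, schema) are hypothetical and only mimic the attributes that `get_uri` reads; they are not part of Airflow.

```python
import json
from types import SimpleNamespace


def make_fake_hook(extra=''):
    """Build a hypothetical stand-in exposing only what get_uri reads."""
    conn = SimpleNamespace(
        conn_type='mysql', login='user', password='secret',
        host='db.example.com', port=3306, schema='capone',
        # Airflow exposes the Extra field as a dict; emulate that here.
        extra_dejson=json.loads(extra) if extra else {},
    )
    return SimpleNamespace(
        conn_name_attr='mysql_conn_id',
        mysql_conn_id='conn_id',
        get_connection=lambda conn_id: conn,
    )


def get_uri(hook):
    """Same logic as the helper above, repeated so the sketch runs on its own."""
    conn = hook.get_connection(getattr(hook, hook.conn_name_attr))
    login = ''
    if conn.login:
        login = '{conn.login}:{conn.password}@'.format(conn=conn)
    host = conn.host
    if conn.port is not None:
        host += ':{port}'.format(port=conn.port)
    charset = ''
    if conn.extra_dejson.get('charset', False):
        chrs = conn.extra_dejson["charset"]
        if chrs.lower() == 'utf8' or chrs.lower() == 'utf-8':
            charset = '?charset=utf8'
    return '{conn.conn_type}://{login}{host}/{conn.schema}{charset}'.format(
        conn=conn, login=login, host=host, charset=charset)


uri_with_charset = get_uri(make_fake_hook('{"charset": "utf8"}'))
uri_without_charset = get_uri(make_fake_hook())
```

With a real `MySqlHook`, the string returned by `get_uri(hook)` would then be passed to `create_engine`, as in the question's working example.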

The real fix would be a pull request to the project that overrides get_uri in mysql_hook.py.

Answer 1: (score: 0)

from sqlalchemy import create_engine
from airflow.hooks.mysql_hook import MySqlHook

conn = MySqlHook(mysql_conn_id='conn_id')
uri = conn.get_uri()
engine = create_engine(uri+'?charset=utf8')
df.to_sql('test_table', engine, if_exists='append', index=False)

I solved the problem with the code above.
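
Appending `'?charset=utf8'` directly works as long as the URI has no query string yet. A slightly more defensive variant, sketched here with the standard library only, merges the parameter into whatever query is already present; `with_charset` is a hypothetical helper name, and the example URIs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(uri, charset='utf8'):
    """Return uri with charset set in its query string, keeping other parameters."""
    parts = urlsplit(uri)
    params = dict(parse_qsl(parts.query))
    # Overwrite any existing charset rather than adding a duplicate key.
    params['charset'] = charset
    return urlunsplit(parts._replace(query=urlencode(params)))


plain = with_charset('mysql://user:pwd@host:3306/capone')
merged = with_charset('mysql://host/db?ssl=true')
```

The result can be passed to `create_engine` in place of `uri+'?charset=utf8'`.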

Answer 2: (score: -1)

I solved the problem with the following change to the airflow.cfg file, and it works:

sql_alchemy_conn = mysql://user:password@host:port/airflow?charset=utf8