Apache Airflow initdb fails on MySQL at kubernetes_resource_checkpointing

Time: 2019-07-11 11:21:53

Tags: mysql airflow mysql-python

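I want to use MySQL as the backend database for Apache Airflow. After installing the dependencies, I ran

airflow initdb

Airflow starts setting up the database but then fails with the following log: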

shahbaz@OpenSource:~$ airflow initdb
[2019-07-11 12:01:13,726] {settings.py:182} INFO - 
settings.configure_orm(): Using pool settings. pool_size=5, 
pool_recycle=1800, pid=17492
[2019-07-11 12:01:13,917] {__init__.py:51} INFO - Using executor 
LocalExecutor
DB: mysql+mysqldb://airflow:***@localhost:3306/airflow
[2019-07-11 12:01:14,276] {db.py:350} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, 
current schema
INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 
1507a7289a2f, create is_encrypted
INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 
13eb55f81627, maintain history for compatibility with earlier 
migrations
INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 
338e90f54d61, More logging into task_instance
INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 
52d714495f0, job_id indices
INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 
502898887f84, Adding extra to Log
INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 
1b38cef5b76e, add dagrun
INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 
2e541a1dcfed, task_duration
INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 
40e67319e3a9, dagrun_config
INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 
561833c1c74b, add password column to user
INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 
4446e08588, dagrun start end
INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> 
bbc73705a13e, Add notification_sent column to sla_miss
INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> 
bba5a7cfc896, Add a column to track the encryption state of the 
'Extra' field in connection
INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 
1968acfc09e3, add is_encrypted column to variable table
INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 
2e82aab8ef20, rename user table
INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 
211e584da130, add TI state index
INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 
64de9cddf6c9, add task fails journal table
INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> 
f2ca10b85618, add dag_stats table
INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 
4addfa1236f1, Add fractional seconds to mysql tables
INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> 
8504051e801b, xcom dag task indices
INFO  [alembic.runtime.migration] Running upgrade 8504051e801b -> 
5e7d17757c7a, add pid field to TaskInstance
INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> 
127d2bf2dfa7, Add dag_id/state index on dag_run table
INFO  [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> 
cc1e65623dc7, add max tries column to task instance
INFO  [alembic.runtime.migration] Running upgrade cc1e65623dc7 -> 
bdaa763e6c56, Make xcom value column a large binary
INFO  [alembic.runtime.migration] Running upgrade bdaa763e6c56 -> 
947454bf1dff, add ti job_id index
INFO  [alembic.runtime.migration] Running upgrade 947454bf1dff -> 
d2ae31099d61, Increase text size for MySQL (not relevant for other 
DBs' text types)
INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 
0e2a74e0fc9f, Add time zone awareness
INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 
33ae817a1ff4, kubernetes_resource_checkpointing
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py", line 536, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 255, in execute
    self.errorhandler(self, exc, value)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/connections.py", line 50, in defaulterrorhandler
    raise errorvalue
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 252, in execute
    res = self._query(query)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 378, in _query
    db.query(q)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/connections.py", line 280, in query
    _mysql.connection.query(self, query)
_mysql_exceptions.OperationalError: (3812, "An expression of non-boolean type specified to a check constraint 'kube_resource_version_one_row_id'.")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/shahbaz/.local/bin/airflow", line 32, in <module>
    args.func(args)
  File "/usr/local/lib/python3.6/dist-packages/airflow/bin/cli.py", line 1096, in initdb
    db.initdb(settings.RBAC)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 91, in initdb
    upgradedb()
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 358, in upgradedb
    command.upgrade(config, 'heads')
  File "/usr/local/lib/python3.6/dist-packages/alembic/command.py", line 254, in upgrade
    script.run_env()
  File "/usr/local/lib/python3.6/dist-packages/alembic/script/base.py", line 427, in run_env
    util.load_python_file(self.dir, 'env.py')
  File "/usr/local/lib/python3.6/dist-packages/alembic/util/pyfiles.py", line 81, in load_python_file
    module = load_module_py(module_id, path)
  File "/usr/local/lib/python3.6/dist-packages/alembic/util/compat.py", line 83, in load_module_py
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/dist-packages/airflow/migrations/env.py", line 92, in <module>
    run_migrations_online()
  File "/usr/local/lib/python3.6/dist-packages/airflow/migrations/env.py", line 86, in run_migrations_online
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/usr/local/lib/python3.6/dist-packages/alembic/runtime/environment.py", line 836, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/usr/local/lib/python3.6/dist-packages/alembic/runtime/migration.py", line 330, in run_migrations
    step.migration_fn(**kw)
  File "/usr/local/lib/python3.6/dist-packages/airflow/migrations/versions/33ae817a1ff4_add_kubernetes_resource_checkpointing.py", line 55, in upgrade
    *columns_and_constraints
  File "<string>", line 8, in create_table
  File "<string>", line 3, in create_table
  File "/usr/local/lib/python3.6/dist-packages/alembic/operations/ops.py", line 1120, in create_table
    return operations.invoke(op)
  File "/usr/local/lib/python3.6/dist-packages/alembic/operations/base.py", line 319, in invoke
    return fn(self, operation)
  File "/usr/local/lib/python3.6/dist-packages/alembic/operations/toimpl.py", line 101, in create_table
    operations.impl.create_table(table)
  File "/usr/local/lib/python3.6/dist-packages/alembic/ddl/impl.py", line 194, in create_table
    self._exec(schema.CreateTable(table))
  File "/usr/local/lib/python3.6/dist-packages/alembic/ddl/impl.py", line 118, in _exec
    return conn.execute(construct, *multiparams, **params)
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 980, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
    return connection._execute_ddl(self, multiparams, params)
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1042, in _execute_ddl
    compiled,
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1240, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1458, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 276, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py", line 536, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 255, in execute
    self.errorhandler(self, exc, value)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/connections.py", line 50, in defaulterrorhandler
    raise errorvalue
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 252, in execute
    res = self._query(query)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 378, in _query
    db.query(q)
  File "/usr/local/lib/python3.6/dist-packages/MySQLdb/connections.py", line 280, in query
    _mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (3812, "An expression of non-boolean type specified to a check constraint 'kube_resource_version_one_row_id'.") [SQL: '\nCREATE TABLE kube_resource_version (\n\tone_row_id BOOL NOT NULL DEFAULT true, \n\tresource_version VARCHAR(255), \n\tPRIMARY KEY (one_row_id), \n\tCONSTRAINT kube_resource_version_one_row_id CHECK (one_row_id), \n\tCHECK (one_row_id IN (0, 1))\n)\n\n'] (Background on this error at: http://sqlalche.me/e/e3q8)

As you can see, the initdb command fails at the kubernetes_resource_checkpointing migration, and the last log entry shows that the cause is an OperationalError raised through sqlalchemy:

sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) 
(3812, "An expression of non-boolean type specified to a check 
constraint 'kube_resource_version_one_row_id'.") [SQL: '\nCREATE TABLE 
kube_resource_version (\n\tone_row_id BOOL NOT NULL DEFAULT true, 
\n\tresource_version VARCHAR(255), \n\tPRIMARY KEY (one_row_id), 
\n\tCONSTRAINT kube_resource_version_one_row_id CHECK (one_row_id), 
\n\tCHECK (one_row_id IN (0, 1))\n)\n\n'] (Background on this error 
at: http://sqlalche.me/e/e3q8)

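I should mention that I am able to run apache-airflow with a Postgres database. I switched Airflow to Postgres only because it behaves strangely with MySQL.

I am using

apache-airflow version 1.10.3

MySQL version 8.0.16 (MySQL Community Server - GPL)

I also tried setting MySQL's SQL_MODE to 'ANSI', as the Airflow documentation describes, but it made no difference.

Any help would be greatly appreciated.

[Edit]

Thanks to 'skadya' for pointing out the issue link. Let me share what I found. I examined the code files that 'skadya' pointed to; two files are responsible for this behavior.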

33ae817a1ff4_add_kubernetes_resource_checkpointing.py
86770d1215c0_add_kubernetes_scheduler_uniqueness.py

Both are migration files that use the Alembic and sqlalchemy libraries. I found the following sqlalchemy code in 33ae817a1ff4_add_kubernetes_resource_checkpointing.py:

def upgrade():

    columns_and_constraints = [
        sa.Column("one_row_id", sa.Boolean, server_default=sa.true(), primary_key=True),
        sa.Column("resource_version", sa.String(255))
    ]

    conn = op.get_bind()

    # alembic creates an invalid SQL for mssql dialect
    if conn.dialect.name not in ('mssql'):
        columns_and_constraints.append(sa.CheckConstraint("one_row_id", name="kube_resource_version_one_row_id"))

    table = op.create_table(
        RESOURCE_TABLE,
        *columns_and_constraints
    )

    op.bulk_insert(table, [
        {"resource_version": ""}
    ])

which is translated into the following invalid SQL query:

CREATE TABLE kube_resource_version (
    one_row_id BOOL NOT NULL DEFAULT true,
    resource_version VARCHAR(255),
    PRIMARY KEY (one_row_id),
    CONSTRAINT kube_resource_version_one_row_id CHECK (one_row_id),
    CHECK (one_row_id IN (0, 1))
)

Instead, the SQL query should look like this:

CREATE TABLE kube_resource_version (
    one_row_id BOOL NOT NULL DEFAULT true,
    resource_version VARCHAR(255),
    PRIMARY KEY (one_row_id),
    CONSTRAINT kube_resource_version_one_row_id CHECK (one_row_id IN (0, 1))
)
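
For context on what the named constraint appears to be for: a boolean primary key combined with a check that the column is true allows at most one row in the table. The sketch below illustrates that idea only. It uses Python's built-in sqlite3 module as a stand-in (an assumption made so the example runs without a MySQL server), it does not reproduce MySQL error 3812, and the explicit comparison `one_row_id = 1` is my own spelling of the check, not the one Airflow generates.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE kube_resource_version (
        one_row_id BOOLEAN NOT NULL DEFAULT 1,
        resource_version VARCHAR(255),
        PRIMARY KEY (one_row_id),
        CONSTRAINT kube_resource_version_one_row_id CHECK (one_row_id = 1)
    )
    """
)

# First row: one_row_id falls back to its default value (1).
conn.execute("INSERT INTO kube_resource_version (resource_version) VALUES ('')")


def try_insert(one_row_id):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO kube_resource_version (one_row_id, resource_version) "
            "VALUES (?, '')",
            (one_row_id,),
        )
        return True
    except sqlite3.IntegrityError:
        return False


second_true_row = try_insert(1)  # rejected: duplicate primary key
false_row = try_insert(0)        # rejected: violates the CHECK constraint
row_count = conn.execute(
    "SELECT COUNT(*) FROM kube_resource_version"
).fetchone()[0]

print(second_true_row, false_row, row_count)  # False False 1
```

Excluding the named constraint for MySQL, as in the change described in this post, gives up this single-row guarantee on that backend; whether that matters depends on how the table is used.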

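After I changed the code in the two files above as described at the link 'skadya' provided, the system worked.

You only need to change the following code from

if conn.dialect.name not in ('mssql'):
    columns_and_constraints.append(
        sa.CheckConstraint("one_row_id", name="kube_resource_version_one_row_id")
    )

to
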
if conn.dialect.name not in ('mssql', 'mysql'):
    columns_and_constraints.append(
        sa.CheckConstraint("one_row_id", name="kube_resource_version_one_row_id")
    )
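
One detail in this guard is easy to miss: `('mssql')` is a plain string, not a one-element tuple, so the original `not in` is a substring test. The changed version, `('mssql', 'mysql')`, is a real tuple. A small standalone sketch (the variable names `single` and `pair` are mine, used only for illustration):

```python
# Parentheses alone do not create a tuple; a trailing comma or a second item does.
single = ('mssql')
pair = ('mssql', 'mysql')

print(type(single).__name__)  # str
print(type(pair).__name__)    # tuple

# With a string on the right-hand side, `in` checks for a substring.
print('sql' in single)    # True
print('mysql' in single)  # False, so `not in` is True and the constraint is added
print('mysql' in pair)    # True, so `not in` is False and the constraint is skipped
```

The original guard still behaves as intended for the exact dialect name 'mssql', so this is a readability pitfall rather than the cause of the error.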

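3 answers:

Answer 0: (score: 2)

There is an open bug for this in the Airflow issue tracker.

https://issues.apache.org/jira/browse/AIRFLOW-4995

As a workaround, you can manually apply the changes proposed in the pull request.

Answer 1: (score: 1)

I ran into exactly the same problem. Does anyone know what to do?

By the way, I hit another problem: resetting the database complained that the dag_stats table already exists. I had to drop dag_stats manually for the reset to get past that step, but I am still stuck on this constraint.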

Answer 2: (score: 1)

You only need to change the following code in these files

33ae817a1ff4_add_kubernetes_resource_checkpointing.py
86770d1215c0_add_kubernetes_scheduler_uniqueness.py

from

if conn.dialect.name not in ('mssql'):
    columns_and_constraints.append(
        sa.CheckConstraint("one_row_id", name="kube_resource_version_one_row_id")
    )

to

if conn.dialect.name not in ('mssql', 'mysql'):
    columns_and_constraints.append(
        sa.CheckConstraint("one_row_id", name="kube_resource_version_one_row_id")
    )
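
If you would rather not edit the two files by hand, the same one-line change can be scripted. This is a minimal sketch, not part of Airflow: `patch_guard` is a hypothetical helper that works on the text of a migration file and assumes the guard appears exactly as quoted above.

```python
OLD_GUARD = "not in ('mssql'):"
NEW_GUARD = "not in ('mssql', 'mysql'):"


def patch_guard(source: str) -> str:
    """Return the migration source with MySQL added to the dialect guard."""
    return source.replace(OLD_GUARD, NEW_GUARD)


# Demonstration on the line quoted in this answer.
original = "    if conn.dialect.name not in ('mssql'):\n"
patched = patch_guard(original)
print(patched, end="")

# Running the helper a second time changes nothing further.
unchanged = patch_guard(patched)
```

To apply it for real you would read each of the two migration files, pass the contents through `patch_guard`, and write the result back. In the traceback above they live under /usr/local/lib/python3.6/dist-packages/airflow/migrations/versions/, but that path depends on your installation.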