We are using Apache Airflow to run DML against databases such as Postgres and Redshift. Currently we use the Postgres hook, which is implemented on top of psycopg2.

When running a COPY command, or DML such as DELETE, INSERT, or UPDATE, I don't get the number of affected records in the Airflow logs, even though the same statements report those counts as output when executed through the psql command-line tool.

Below is the code in my codebase that executes the SQL:
def execute(self, context):
    hook = PostgresHook(postgres_conn_id=self.postgres_conn_id, schema=self.schema)
    if self.pg_preoperator:
        logging.info("Setting up Postgres operator.")
        hook.run(self.pg_preoperator)

    logging.info('Executing: ' + str(self.sql))
    logging.info('Parameters: ' + str(self.parameters))
    hook.run(self.sql, self.autocommit, parameters=self.parameters)

    if self.pg_postoperator:
        logging.info("Finished Postgres query.")
        hook.run(self.pg_postoperator)
How should I modify the code so that the number of affected records shows up in the Airflow logs? I don't think any variable in the Postgres hook code captures the output of cursor.execute.
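For reference, DB-API cursors (including psycopg2's) expose the affected-row count as cursor.rowcount after execute(). A minimal sketch using the stdlib sqlite3 module (whose cursors follow the same DB-API) to illustrate the attribute:

```python
import sqlite3

# In-memory database, just to demonstrate the DB-API rowcount attribute.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

# After a DML statement, rowcount holds the number of affected rows.
cur.execute("DELETE FROM t WHERE id > 1")
deleted = cur.rowcount
print(deleted)  # 2 rows matched the WHERE clause
```

psycopg2 cursors behave the same way, so the count is available inside the hook right after each cur.execute call.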
Also, below is the run method that ultimately gets called:
def run(self, sql, autocommit=False, parameters=None):
    """
    Runs a command or a list of commands. Pass a list of sql
    statements to the sql parameter to get them to execute
    sequentially

    :param sql: the sql statement to be executed (str) or a list of
        sql statements to execute
    :type sql: str or list
    :param autocommit: What to set the connection's autocommit setting to
        before executing the query.
    :type autocommit: bool
    :param parameters: The parameters to render the SQL query with.
    :type parameters: mapping or iterable
    """
    if isinstance(sql, str):
        sql = [sql]

    with closing(self.get_conn()) as conn:
        if self.supports_autocommit:
            self.set_autocommit(conn, autocommit)

        with closing(conn.cursor()) as cur:
            for s in sql:
                if parameters is not None:
                    self.log.info("{} with parameters {}".format(s, parameters))
                    cur.execute(s, parameters)
                else:
                    self.log.info(s)
                    cur.execute(s)

        # If autocommit was set to False for db that supports autocommit,
        # or if db does not support autocommit, we do a manual commit.
        if not self.get_autocommit(conn):
            conn.commit()
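One possible approach would be a hook subclass whose run logs cur.rowcount after each statement. A rough sketch of the idea, with stdlib sqlite3 standing in for psycopg2 and a hypothetical LoggingHook in place of PostgresHook (a real subclass would override PostgresHook.run in the same shape):

```python
import logging
import sqlite3
from contextlib import closing

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)


class LoggingHook:
    """Hypothetical stand-in for PostgresHook: same run() shape,
    but logs cur.rowcount after each executed statement."""

    def get_conn(self):
        # PostgresHook would return a psycopg2 connection here.
        return sqlite3.connect(":memory:")

    def run(self, sql, parameters=None):
        if isinstance(sql, str):
            sql = [sql]
        counts = []
        with closing(self.get_conn()) as conn:
            with closing(conn.cursor()) as cur:
                for s in sql:
                    cur.execute(s, parameters or ())
                    # rowcount is -1 for statements without a row count (e.g. DDL).
                    log.info("%s -- rows affected: %s", s, cur.rowcount)
                    counts.append(cur.rowcount)
            conn.commit()
        return counts


hook = LoggingHook()
counts = hook.run([
    "CREATE TABLE t (id INTEGER)",
    "INSERT INTO t VALUES (1)",
    "INSERT INTO t VALUES (2)",
    "UPDATE t SET id = id + 1",
])
```

Returning the counts (rather than only logging them) also makes them available to the calling operator, e.g. for pushing to XCom.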