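Using Flask-SQLAlchemy, I would like a function that takes a pandas DataFrame in which each row represents a model instance and each column corresponds to an attribute of the model, and updates the corresponding model instances in the database. I have built this, but I am looking for a way to vectorize it. My main problem is that I cannot write a vectorized SQLAlchemy update query.
Is this possible? If so, how would I do it?
Here is what I have so far: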
from sqlalchemy import inspect

# `db` is assumed to be the application's Flask-SQLAlchemy instance,
# e.g. db = SQLAlchemy(app), imported from wherever the app defines it.


def update_model_instances_from_df(model, input_df, match_columns=('id',),
                                   update_columns=()):
    """
    Update the columns *update_columns* of instances of a given *model*,
    provided in an *input_df*, identified by the *match_columns*. If no
    *match_columns* are provided, it is assumed that an 'id' column exists in
    *input_df* which will be used to map the rows of the input_df to the model
    table. If no *update_columns* are provided, all columns present in the
    *input_df* which are not *match_columns* are updated.

    It is assumed that the combination of *match_columns* is a unique
    identifier for each model instance, i.e. that no two model instances
    exist which have the same value combination in the *match_columns*.
    Violating this rule might lead to unexpected behaviour.

    :param model: db.Model class.
    :param input_df: DataFrame, each row representing one model instance
        to update.
    :param match_columns: iterable of strings, column headers used to
        identify the model instances in the database. If not provided,
        ('id',) is assumed.
    :param update_columns: iterable of strings, headers of the columns to
        update on the identified model instances. If not provided, all
        columns of the input_df which are also columns of the model (and are
        not match_columns) will be used.
    :return: None
    """
    match_columns = set(match_columns)
    update_columns = set(update_columns)
    df_columns = set(input_df.columns)
    model_columns = set(inspect(model).columns.keys())
    intersect_columns = df_columns & model_columns

    assert match_columns != set()
    assert match_columns <= intersect_columns
    assert update_columns <= intersect_columns

    # If no update_columns are specified, update all columns which are not
    # match_columns
    if update_columns == set():
        update_columns = intersect_columns - match_columns

    # Would be nice to have a vectorized way of doing this ...
    for index, row in input_df.iterrows():
        # ... Especially, it is annoying to assemble the query in every row
        # again. Would be better to define the query in an abstract way and
        # just apply the concrete values per row
        query = db.session.query(model)
        for key in match_columns:
            query = query.filter(
                getattr(model, key) == row[key]
            )
        update_values = {}
        for key in update_columns:
            update_values[key] = row[key]
        query.update(update_values)

    # Commit once, after all rows have been processed
    db.session.commit()
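
To make the "define the query in an abstract way and just apply the concrete values per row" idea concrete, here is a minimal sketch of one possible direction: build a single Core `update()` statement with `bindparam()` placeholders and execute it once with a list of parameter dictionaries, so the statement is compiled once and the driver's executemany path is used. This is not vectorized in the NumPy sense; it only batches the rows. The `Item` model, the `bulk_update_from_df` and `demo` functions, the `b_` parameter prefix and the in-memory SQLite engine are all hypothetical and only there to make the sketch runnable. It assumes SQLAlchemy 1.4 or newer, and that model attribute names equal the table column names.

```python
import pandas as pd
from sqlalchemy import (Column, Integer, String, bindparam, create_engine,
                        select, update)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):
    """Hypothetical model standing in for a db.Model subclass."""
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    qty = Column(Integer)


def _to_python(value):
    # NumPy scalars expose .item(); plain Python values are passed through.
    return value.item() if hasattr(value, "item") else value


def bulk_update_from_df(session, model, input_df, match_columns=("id",),
                        update_columns=()):
    """Hypothetical batched variant of the row-by-row update loop."""
    table = model.__table__
    model_columns = set(table.c.keys())
    match_columns = list(match_columns)
    update_columns = list(update_columns) or [
        c for c in input_df.columns
        if c in model_columns and c not in match_columns
    ]
    assert match_columns and set(match_columns) <= model_columns
    assert update_columns and set(update_columns) <= model_columns

    # Build the statement once. Parameter names are prefixed because
    # SQLAlchemy reserves bare column names for its own SET-clause binds.
    stmt = update(table)
    for key in match_columns:
        stmt = stmt.where(table.c[key] == bindparam("b_" + key))
    stmt = stmt.values({key: bindparam("b_" + key) for key in update_columns})

    # One parameter dictionary per DataFrame row.
    records = input_df[match_columns + update_columns].to_dict(orient="records")
    params = [
        {"b_" + key: _to_python(value) for key, value in record.items()}
        for record in records
    ]
    if not params:
        return
    # Passing a list of dictionaries triggers an executemany call.
    session.connection().execute(stmt, params)
    session.commit()


def demo():
    """Create three rows, update two of them from a DataFrame, read back."""
    engine = create_engine("sqlite://")  # in-memory database, demo only
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add_all([
            Item(id=1, name="a", qty=10),
            Item(id=2, name="b", qty=20),
            Item(id=3, name="c", qty=30),
        ])
        session.commit()
        input_df = pd.DataFrame(
            {"id": [1, 2], "name": ["x", "y"], "qty": [11, 21]}
        )
        bulk_update_from_df(session, Item, input_df)
        result = session.execute(
            select(Item.id, Item.name, Item.qty).order_by(Item.id)
        )
        return [tuple(row) for row in result]


if __name__ == "__main__":
    print(demo())
```

With Flask-SQLAlchemy, the call would presumably look like `bulk_update_from_df(db.session, MyModel, input_df)`, where `MyModel` is a placeholder for the real model class. Converting values with `_to_python` is a precaution, since some database drivers do not accept NumPy scalar types as bound parameters.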