Vectorized update of flask-SQLAlchemy models

Time: 2018-03-20 11:21:25

Tags: python pandas flask sqlalchemy flask-sqlalchemy

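Using flask-sqlalchemy, I would like to have a function that takes a pandas DataFrame, in which each row represents a model instance and each column corresponds to an attribute of the model, and updates the corresponding model instances in the database. I have already built this, but I am looking for a way to vectorize it. My main problem is that I have not been able to write a vectorized SQLAlchemy update query.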

Is this possible? If so, how can I do it?

Here is what I have so far:

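from sqlalchemy import inspect

# `db` is assumed to be the application's flask_sqlalchemy.SQLAlchemy instance.


def update_model_instances_from_df(model, input_df, match_columns=('id',),
                                   update_columns=()):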
    """
    Update the columns *update_columns* of instances of a given *model*,
    provided in an *input_df*, identified by the *match_columns*. If no
    *match_columns* are provided, it is assumed that an 'id' column exists in
    *input_df* which will be used to map the rows of the input_df to the model
    table. If no *update_columns* are provided, all columns present in the
    *input_df* which are not *match_columns* are updated.

    It is assumed that the combination of *match_columns* uniquely identifies
    each model instance, i.e. that no two model instances exist which have the
    same value combination in the *match_columns*. Violating this rule might
    lead to unexpected behaviour.

    :param model: db.Model class.
    :param input_df: DataFrame, each row is representing one model instance
    to update.
    :param match_columns: list of Strings, list of column headers to identify
    the model instances in the database. If not provided, ['id'] is assumed.
    :param update_columns: list of Strings, list of headers of columns to
    update the identified model instances on. If not provided, all columns of
    the input_df which are also columns of the model, will be used.

    :return: None
    """

    match_columns = set(match_columns)
    update_columns = set(update_columns)
    df_columns = set(input_df.columns)
    model_columns = set(inspect(model).columns.keys())
    intersect_columns = df_columns & model_columns
    assert match_columns != set()
    assert match_columns <= intersect_columns
    assert update_columns <= intersect_columns

    # If no update_columns are specified, update all columns which are not
    # match_columns
    if update_columns == set():
        update_columns = intersect_columns - match_columns

    # Would be nice to have a vectorized way of doing this ...
    for index, row in input_df.iterrows():
        # ... Especially, it is annoying to assemble the query in every row
        # again. Would be better to define the query in an abstract way and
        # just apply the concrete values per row
        query = db.session.query(model)
        for key in match_columns:
            query = query.filter(
                getattr(model, key) == row[key]
            )

        update_values = {}
        for key in update_columns:
            update_values[key] = row[key]

        query.update(update_values)
    db.session.commit()
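
One direction that might address the "define the query once, apply the values per row" wish in the comments above is SQLAlchemy Core's `bindparam()` combined with passing a list of parameter dictionaries to `execute()`, which lets the database driver run the statement for all rows in one executemany call. The following is only a minimal sketch under assumptions: the function name `bulk_update_from_df`, the `item` table, and the `b_` parameter prefix are hypothetical, and it uses an in-memory SQLite database and a plain `Table` instead of a Flask-SQLAlchemy model so that it runs on its own.

```python
import pandas as pd
from sqlalchemy import (Column, Integer, MetaData, String, Table, bindparam,
                        create_engine)


def bulk_update_from_df(connection, table, input_df, match_columns,
                        update_columns):
    """Update rows of *table* from *input_df* with a single execute() call."""
    # Bind parameter names get a "b_" prefix (hypothetical choice) because
    # SQLAlchemy reserves the plain column names for the SET clause.
    stmt = table.update()
    for key in match_columns:
        stmt = stmt.where(table.c[key] == bindparam("b_" + key))
    stmt = stmt.values({key: bindparam("b_" + key) for key in update_columns})

    columns = list(match_columns) + list(update_columns)
    # astype(object) turns numpy scalars into plain Python values, which
    # database drivers such as sqlite3 can bind directly.
    records = input_df[columns].astype(object).to_dict("records")
    params = [
        {"b_" + key: value for key, value in record.items()}
        for record in records
    ]
    if params:
        # A list of dicts makes SQLAlchemy use the driver's executemany.
        connection.execute(stmt, params)


# Demo on an in-memory SQLite database with a hypothetical "item" table.
engine = create_engine("sqlite://")
metadata = MetaData()
item = Table(
    "item", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
    Column("price", Integer),
)
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(item.insert(), [
        {"id": 1, "name": "a", "price": 10},
        {"id": 2, "name": "b", "price": 20},
        {"id": 3, "name": "c", "price": 30},
    ])
    df = pd.DataFrame({"id": [1, 3], "price": [11, 33]})
    bulk_update_from_df(conn, item, df, ["id"], ["price"])
    result = [tuple(row) for row in
              conn.execute(item.select().order_by(item.c.id))]
```

With a Flask-SQLAlchemy model, it should be possible to pass `model.__table__` as the table and `db.session.connection()` as the connection, followed by `db.session.commit()`. The statement is still executed once per row by the database, but it is compiled only once and the Python-level loop over `iterrows()` disappears.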

0 answers:

No answers yet